Reference:
- The Australian National University CECS
Timer Overhead vs Timer Resolution
Differences:
- Timer overhead is the length of time it takes to call the timer function. The total time to run your program will be increased by this value multiplied by the number of times you call the timer.
- Timer resolution is the period of time below which the timer will sometimes report a value of zero. It represents the smallest period that can accurately be measured by the timer.
Assess a timer by calling it twice in succession and taking the difference between the two readings.
For timers that operate like clocks (real time or CPU time), the differences represent the change in the value of the clock between each call. The overhead is the average of the consecutively reported differences. The lowest measured values will be integer multiples of the resolution. If the overhead is less (or finer) than the resolution, then the measured overhead may be zero.
The above used gettimeofday()
The smallest non-zero difference is 1, hence the resolution of the timer is 1 us.
There are several zero values, hence the overhead of the timer is < 1 us.
The above used MPI_Wtime()
MPI_Wtick() gives a resolution of 1e-09 s, which appears to agree with the hand calculation from MPI_Wtime().
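The assessment procedure described in this section can be sketched in C as follows. This is a minimal illustration, not the original lab code: the names now_us, timer_stats, analyze_diffs and sample_timer are hypothetical, and the timer is assumed to be gettimeofday() read in microseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: current gettimeofday() reading in microseconds. */
long long now_us(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000000LL + (long long)tv.tv_usec;
}

/* Summary of a set of consecutive timer differences (hypothetical type). */
typedef struct {
    double overhead;      /* mean of all differences */
    long long resolution; /* smallest non-zero difference, 0 if none seen */
    int zeros;            /* number of differences equal to zero */
} timer_stats;

/* Estimate overhead and resolution from n recorded differences. */
timer_stats analyze_diffs(const long long *diffs, int n) {
    assert(diffs != NULL && n > 0);
    timer_stats s = {0.0, 0, 0};
    long long total = 0;
    for (int i = 0; i < n; i++) {
        total += diffs[i];
        if (diffs[i] == 0) {
            s.zeros++;
        } else if (diffs[i] > 0 &&
                   (s.resolution == 0 || diffs[i] < s.resolution)) {
            s.resolution = diffs[i];
        }
    }
    s.overhead = (double)total / n;
    return s;
}

/* Call the timer twice in a row, n times, storing each difference. */
void sample_timer(long long *diffs, int n) {
    for (int i = 0; i < n; i++) {
        long long t0 = now_us();
        long long t1 = now_us();
        diffs[i] = t1 - t0;
    }
}
```

A caller would fill an array with sample_timer and pass it to analyze_diffs. For the made-up differences {0, 1, 0, 2, 1}, this reports an overhead of 0.8, a resolution of 1 and two zero readings.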
Blocking Behavior
The code fails because both processes try to send at the same time. If the message is small we see non-blocking behaviour, since the message can be copied into an internal (local system) buffer; the send can then return and execution moves on to the next line.
For larger sizes it fails because the send changes from non-blocking to blocking once the message becomes too large to buffer. Each process stalls in its send, waiting for the destination to post the matching receive(), while the destination is stalled in its own send waiting for a receive() as well, so neither makes progress.
Startup Time
Once the message size reaches a certain point, L3 cache misses on the data may cause the bandwidth to decrease. The bandwidth is computed as bw = 2.0 * (length in words) * sizeof(int) / time / 1e9, in GB/s.
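The bandwidth formula can be written as a small C function. This is a sketch only: the name pingpong_bandwidth_gbs and its parameters are hypothetical, and the factor 2.0 is assumed to account for the message travelling in both directions during one ping-pong.

```c
#include <assert.h>
#include <stddef.h>

/* Bandwidth in GB/s for a ping-pong of `words` ints that took `time_s`
 * seconds in total (hypothetical helper, mirrors the formula in the text). */
double pingpong_bandwidth_gbs(size_t words, double time_s) {
    assert(time_s > 0.0);
    return 2.0 * (double)words * (double)sizeof(int) / time_s / 1e9;
}
```

With illustrative inputs (not measurements) of 1,000,000 words in 0.008 s, and assuming a 4-byte int, this gives 2.0 * 1e6 * 4 / 0.008 / 1e9 = 1.0 GB/s.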
From the above results:
Latency = 4.42e-07 s / 2 = 0.22 us, i.e. half the average round-trip time of a ping-pong of empty messages.
Peak bandwidth ~= 1 GB/s
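The latency figure above is the measured round-trip time halved and converted to microseconds. A minimal sketch of that arithmetic, with a hypothetical function name:

```c
#include <assert.h>

/* One-way latency in microseconds from an average ping-pong round-trip
 * time in seconds (hypothetical helper). */
double one_way_latency_us(double round_trip_s) {
    assert(round_trip_s >= 0.0);
    return round_trip_s / 2.0 * 1e6;
}
```

Calling one_way_latency_us(4.42e-07) gives about 0.221, which matches the 0.22 us quoted above.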