7.14 Timing

The compiler never synchronizes processors explicitly, except when certain intrinsics are invoked (see Section 7.10). As a result, calls to timing routines that are not in the list of synchronized intrinsics (see Section 7.10) may not give the results you expect. Consider the following (illegal) code:

REAL elapsed_time

CALL start_timer()

<statements being timed>

elapsed_time = stop_timer()
PRINT *, elapsed_time

In this code fragment, start_timer and stop_timer are fictitious names for user-written or operating-system-supplied routines other than the synchronized timing intrinsics listed in Section 7.10.

The variable elapsed_time is not explicitly distributed, so it is replicated on all processors. Replication is the default distribution in parallel Digital Fortran 90 programs. The stop_timer routine, which returns its result in elapsed_time, is called on all processors. Because elapsed_time is replicated, the PRINT statement prints the value held on peer 0. Since the code is unsynchronized, peer 0 may reach the stop_timer call either before or after the other processors have finished executing the code being timed, so the value it prints does not reflect the true elapsed time. This program is also not a legal HPF program: the stop_timer routine could return different values on different processors, and it is illegal to modify a replicated value differently on different processors.

There are two problems here. First, the values stored in elapsed_time can differ from processor to processor. Second, peer 0 may reach the timer calls at a different time than the other processors.

To solve the first problem, make the timer routines EXTRINSIC(HPF_SERIAL) routines, which execute only on peer 0. To solve the second problem, force a synchronization before each call to a timer routine. For example, synchronizing before the call to stop_timer delays that call until all the processors have finished executing the code being timed. The HPF library routine HPF_SYNCH is used for this purpose.

The following code fragment returns the desired results:

REAL elapsed_time
INTERFACE
  EXTRINSIC(HPF_SERIAL) SUBROUTINE start_timer()
  END SUBROUTINE
  REAL EXTRINSIC(HPF_SERIAL) FUNCTION stop_timer()
  END FUNCTION
END INTERFACE

CALL hpf_sync()
CALL start_timer()

 <statements being timed>

CALL hpf_sync()
elapsed_time=stop_timer()
PRINT *, elapsed_time
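
Because start_timer and stop_timer are fictitious, it may help to see what a user-written pair could look like. The module below is a minimal sketch, not a Digital-supplied implementation. It uses the standard Fortran 90 SYSTEM_CLOCK intrinsic; the module name timer_sketch and the saved variable start_count are hypothetical. The EXTRINSIC(HPF_SERIAL) prefix is left off so that the module compiles as plain Fortran 90; in an HPF program the routines would carry that prefix, as in the interface block above. The sketch makes no assumption about whether SYSTEM_CLOCK is one of the synchronized intrinsics listed in Section 7.10.

```fortran
! Hypothetical timer routines built on the standard SYSTEM_CLOCK intrinsic.
! The names timer_sketch and start_count are illustrative assumptions.
MODULE timer_sketch
  IMPLICIT NONE
  PRIVATE
  PUBLIC :: start_timer, stop_timer
  INTEGER, SAVE :: start_count = 0   ! clock reading taken by start_timer
CONTAINS

  ! Record the current clock count.
  SUBROUTINE start_timer()
    CALL SYSTEM_CLOCK(COUNT=start_count)
  END SUBROUTINE start_timer

  ! Return the seconds elapsed since the last call to start_timer,
  ! or -1.0 if the processor provides no clock.
  REAL FUNCTION stop_timer()
    INTEGER :: now, rate, cmax, ticks
    CALL SYSTEM_CLOCK(COUNT=now, COUNT_RATE=rate, COUNT_MAX=cmax)
    IF (rate > 0) THEN
      ticks = now - start_count
      ! Allow for a single wraparound of the clock counter.
      IF (ticks < 0) ticks = (ticks + cmax) + 1
      stop_timer = REAL(ticks) / REAL(rate)
    ELSE
      stop_timer = -1.0
    END IF
  END FUNCTION stop_timer

END MODULE timer_sketch
```

A program that uses this module calls the routines exactly as in the fragment above. The start reading is kept in a saved module variable so that start_timer needs no arguments, which matches the calling sequence used in this section.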