7.2 Converting Fortran 77 Programs to HPF

Take the following steps to port applications from Fortran 77 to Digital Fortran 90 with HPF for parallel execution:

  1. Change compilers to Digital Fortran 90.
    1. Recompile the code as is using the Digital Fortran 90 compiler in scalar mode (without the -wsf option). Digital Fortran 90 supports many, but not all, extensions to Fortran 77. Recompiling identifies the worst nonstandard constructs, if any, so you can remove them from the scalar code base.
    2. Test and validate that the scalar code produces the correct answers.
    3. Recompile using the -wsf option (but without changing the source code), test, and validate for a PSE Cluster (of any size, including 1). Expect no performance improvement; this step simply confirms that the scalar and parallel versions produce the same results.
  2. Find the "hot spots."

    Identify the routines that use the most time. Do this by profiling the scalar code (compiled without -wsf) using the -p or -pg profiling option.

  3. Fortran 90-ize the hot spots.
    1. Convert global COMMON data used by these routines into MODULE data. This should be straightforward if the data consists of "global objects", but hard if the data is "storage" that is heavily equivalenced. This involves changes in non-hotspot routines that use the same data. A first step is simply to place the COMMON statements into MODULEs and replace INCLUDEs by USEs.
    2. Eliminate the use of Fortran 77 sequence association, linear memory assumptions, pointers that are addresses (such as Cray pointers), array element order assumptions (column-wise storage), and so on.
    3. Change calling sequences to pass array arguments by assumed shape. This involves changes on both the caller and callee sides. The routines must have explicit interfaces, which are most easily provided by putting the routines in a module. Actual arguments that look like array elements but really stand for array sections must be rewritten as explicit array sections. Calling sequences can also be cleaned up, for example by no longer passing array sizes and instead using inquiry intrinsic functions (such as LBOUND and UBOUND) in the callee.
    4. Replace intensive computation, such as nested DO loops, with Fortran 90 array assignments or HPF FORALL constructs.
    5. Recompile, test, and validate that the scalar code produces the correct answers.
    6. Recompile using the -wsf and -show hpf options, test, and validate for a PSE Cluster (of any size, including 1). Expect no performance improvement; this step simply confirms that the scalar and parallel versions produce the same results.
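
    As a minimal sketch of substeps 1, 3, and 4, the following hypothetical program keeps formerly COMMON data in a module, passes arrays by assumed shape, and replaces DO loop nests with an array assignment and a FORALL construct. The names grid_data, smooth, u, v, and n, and the file name grid.h, are illustrative assumptions and are not taken from any particular application.

    ```fortran
    ! Hypothetical example; all names are illustrative.
    module grid_data
      implicit none
      ! Formerly: COMMON /GRID/ U(N,N), declared in an INCLUDE file.
      integer, parameter :: n = 8
      real :: u(n, n)
    contains
      ! Module procedures get an explicit interface automatically,
      ! which assumed-shape dummy arguments require.
      subroutine smooth(a, b)
        real, intent(in)  :: a(:, :)   ! assumed shape: no sizes passed
        real, intent(out) :: b(:, :)
        integer :: i, j, lo1, hi1, lo2, hi2

        ! Inquiry intrinsics replace the dimensions that the
        ! Fortran 77 version would have received as arguments.
        lo1 = lbound(a, 1); hi1 = ubound(a, 1)
        lo2 = lbound(a, 2); hi2 = ubound(a, 2)

        ! Array assignment replaces a DO loop nest that copies a to b.
        b = a

        ! FORALL replaces the DO loop nest over the interior points.
        forall (i = lo1+1:hi1-1, j = lo2+1:hi2-1)
          b(i, j) = 0.25 * (a(i-1, j) + a(i+1, j) + a(i, j-1) + a(i, j+1))
        end forall
      end subroutine smooth
    end module grid_data

    program demo
      use grid_data          ! Formerly: INCLUDE 'grid.h'
      implicit none
      real :: v(n, n)

      u = 1.0
      call smooth(u, v)      ! whole arrays passed, no size arguments
      print *, 'max difference from 1.0: ', maxval(abs(v - 1.0))
    end program demo
    ```

    Because the sketch contains no HPF directives, it can be compiled and validated in scalar mode first and then recompiled with -wsf, as substeps 5 and 6 describe.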
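  4. Decompose the data.
    1. Analyze how the hot spots use their data to decide which arrays should be aligned with one another (ALIGN) for locality and how they should be distributed (DISTRIBUTE) across multiple processors. Annotate the code with the corresponding HPF directives.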
    2. Recompile, test, and validate that the scalar code produces the correct answers.
    3. Recompile using the -wsf and -show hpf options, test, and validate for a PSE Cluster. Pay particular attention to the messages produced by -show hpf: replace any constructs that the compiler reports as serialized with parallelizable constructs. Also check the data motion introduced by the compiler (identified by -show hpf) and make sure it matches the motion you expected from the HPF directives you wrote.
    4. If performance does not improve as expected, analyze the program using the PSE profiler, modify the code, and return to substep 2 of this step.
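
    The annotation described in substep 1 might look like the following minimal sketch, which assumes free source form and a hypothetical nearest-neighbor kernel. The array names, the size n, and the choice of a (BLOCK, BLOCK) distribution are illustrative assumptions; the right mapping depends on how your own hot spots access their data.

    ```fortran
    ! Hypothetical example; names, sizes, and mapping are illustrative.
    program decomp
      implicit none
      integer, parameter :: n = 64
      real :: a(n, n), b(n, n)
      integer :: i, j

      ! Distribute a in blocks along both dimensions, and align b with a
      ! so that corresponding elements reside on the same processor.
      !HPF$ DISTRIBUTE a(BLOCK, BLOCK)
      !HPF$ ALIGN b(:, :) WITH a(:, :)

      a = 1.0
      b = a

      ! Nearest-neighbor update: with a BLOCK distribution, only the
      ! elements on block boundaries need data from another processor.
      forall (i = 2:n-1, j = 2:n-1)
        b(i, j) = 0.25 * (a(i-1, j) + a(i+1, j) + a(i, j-1) + a(i, j+1))
      end forall

      print *, 'sum of b = ', sum(b)
    end program decomp
    ```

    HPF directives are structured comments, so the same source can still be compiled without -wsf for the scalar validation in substep 2. When it is compiled with -wsf and -show hpf, the reported data motion for this sketch should be limited to the neighbor references in the FORALL; anything more suggests that the mapping is not the one you intended.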
  5. Convert the rest of the code.

    Repeat steps 3 and 4 for more of the code. Ideally, the structure of the whole program should be rethought with Fortran 90 modules in mind. Use of Fortran 90 constructs allows for a significant improvement in the readability and maintainability of the code.