Take the following steps to port applications from Fortran 77 to
Digital Fortran 90 with HPF for parallel
execution:
- Change compilers to Digital Fortran 90.
  - Recompile the code as is using the Digital Fortran 90 compiler in scalar mode
    (without the -wsf option). Digital Fortran 90 supports a substantial
    number of extensions to Fortran 77, but not all of them. This
    recompilation identifies the worst nonstandard offenders, if any, so
    you can remove them from the scalar code base.
  - Test and validate that the scalar code produces the
    correct answers.
  - Recompile using the -wsf option (but without
    changing the source code), test, and validate for a Farm (of
    any size). You should expect no performance improvement; this
    step simply validates that there are no anomalies between the scalar
    and parallel versions.
- Find the "hot spots".
  Identify the routines that use the most time. Do this by
  profiling the scalar code compiled without -wsf;
  use the -p or -pg profiling options.
- Fortran 90-ize the hot spots.
  - Convert global COMMON data used by these routines
    into MODULE data. This should be straightforward if the
    data consists of "global objects", but hard if the data is
    "storage" that is heavily equivalenced. The conversion also requires
    changes in non-hotspot routines that use the same data. A first step
    is simply to place the COMMON statements into MODULEs and
    replace INCLUDEs with USEs.
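The first step described above might look like the following minimal sketch. The COMMON block /grid/, the module name grid_data, and the routine fill_grid are hypothetical names chosen for illustration; they do not come from any particular application.

```fortran
! Before (Fortran 77 style), each routine pulled this in through INCLUDE:
!       INTEGER NX, NY
!       PARAMETER (NX = 4, NY = 3)
!       REAL U(NX, NY)
!       COMMON /GRID/ U

! First step: the same COMMON statement, moved into a module.
module grid_data
  implicit none
  integer, parameter :: nx = 4, ny = 3
  real :: u(nx, ny)
  common /grid/ u          ! kept so unconverted routines still see the data
end module grid_data

subroutine fill_grid(val)
  use grid_data            ! replaces the INCLUDE line
  implicit none
  real, intent(in) :: val
  u = val
end subroutine fill_grid

program demo
  use grid_data
  implicit none
  call fill_grid(2.0)
  if (any(u /= 2.0)) stop 1   ! simple self-check
  print *, 'sum(u) =', sum(u)
end program demo
```

Once every routine that touches the data uses the module, the COMMON statement itself can be dropped, leaving ordinary module variables.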
  - Eliminate the use of Fortran 77 sequence association,
    linear memory assumptions, pointers that are addresses (such
    as Cray pointers), array element order assumptions (column-wise
    storage), and so on.
  - Change calling sequences to pass array arguments by
    assumed shape. This involves changes on both the caller and callee
    sides. The routines must have explicit interfaces, most easily
    provided by putting the routines in a module. Actual arguments
    that are written as array elements but really stand for array
    sections must be rewritten as explicit array sections. Calling
    sequences can be cleaned up, for example, by no longer passing
    array sizes and using inquiry intrinsic functions (such as
    LBOUND and UBOUND) in the callee instead.
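As an illustration of the assumed-shape conversion, here is a small sketch. The module smooth_mod and the routine scale_rows are hypothetical; the Fortran 77 call shown in the comment is only an assumed example of the older style.

```fortran
module smooth_mod
  implicit none
contains
  ! Assumed-shape dummy argument: no array sizes are passed.
  ! Being in a module gives the routine an explicit interface.
  subroutine scale_rows(a, factor)
    real, intent(inout) :: a(:, :)
    real, intent(in)    :: factor
    integer :: i
    ! Bounds come from inquiry intrinsics instead of extra arguments.
    do i = lbound(a, 1), ubound(a, 1)
      a(i, :) = factor * a(i, :)
    end do
  end subroutine scale_rows
end module smooth_mod

program demo
  use smooth_mod
  implicit none
  real :: x(4, 6)
  x = 1.0
  ! Fortran 77 style would pass an element plus sizes and rely on
  ! sequence association, for example:
  !       CALL SCALE(X(1,3), 4, 2)
  ! With assumed shape, the caller passes the section itself.
  call scale_rows(x(:, 3:4), 10.0)
  if (any(x(:, 3:4) /= 10.0)) stop 1
  if (any(x(:, 1:2) /= 1.0) .or. any(x(:, 5:6) /= 1.0)) stop 2
  print *, 'columns 3-4 scaled, others untouched'
end program demo
```

Inside the callee the lower bounds of an assumed-shape dummy default to 1, so the loop does not depend on how the caller indexed the section.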
  - Replace intensive computation, such as nested DO loops,
    with Fortran 90 array assignments or HPF FORALL constructs.
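The following sketch shows one nested DO loop rewritten both ways. The four-point averaging stencil and the array names are assumptions made for the example; the DO loop result is kept as a reference so the program checks that all three forms agree.

```fortran
program demo
  implicit none
  integer, parameter :: n = 8
  real :: a(n, n), c(n, n), c_ref(n, n)
  integer :: i, j

  ! Fill the input with simple, exactly representable values.
  do j = 1, n
    do i = 1, n
      a(i, j) = real(i + j)
    end do
  end do

  ! Original nested DO loops (kept here as the reference result).
  c_ref = 0.0
  do j = 2, n - 1
    do i = 2, n - 1
      c_ref(i, j) = 0.25 * (a(i-1, j) + a(i+1, j) + a(i, j-1) + a(i, j+1))
    end do
  end do

  ! Same computation as a Fortran 90 array assignment on sections.
  c = 0.0
  c(2:n-1, 2:n-1) = 0.25 * (a(1:n-2, 2:n-1) + a(3:n, 2:n-1) &
                          + a(2:n-1, 1:n-2) + a(2:n-1, 3:n))
  if (any(abs(c - c_ref) > 1.0e-6)) stop 1

  ! Same computation as a FORALL construct.
  c = 0.0
  forall (i = 2:n-1, j = 2:n-1)
    c(i, j) = 0.25 * (a(i-1, j) + a(i+1, j) + a(i, j-1) + a(i, j+1))
  end forall
  if (any(abs(c - c_ref) > 1.0e-6)) stop 2

  print *, 'all three forms agree'
end program demo
```

The rewrite is safe here because each element of c depends only on a, so the assignments carry no ordering requirement.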
  - Recompile, test, and validate that the scalar code
    produces the correct answers.
  - Recompile using the -wsf and -show hpf
    options, test, and validate for a PSE
    Cluster (of any size, including 1). You should expect no
    performance improvement; this step simply validates that there are
    no anomalies between the scalar and parallel versions.
- Data decomposition
  - Analyze how the hot spots use their data to determine the desired
    alignment (ALIGN) for locality and the desired distribution
    (DISTRIBUTE) across multiple processors. Annotate the code with
    HPF directives.
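A minimal sketch of such annotations follows. The arrays a and b and the choice of a (*, BLOCK) distribution are assumptions for illustration only; the right mapping depends on how your hot spots actually access the data. The directives are written as !HPF$ comments, so a compiler that is not in HPF mode treats them as ordinary comments and the scalar build is unaffected.

```fortran
program demo
  implicit none
  integer, parameter :: n = 16
  real :: a(n, n), b(n, n)
  integer :: i, j
  ! Distribute the columns of a in blocks across the processors;
  ! each column stays whole on one processor.
!HPF$ DISTRIBUTE a(*, BLOCK)
  ! Place each element of b on the same processor as the matching
  ! element of a, so b = 2.0 * a needs no communication.
!HPF$ ALIGN b(:, :) WITH a(:, :)

  forall (i = 1:n, j = 1:n) a(i, j) = real(i + j)
  b = 2.0 * a

  if (any(b /= 2.0 * a)) stop 1   ! simple self-check
  print *, 'sum(b) =', sum(b)
end program demo
```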
  - Recompile, test, and validate that the scalar code
    produces the correct answers.
  - Recompile using the -wsf and -show hpf options, test, and
    validate for a PSE Cluster. Pay particular
    attention to the messages produced by -show hpf;
    replace any serialized constructs identified by the
    compiler with parallelizable constructs. Also check the
    data motion introduced by the compiler (identified by
    -show hpf); make sure it agrees with the motion
    you expect from the HPF directives you wrote.
  - If the performance did not improve as expected, analyze
    using the PSE profiler, modify the code, and return to step
    4b (recompile, test, and validate the scalar code).
- The rest of the code:
  Repeat steps 3 and 4 (Fortran 90-izing the hot spots and data
  decomposition) for more of the code. Ideally, the structure
  of the whole program should be rethought with Fortran 90 modules
  in mind. Using Fortran 90 constructs can significantly
  improve the readability and maintainability of the code.