8.1 Compiling HPF Programs

The Digital Fortran 90 compiler can be used to produce either standard applications that execute on a single processor (serial execution), or parallel applications that execute on multiple processors using PSE. Parallel applications are produced by using the Digital Fortran 90 compiler with the -wsf option.


Note
In order to achieve parallel execution, Fortran programs must be written with HPF (High Performance Fortran) directives and without reliance on sequence association. For information on HPF, see Chapters 6 and 7. The HPF Tutorial is contained in Chapters 2, 3, 4, and 5.

8.1.1 Compile-Time Options for High Performance Fortran Programs

This section describes the Digital Fortran 90 command-line options that are specifically relevant to parallel HPF programs.

8.1.1.1 -wsf [nn] Option - Compile for Parallel Execution

Specifying the -wsf option indicates that the program should be compiled to execute in parallel on multiple processors.

HPF directives in programs affect program execution only if the -wsf option is specified at compile time. If the -wsf option is omitted, HPF directives are checked for syntax, but otherwise ignored.
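As a minimal sketch of what this means in practice, the following free-form program carries two HPF directives. The program name, array names, and problem size are illustrative assumptions and do not come from this manual. Because each directive line begins with !HPF$, a compile without -wsf treats it as a comment (apart from the syntax check), while a compile with -wsf would be expected to distribute the arrays across processors.

```fortran
! Illustrative sketch only: names and sizes are hypothetical.
! Serial build (directives ignored):      f90 hpf_sketch.f90
! Parallel build (directives honored):    f90 -wsf hpf_sketch.f90
PROGRAM hpf_sketch
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 16        ! hypothetical problem size
  REAL    :: a(n), b(n)
  INTEGER :: i
!HPF$ DISTRIBUTE a(BLOCK)
!HPF$ ALIGN b(:) WITH a(:)

  ! Fill b with 1.0, 2.0, ..., n
  FORALL (i = 1:n) b(i) = REAL(i)

  ! Data-parallel array assignment; a and b are aligned, so each
  ! element of a is computed from the co-located element of b.
  a = 2.0 * b

  PRINT *, SUM(a)                     ! the sum is 272 for n = 16
END PROGRAM hpf_sketch
```

Either build should print the same sum; only the execution model differs.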

Specifying -wsf with a number as an argument optimizes the executable for that number of processors. For example, specifying -wsf 4 generates a program for 4 processors. Specifying -wsf without an argument produces a more general program that can run on any number of processors. Using a numerical argument results in better application performance.

For best performance, do not specify an argument to -wsf that is greater than the number of CPUs that will be available at run time. Relying upon the PSE -virtual run-time option to simulate a PSE cluster larger than the number of available processors usually causes degradation of application performance.

Any number of processors is allowed. However, performance may be degraded in some cases if the number of processors is not a power of two.

The -nearest_neighbor, -pprof, and -show hpf options can be used only when -wsf is specified.

When parallel programs are compiled and linked as separate steps (see the documentation of the -c option in the Fortran User Manual), the -wsf option must be used with the f90 command both at compile time and at link time. If -wsf is used with a numerical argument, the same argument must be used at compile time and at link time.

8.1.1.2 -assume nozsize Option - Omit Zero-Sized Array Checking

An array (or array section) is zero sized when the extent of any of its dimensions is zero or less. When the -wsf option is specified, the compiler must insert a series of checks to guard against irregularities (such as division by zero) that zero-sized data objects can cause in the generated code. Depending upon the particular application, these checks can cause noticeable (or even major) degradation of performance.

The -assume nozsize option causes the compiler to omit these checks for zero-sized arrays and array sections. This option is automatically selected when the -fast option is selected.

The -assume nozsize option must not be used when a program references any zero-sized arrays or array sections. An executable produced with the -assume nozsize option may fail or produce incorrect results when it references any zero-sized arrays or array sections.

You can insert a run-time check into your program to ensure that a given line is not executed if an array or array section referenced there is zero sized. This will allow you to specify -assume nozsize even when there is a possibility of a zero-sized array reference in that line.
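One way such a run-time check might look is sketched below. The program, the array a, and the bounds lo and hi are hypothetical names chosen for illustration. The guard compares the section bounds directly, so the guarded statement is skipped whenever the section a(lo:hi) would be zero sized.

```fortran
! Illustrative sketch only: names and values are hypothetical.
PROGRAM zsize_guard
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 8
  REAL    :: a(n)
  INTEGER :: lo, hi

  a  = 1.0
  lo = 5
  hi = 4          ! hi < lo, so a(lo:hi) would be a zero-sized section

  ! Run-time check: execute the statement only when the section
  ! contains at least one element.
  IF (hi >= lo) THEN
     a(lo:hi) = 2.0 * a(lo:hi)
  END IF

  PRINT *, SUM(a)
END PROGRAM zsize_guard
```

The test is written on the bounds (hi >= lo) rather than on SIZE(a(lo:hi)), because the latter would itself name the possibly zero-sized section inside the check.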

8.1.1.3 -fast Option - Set Options to Improve Run-Time Performance

The -fast option activates options that improve run-time performance. Among the options set by the -fast option is the -assume nozsize option (a full list of the options set by -fast can be found in the Fortran User Manual and in the Fortran 90(1) manpage). This means that the restrictions that apply to the -assume nozsize option also apply to the -fast option.

8.1.1.4 -nearest_neighbor [nn] and -nonearest_neighbor Options

The compiler's nearest-neighbor optimization is enabled by default. The -nearest_neighbor option is used to modify the limit on the extra storage allocated for nearest-neighbor optimization.

The -nonearest_neighbor option is used to disable nearest-neighbor optimization.

The compiler automatically determines the correct shadow-edge widths on an array-by-array, dimension-by-dimension basis. You can also set shadow-edge widths manually by using the SHADOW keyword inside the DISTRIBUTE directive. This is necessary to preserve the shadow edges when nearest-neighbor arrays are passed as arguments.

The optional nn field specifies the maximum allowable shadow-edge width, which limits how much extra storage the compiler may allocate for nearest-neighbor arrays. The nearest-neighbor optimization is not performed for array dimensions needing a shadow-edge width greater than nn.

When programs are compiled with the -wsf option, the default is -nearest_neighbor 10.

The -nonearest_neighbor option disables the nearest-neighbor optimization. It is equivalent to specifying -nearest_neighbor 0.
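To make the role of nn concrete, the sketch below shows two stencil-style assignments of the kind a nearest-neighbor optimization typically targets. The program and array names are hypothetical, and whether the compiler actually classifies these particular statements as nearest-neighbor computations is an assumption. The first statement reads elements one position away, so it would presumably need a shadow-edge width of 1; the second reads elements three positions away and would presumably need a width of 3.

```fortran
! Illustrative sketch only: names and sizes are hypothetical.
! With, for example, -wsf -nearest_neighbor 2, the width-3 statement
! would exceed the limit and would not receive the optimization.
PROGRAM stencil_sketch
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 32
  REAL    :: u(n), v(n)
  INTEGER :: i
!HPF$ DISTRIBUTE u(BLOCK)
!HPF$ ALIGN v(:) WITH u(:)

  FORALL (i = 1:n) u(i) = REAL(i)
  v = u

  ! Neighbors at offsets -1 and +1: assumed shadow-edge width of 1.
  v(2:n-1) = 0.5 * (u(1:n-2) + u(3:n))

  ! Neighbors at offsets -3 and +3: assumed shadow-edge width of 3.
  v(4:n-3) = 0.5 * (u(1:n-6) + u(7:n))

  PRINT *, v(2), v(n-1)
END PROGRAM stencil_sketch
```

Under the default limit of 10, both statements would fall within the allowed shadow-edge width.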

8.1.1.5 -nowsf_main Option - for Non-Parallel Main Programs

Use the -nowsf_main option to incorporate parallel routines into non-parallel programs.

When you incorporate parallel routines into non-parallel programs, some routines must be compiled with -nowsf_main, and some should be compiled without -nowsf_main. See Table 6-2.

8.1.1.6 -pprof Option - Preparing for Parallel Profiling

The -pprof option prepares a parallel program for subsequent profiling with the pprof profiler.

In order to use the -pprof option, certain requirements must be met:

For more information on the -pprof option and on the pprof profiler, see Chapter 10, Parallel Profiler.

8.1.1.7 -show hpf Option - Show Parallelization Information

The -show hpf option sends information related to parallelization to standard error and to the listing (if one is generated with -V). This option is valid only if the -wsf option is specified. You can use this information to help you tune your program for better performance.

This option has several forms:

It is usually best to try using -show hpf first. Use the others only when you need a more detailed listing.

-show can take only one argument. However, the -show flags can be combined by specifying -show multiple times. For example:

% f90 -wsf -show hpf_nearest -show hpf_comm -show hpf_punt foo.f90

8.1.2 Consistency of Number of Peers

When linking is done as a separate step from compiling, the Digital Fortran 90 compiler requires all objects to have been compiled with the same value for the optional [nn] argument of the -wsf option. If objects were compiled for inconsistent numbers of processors, the following error message is reported at link time:

Unresolved:
_hpf_compiled_for_nn_nodes_

If you do not know which object was compiled for the wrong number of processors, the incorrectly compiled object can be identified using the UNIX nm utility.