8.3 PSE Application Execution Model

The following sections describe the processing that occurs when a parallel HPF application runs under PSE. The following topics are discussed:

Parallel applications execute in Single Program, Multiple Data (SPMD) mode using the PSE environment and software. Each instance of a parallel program executes roughly the same program statement on a different data set at the same time. To execute an application, issue a command that specifies the name of the program to be run (for example, my_program) and any necessary PSE command-line options.

The syntax for the command line is as follows:

% my_program [-pse_option value ...]

Command-line options override selections set by PSE environment variables. In all cases, selections must not conflict with the PSE cluster definitions set by the PSE system administrator. For example, execution priority cannot be set higher than the value of PSE_PRIORITY_LIMIT.
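
The precedence rule above can be illustrated with a minimal sketch. It is not PSE code. The function name resolve_setting and the dictionary representation of the command line are hypothetical; the pairing of the -pref_comm option with the PSE_PREF_COMM environment variable is taken from the Peer Communications section of this chapter.

```python
def resolve_setting(cli_options, environ, option, env_var):
    """Return the command-line value if one was given, else the environment value.

    cli_options: dict of already-parsed options (hypothetical representation).
    environ:     dict of environment variables.
    """
    if option in cli_options:
        return cli_options[option]   # the command line wins
    return environ.get(env_var)      # otherwise fall back to the environment


environ = {"PSE_PREF_COMM": "fddi,ethernet"}
with_option = resolve_setting({"-pref_comm": "ethernet"}, environ,
                              "-pref_comm", "PSE_PREF_COMM")
without_option = resolve_setting({}, environ, "-pref_comm", "PSE_PREF_COMM")
print(with_option)
print(without_option)
```

The sketch does not model the administrator-defined limits. How PSE reports a selection that conflicts with the cluster definitions is not described here.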

Several other requirements must be met for an application to execute successfully:

The following sections describe the stages of parallel program startup, execution, and termination.

For More Information:

8.3.1 Application Startup and Peer Selection

This section describes parallel program startup.

8.3.1.1 Controlling Process

The execution of an application creates a single instance of the program on the local machine, whether that machine is a PSE cluster member or not. This program instance is known as the controlling process. It does not participate in the actual execution of the application other than to provide control and I/O services. The controlling process:

For More Information:

8.3.1.2 I/O Manager

Once communication and a working environment are established between the application controlling process and the io_manager, the io_managers start the required number of peer processes. Once peer processes are started, each io_manager works on their behalf to manage standard I/O. It is also responsible for propagating signals between the controlling process and the local peers. As application peers exit, the io_manager reports their exit status back to the controlling process for final processing.

The io_managers have an important role in job slot management. They are responsible for reporting application startups and exits to the local farmd daemon, which decrements or increments (respectively) the number of available job slots. An io_manager can also ask the farmd daemon for additional services, such as using the root privileges of farmd to raise (or lower) the application execution priority.

For More Information:

8.3.1.3 Environment Propagation

The successful execution of any application depends on its being run within a well-known context. This context is referred to as the user environment and consists primarily of a:

You generally set resource limits or environment variables at login time. These can be modified at any time after the initial login. To maintain this environment, the default PSE behavior is to propagate this context to each application peer at startup time. You can override this behavior with the -login command-line option, which lets the user environment be set by your shell initialization file (for example, .cshrc).


Note
X11 users need to pay attention to their default DISPLAY setup. The X11 DISPLAY environment variable is normally initialized to the display on the local host, ":0.0". This can cause application failures under PSE, since applications are generally run on remote hosts. These failures can be avoided by setting the DISPLAY variable so that it names the specific host on which you want to see output.
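
As an illustration of the note above, the following sketch turns a local-only DISPLAY value into one that names a host explicitly. It is an assumption-laden example, not part of PSE: the function name remote_safe_display is hypothetical, and the sketch only handles values that begin with a colon.

```python
import socket


def remote_safe_display(display, hostname=None):
    """Prefix a local-only DISPLAY value such as ":0.0" with a host name.

    hostname defaults to the name of the machine running this code, which is
    assumed to be the machine whose screen should show the output.
    """
    if display.startswith(":"):
        return (hostname or socket.gethostname()) + display
    return display  # already names a host; leave it unchanged


print(remote_safe_display(":0.0", "mydesk"))
```

The resulting value would then be assigned to DISPLAY (for example, with setenv in the C shell) before the application is started.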

8.3.1.4 Peer Ordering

By default, peer order is based on machine load averages. The default sequence runs from the least loaded machine (Peer 0) to machines with greater or equal load. The first peer placed is termed "Peer 0". All other peers are numbered sequentially from that point up to n-1, where n is the number of selected peers.

While each of the peers performs the parallel execution of the application, Peer 0 also has the following additional responsibilities:

User-specified peer ordering, using the -on command-line option, is an alternative to the default load-balanced peer selection. The -on command-line option assigns sequential peer numbers to the identified list of hosts. All identified hosts must be members of the specified partition.

PSE, by default, attempts to assign at most a single application peer to each CPU (Central Processing Unit) in the partition. Each CPU in a system is counted separately. This mode of assigning peer processes is referred to as physical mode peer selection.

When you specify virtual-mode peer selection (using the -virtual command-line option or the PSE_MACHINE environment variable), PSE assigns more than one peer to a CPU when necessary.

The actual number of peer processes per machine is balanced according to the number of processes requested and the number of available CPUs. Peers are assigned sequentially, as follows:

  1. The least loaded machine in the partition is assigned one peer per CPU.
  2. The next least loaded machine in the partition is assigned one peer per CPU.
  3. Step 2 is repeated until each CPU in the partition has been assigned one peer.
  4. [Virtual mode only] If any peers remain to be assigned after each CPU has one peer, then steps 1 through 3 are repeated until all peers have been assigned.

For example, an application running in virtual mode in a partition consisting of a two-processor SMP computer and a single-processor workstation may request 7 peers. If the SMP computer has a lighter load than the workstation, the SMP computer will be assigned peers 0, 1, 3, 4, and 6. The workstation will be assigned peers 2 and 5.
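
The assignment steps and the example above can be reproduced with a short sketch. This is an illustration of the described procedure, not PSE source code. The function name assign_peers, the tuple layout, and the load values are hypothetical. In physical mode the sketch simply stops after one pass over the CPUs; what PSE does when more peers are requested than CPUs exist in physical mode is not stated here.

```python
def assign_peers(machines, n_peers, virtual=False):
    """Assign peer numbers to machines, least loaded machine first.

    machines: list of (name, load_average, cpu_count) tuples.
    Returns a dict mapping machine name to its list of peer numbers.
    """
    if sum(cpus for _, _, cpus in machines) == 0:
        raise ValueError("partition has no CPUs")
    ordered = sorted(machines, key=lambda m: m[1])  # least loaded first
    assignment = {name: [] for name, _, _ in ordered}
    peer = 0
    while peer < n_peers:
        # Steps 1-3: one peer per CPU, machine by machine.
        for name, _, cpus in ordered:
            for _ in range(cpus):
                if peer == n_peers:
                    return assignment
                assignment[name].append(peer)
                peer += 1
        if not virtual:
            break  # physical mode: at most one peer per CPU (sketch assumption)
        # Step 4 (virtual mode only): repeat until all peers are placed.
    return assignment


partition = [("smp", 0.2, 2), ("workstation", 0.9, 1)]
print(assign_peers(partition, 7, virtual=True))
```

With the hypothetical loads shown, the SMP machine receives peers 0, 1, 3, 4, and 6 and the workstation receives peers 2 and 5, matching the example in the text.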

For More Information:

8.3.1.5 Terminal I/O

The controlling process is connected to each peer by an io_manager. All terminal handling operations are assigned to Peer 0. Under normal conditions, PSE propagates terminal input from the controlling process only to HPF Peer 0. When running a -debug session, however, input (usually debugger commands) is broadcast to all peers for processing.

Standard I/O handling under PSE differs in two areas from the UNIX standard:

For More Information:

8.3.1.6 Signal Propagation

Under PSE, signals received by the controlling process are propagated to all application peers for processing. The following set of signals is "caught" and propagated:

Signals not identified above are handled "locally". They generally cause a core dump and an application exit.

For More Information:

8.3.1.7 Core Files

The generation of core files occurs when an application encounters situations from which PSE is unable to recover. These core files can be large, if not limited by the corefilesize resource limit. The resulting data can be useful when debugging the application.

Because PSE generally runs multiple instances of an application at the same time from the same directory, core files from different peers could overwrite one another. To prevent this, core files are generated in a set of subdirectories. One subdirectory is generated for each peer, and the directories are named core.N (for example, core.0, core.1, ...). By default, the subdirectories are created in the current working directory. You may specify alternate file paths to use as a core file root by setting the PSE_CORE_DIR environment variable. There is no command-line equivalent.
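
The naming scheme can be sketched as follows. This is only an illustration of where to look for core files, under the assumption that PSE_CORE_DIR holds a single root directory; the function name core_dir_for_peer is hypothetical.

```python
import os
import posixpath


def core_dir_for_peer(peer, environ=None, cwd=None):
    """Return the directory expected to hold the core file of one peer.

    Uses PSE_CORE_DIR as the root when it is set (assumed to be a single
    directory), otherwise the current working directory.
    """
    environ = os.environ if environ is None else environ
    root = environ.get("PSE_CORE_DIR") or (cwd if cwd is not None else os.getcwd())
    return posixpath.join(root, "core.%d" % peer)


for peer in range(3):
    print(core_dir_for_peer(peer, environ={}, cwd="/usr/users/me/run"))
```

The directory /usr/users/me/run is a made-up example path.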

For More Information:

8.3.1.8 Peer Communications

Parallel HPF applications use message passing to coordinate and move required data between processes. The messaging fabric available to PSE users depends upon both the number and types of network adapters installed on PSE cluster machines and the PSE configuration.

The communications medium is chosen separately for each inter-peer route. For example, within a single execution, the messages between peer 1 and peer 2 may use a medium different from the medium used between peer 1 and peer 3.

By default, PSE chooses the highest-bandwidth medium available between any two peers. Although usually not needed, you can change the order of preference for choosing a communications medium in one of three ways:

The syntax of the PSE_PREF_COMM definition is as follows:

Environment variable (C shell):

% setenv PSE_PREF_COMM pref-comm-spec

Command-line option:

% my_program -pref_comm pref-comm-spec

pref-comm-spec is a list containing one or more communication-medium specifiers. The order in which the specifiers are listed determines PSE's preference in choosing a communications medium between any two peers. It is generally desirable to put higher-bandwidth media at the beginning of the list, and lower-bandwidth media at the end of the list. The specifiers currently supported are:

The default setting for PSE_PREF_COMM is:

PSE_PREF_COMM  shm,mc,atm,fddi,ethernet

Additional specifiers may be defined in the PSE cluster database. Users can view the available specifiers with lspart and psemon.
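
The per-route selection described in this section can be sketched as a first-match search through the preference list. This is an illustrative model, not PSE's implementation: the function name choose_medium and the sets of media available on each route are hypothetical, and the sketch assumes the comma-separated form shown in the default setting.

```python
DEFAULT_PREF_COMM = "shm,mc,atm,fddi,ethernet"


def choose_medium(available, pref_comm=DEFAULT_PREF_COMM):
    """Return the first medium in the preference list that a route supports.

    available: set of medium specifiers usable between two particular peers.
    Returns None if no listed medium is available on that route.
    """
    for spec in pref_comm.split(","):
        spec = spec.strip()
        if spec in available:
            return spec
    return None


# Each inter-peer route is resolved separately, so routes may differ.
routes = {
    (1, 2): {"shm", "ethernet"},   # hypothetical: peers on the same machine
    (1, 3): {"fddi", "ethernet"},  # hypothetical: peers on different machines
}
for route, media in sorted(routes.items()):
    print(route, choose_medium(media))
```

Reordering the preference list changes the outcome: with a pref-comm-spec of "ethernet,fddi", the route between peers 1 and 3 in this sketch would use ethernet instead of fddi.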

The communications medium and the protocol used over the network are controlled by two separate environment variables. The network medium is controlled by the PSE_PREF_COMM environment variable. The network protocol is controlled by the HPF_COMM environment variable and its associated command-line option, -c.

For More Information:

8.3.1.9 Program Termination

A program executes until it reaches a successful completion or until one or more of the peers exits with an error. The exit status of each peer process is reported to the application controlling process by its io_manager. If the application exit is caused by an error, the controlling process waits a predetermined amount of time to allow standard I/O to finish and then exits with an appropriate exit value. HPF applications implement a synchronized exit. If one or more peers are unable to report exit values within the allotted amount of time after the first exit, the application exits with an error.

8.3.1.10 Exit Value

The exit value of an application is zero unless errors are encountered. In the case of errors, the application exit value is defined to be the first nonzero value received by the controlling process.
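
The rule for the application exit value can be written as a small sketch. The function name application_exit_value is hypothetical, and the list stands in for the peer exit statuses in the order the controlling process receives them.

```python
def application_exit_value(statuses_in_arrival_order):
    """Return the first nonzero peer exit status received, or 0 if there is none."""
    for status in statuses_in_arrival_order:
        if status != 0:
            return status
    return 0


print(application_exit_value([0, 0, 3, 1]))
print(application_exit_value([0, 0, 0]))
```

The first call yields 3, because 3 is the first nonzero status to arrive. The second yields 0, because no peer reported an error.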

8.3.2 Executing an Application from a Remote Host

A remote host is defined as any host that is not a PSE cluster member. Successful application execution from a remote host depends upon:

In all other respects, execution of a user program is exactly as previously described.

For More Information:

8.3.3 Executing a Parallel Application without PSE

Programs compiled with the -wsf option without an argument, or with an argument of 1, may be executed as normal scalar programs without using PSE. To do so, run the program with the -local option, which forces the application to run with one peer process on the local machine.

The -local option does not work together with -debug. When both are needed, the best workaround is to run the program under the debugger like this:

% ladebug a.out
(ladebug) run -local

For More Information:

8.3.4 Debugging

PSE provides modified versions of the Ladebug and dbx debuggers. The default debugging environment provided is Ladebug in n windows: one X-window terminal session attached to a Ladebug session for each peer. dbx in n windows is also available.

Debugging sessions are invoked by running the PSE application with the -debug option. These tools let you view the activity of each peer while it is being debugged.

These modified versions of Ladebug and dbx let you view elements of a distributed HPF array using the hpfget command. The -debug option lets you:

You can select a different debugger by setting the DEBUGGER environment variable to the pathname of the desired debugging tool.

For More Information:

8.3.5 Profiling

The pprof profiler measures a parallel application's performance by capturing the amount of time the program spends in different code regions or execution activities. The information in pprof profiling reports can be used to locate hotspots and to identify aspects of program execution that might benefit from optimization. The profiler also provides system run-time performance statistics for processes running under PSE.

For More Information: