17.2 PSE Cluster Daemon

For a host to become a member of a PSE cluster, the host name must be registered in a PSE cluster partition in the PSE cluster database, and a PSE cluster daemon, farmd, must run on that host. A host can be a member of several PSE clusters. In that case, one instance of farmd runs on the host for each PSE cluster.

17.2.1 Adding a Machine to a Cluster

This section describes how to add a new host to a PSE cluster and configure it.

  1. Install the PSE kit on the new host if it has not been installed.
  2. If you are using a database, use psedbedit to add the new host name to the PSE cluster database, and then continue with step 3. If you are not using a database, the new machine automatically joins your existing basic PSE cluster (psefarm).
  3. Run the following command from the new host:
    # /usr/sbin/pseconfig add cluster-spec
    

    where cluster-spec is psefarm for a basic PSE cluster, the clustername for a DNS-based PSE cluster, or the PSE cluster database file name, including directory path, for a file-based PSE cluster.

pseconfig does the following:

When pseconfig completes, the inetd daemon starts the farmd daemon automatically as soon as a connection request arrives at the clustername IP port. To start the farmd daemon immediately, run the lspart -farm cluster-spec -jobslots command. This command tries to connect to the local farmd daemon, which causes inetd to start it.
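
The configure-then-start sequence described above can be wrapped in a small shell function. The following is a minimal sketch, not part of the PSE kit: the function name join_pse_cluster and the PSECONFIG and LSPART override variables are hypothetical, and the sketch assumes that lspart is on the search path.

```shell
# Hypothetical wrapper: configure this host for a cluster, then contact
# the local farmd so that inetd starts it right away.
# PSECONFIG and LSPART are illustrative overrides, not PSE settings.
join_pse_cluster() {
    cluster_spec=$1
    if [ -z "$cluster_spec" ]; then
        echo "usage: join_pse_cluster cluster-spec" >&2
        return 2
    fi
    # Step 3 of the procedure: register this host with the cluster.
    "${PSECONFIG:-/usr/sbin/pseconfig}" add "$cluster_spec" || return 1
    # Connecting to the cluster port causes inetd to start farmd.
    "${LSPART:-lspart}" -farm "$cluster_spec" -jobslots
}
```

For a basic PSE cluster, the function would be called as join_pse_cluster psefarm. It stops without running lspart if pseconfig reports a failure.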

17.2.2 Removing a Machine from a Cluster

You can temporarily deconfigure a host so that it no longer participates as a member of a PSE cluster by running the following command:

# /usr/sbin/pseconfig delete cluster-spec

where cluster-spec is psefarm for a basic PSE cluster, the clustername for a DNS-based PSE cluster, or the PSE cluster database file name, including directory path, for a file-based PSE cluster.

The pseconfig command performs the following:

To reconfigure the host to be a member of a PSE cluster, issue the following command:

# /usr/sbin/pseconfig add cluster-spec

Note that deconfiguring a host from a PSE cluster forces the host to stop accepting PSE cluster application execution requests. However, PSE cluster applications that are currently running on the host continue to run until they terminate.

Also note that manually killing the farmd daemon with the kill command terminates the currently running farmd daemon, but as soon as a new connection request arrives, the inetd daemon starts a new farmd daemon.

To permanently remove a machine from a PSE cluster, do the following:

17.2.3 Maintaining the farmd Daemon

This section describes the tasks required to maintain the PSE cluster daemon.

17.2.3.1 Signaling the farmd Daemon to Update Its Information

To immediately signal a farmd daemon to reread a modified PSE cluster database, run the following command on any of the PSE cluster members:

# kill -HUP `cat /var/run/farmd.clustername.pid`

The signaled farmd daemon propagates the SIGHUP signal to all other farmd daemons in the PSE cluster.


Note
Changes to a DNS-based PSE cluster database might not be propagated immediately to the secondary PSE cluster DNS servers. Two farmd daemons on two different PSE cluster members might read the PSE cluster database from different DNS servers, for example, when the members have different /etc/resolv.conf files. To ensure consistency, make sure the secondary PSE cluster DNS servers have received the new database before signaling the farmd daemon.
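
The kill command shown in this section fails with an unhelpful message if the pid file is missing or empty. The following is a minimal sketch of a more defensive helper. The function name signal_farmd and the FARMD_RUN_DIR override variable are hypothetical and are not part of the PSE kit.

```shell
# Hypothetical helper: send a signal to the farmd daemon of one cluster.
# FARMD_RUN_DIR is an illustrative override; this manual places the pid
# files in /var/run.
signal_farmd() {
    clustername=$1
    signal=${2:-HUP}
    pidfile="${FARMD_RUN_DIR:-/var/run}/farmd.$clustername.pid"
    if [ ! -r "$pidfile" ]; then
        echo "signal_farmd: cannot read $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    # Reject an empty or non-numeric pid before calling kill.
    case $pid in
        ''|*[!0-9]*)
            echo "signal_farmd: bad pid in $pidfile" >&2
            return 1
            ;;
    esac
    kill -"$signal" "$pid"
}
```

With no second argument the helper sends SIGHUP, which matches the command in this section. The same helper could also send the USR1 and USR2 signals described in Section 17.2.3.3.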

17.2.3.2 Modifying Job Slots

The number of job slots is the number of PSE cluster applications that a farmd daemon allows to run simultaneously for a given PSE cluster. The maximum number of job slots per PSE cluster is maintained in the /etc/rc.config file in the variable PSE_SCHEDULING_UNITS. By default, the number is 10. You can change this default by running the following command:

# pseconfig -j n

where n is the new number to take effect.

By modifying the maximum number of job slots, you can vary the potential system load that is caused by running PSE cluster applications. Setting the maximum number of job slots to 0 prevents the farmd daemons from accepting future PSE cluster applications. Refer to the following section for how to achieve the same effect temporarily.
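
Before changing the value, it can be useful to check what is currently configured. The following is a minimal sketch that reads PSE_SCHEDULING_UNITS directly from the configuration file. The function name get_job_slots is hypothetical, and the sketch assumes that /etc/rc.config stores the variable as a shell-style NAME="value" assignment. When the variable is absent, the sketch prints the documented default of 10.

```shell
# Hypothetical helper: print the configured maximum number of job slots.
# ASSUMPTION: the file holds shell-style NAME="value" assignments.
get_job_slots() {
    rc_file=${1:-/etc/rc.config}
    if [ ! -r "$rc_file" ]; then
        echo "get_job_slots: cannot read $rc_file" >&2
        return 1
    fi
    # Extract the numeric value, with or without surrounding quotes.
    # If the variable is assigned more than once, the last one wins.
    value=$(sed -n 's/^PSE_SCHEDULING_UNITS="\{0,1\}\([0-9][0-9]*\)"\{0,1\}[[:space:]]*$/\1/p' "$rc_file" | tail -n 1)
    # Fall back to the documented default when the variable is not set.
    echo "${value:-10}"
}
```

The function takes an optional file name so that it can be tried on a copy of the configuration file rather than on the live one.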

17.2.3.3 Temporarily Disabling and Enabling farmd

In some cases, it is desirable to temporarily prevent the farmd daemon from accepting PSE cluster application execution requests, for example, when a PSE cluster member is being used for some other purpose and should not be affected by the additional system load from PSE cluster applications. To disable a farmd daemon, issue the following command:

# kill -USR1 `cat /var/run/farmd.clustername.pid`

To reenable the PSE cluster daemon, use the following command:

# kill -USR2 `cat /var/run/farmd.clustername.pid`

17.2.3.4 Shutting Down a PSE Cluster Member

When a host is a member of a PSE cluster, exercise caution when you want to shut down the host for any reason. Even though no users are logged in to the host, there might be a PSE cluster application currently running on the host. Shutting down the system may severely disrupt a currently running PSE cluster application, even if it was not started at this host.

Follow this sequence to shut down a PSE cluster member:

  1. Before shutting down the system, you should disable the farmd daemon as explained in Section 17.2.3.3.
  2. Determine whether a PSE cluster application is currently running on the system by using the pspart command. Refer to pspart(1) for details.

    For example, use the following command to determine if any PSE cluster applications are running on your local host:

    # pspart -localhost
    

    If there is a PSE cluster application currently running, you should wait until the application terminates before shutting down the system.

  3. Shut down the PSE cluster member using the shutdown(8) command.

17.2.3.5 System Boot and Shutdown

After a host has been configured to be a member of one or more PSE clusters, one farmd daemon per PSE cluster starts at system boot. Each farmd daemon is started at system boot by the /sbin/rc3.d/S90farmd shell script. Similarly, each farmd daemon is shut down at system shutdown by the /sbin/rc2.d/K29farmd shell script.

Note that Digital UNIX provides a special convention for file naming in the run command directory structure (/sbin/rc0.d, /sbin/rc2.d, and /sbin/rc3.d). A prefix of either "K" or "S" on the command names in these directories determines whether the system starts or stops these commands. For further information, refer to the Digital UNIX (DEC OSF/1) guide to system administration.

The S90farmd and K29farmd shell scripts read the /etc/rc.config file for PSE cluster configuration information. The relevant variables in this file are:

Variable                    Explanation
PSE_CONF={YES|NO}           Whether the host is configured to be a PSE cluster member
PSE_SCHEDULING_UNITS        The maximum number of job slots per PSE cluster
PSE_CONFIGURED_DOMAINS=n    The number of PSE clusters for which the host is configured
PSE_FARM_DOMAIN_1           clustername1, the first PSE cluster name
PSE_FARM_DOMAIN_n           clusternamen, the nth PSE cluster name
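
The variables listed above can be combined to enumerate the clusters for which a host is configured. The following is a minimal sketch, not the logic of the S90farmd or K29farmd scripts themselves. The function name list_pse_clusters is hypothetical, and the sketch assumes that the file contains shell assignments that are safe to source.

```shell
# Hypothetical helper: print the name of every PSE cluster configured in
# an rc.config-style file.
# ASSUMPTION: the file contains shell assignments and is safe to source.
# The function body runs in a subshell so no variables leak to the caller.
list_pse_clusters() (
    rc_file=${1:-/etc/rc.config}
    unset PSE_CONF PSE_CONFIGURED_DOMAINS
    . "$rc_file" || exit 1
    # Print nothing if the host is not configured as a cluster member.
    [ "$PSE_CONF" = "YES" ] || exit 0
    i=1
    while [ "$i" -le "${PSE_CONFIGURED_DOMAINS:-0}" ]; do
        # Look up PSE_FARM_DOMAIN_1, PSE_FARM_DOMAIN_2, and so on.
        eval "name=\$PSE_FARM_DOMAIN_$i"
        echo "$name"
        i=$((i + 1))
    done
)
```

Each printed name corresponds to one farmd daemon and one /var/run/farmd.clustername.pid file, so the output could drive a loop that signals every daemon on the host.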

17.2.3.6 Monitoring farmd Activity for the Entire PSE Cluster

The PSE software provides the psemon utility to monitor PSE activity. The psemon utility displays PSE cluster configuration and load information. Refer to the psemon(1) reference page for details.