Revision History: Revision 2.3, Published 2006/10/27 10:45:03 (UTC) by Bernd Kallies
Parallelized applications use at least one of the parallel programming paradigms: message passing (MPI, LAPI, PVM) or shared-memory parallelism (SMP). The paradigm determines how the application is compiled and run, and each paradigm has specific hardware requirements. Running a parallel application interactively also differs from running it in batch.

Applications that use message passing can run on more than one node and communicate via a network. Shared-memory parallel applications, in contrast, can run on one node only and do not need a network.

Note: If you are not sure which type of parallel program your application is, consult the appropriate program manual, then follow the instructions given below. You should also know the current configuration of the HLRN machines and understand at least the terms node, task, thread, and network.

Parallel programs that use message passing (MPI, LAPI, PVM) are handled by the Parallel Operating Environment (POE) on IBM SP machines. The POE consists of
The poe command enables the user to load and execute programs on different nodes. It acts like mpirun or mpprun on other platforms. When you start a program with poe, a number of instances of it (tasks) are loaded on the resources you requested. The resources may be on one node or on different nodes. To do this, follow the poe command with the program name, then any options of the program, then any poe command line options. If the program was compiled and linked with the mpXXX compilers, the poe command can be omitted. Thus, the following two commands are equivalent for these programs:

    $ poe a.out [options to a.out ...] [poe options ...]
    $ a.out [options to a.out ...] [poe options ...]

The poe command takes command line options that define resources such as the number of MPI tasks or the network. When running an MPI program interactively, these options are given on the command line. Alternatively, they can be set via environment variables, which share the common prefix MP_. Options that are not given take default values. When running an MPI program in batch using LoadLeveler, many of the poe options are overridden by the corresponding LoadLeveler keywords. Table 7.1 shows the most important poe options to get started.

Table 7.1. Important poe options
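The relation between poe command line flags and MP_ environment variables can be sketched as follows. This is a minimal sketch, not one of the original examples: it reuses the flag values of Example 7.1 and assumes the usual POE naming, in which a flag such as -nodes corresponds to the variable MP_NODES. The check for the poe command is only there so that the script can be tried on a machine without POE.

```shell
#!/bin/ksh
# Sketch: give poe its resource options through MP_ environment
# variables instead of command line flags.
# Assumption: each flag -xyz maps to a variable MP_XYZ.
export MP_RMPOOL=0            # corresponds to -rmpool 0
export MP_NODES=1             # corresponds to -nodes 1
export MP_TASKS_PER_NODE=2    # corresponds to -tasks_per_node 2
export MP_LABELIO=yes         # corresponds to -labelio yes
export MP_STDOUTMODE=ordered  # corresponds to -stdoutmode ordered

# Start two instances of uname; no poe flags are needed any more.
if command -v poe >/dev/null 2>&1; then
    poe uname -n -s
else
    # Fallback for machines without POE: only report what would be run.
    echo "poe not found; would run: poe uname -n -s"
fi
```

Setting the variables once in a shell session keeps repeated interactive calls short. A flag given on the command line should still override the corresponding variable for that single call.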
The following examples comment on the use of poe interactively or from within a batch job. Chapter 10, Examples gives additional working examples including short source codes for sample programs.

Example 7.1 shows how to invoke instances of the uname command on different nodes. Three variants are given, all of which do the same. The first calls poe interactively with command line flags. The second calls poe interactively with environment variables. The third calls poe from within a LoadLeveler script.

Example 7.1. Usage of poe to start instances of a non-parallel program

Interactive, poe command line flags:

    $ poe uname -n -s -rmpool 0 -nodes 1 -tasks_per_node 2 -labelio yes -stdoutmode ordered

LoadLeveler script:

    #!/bin/ksh
    # @ job_type = parallel
    # @ node = 1
    # @ tasks_per_node = 2
    # @ resources = ConsumableCpus(1) ConsumableMemory(16 mb)
    # @ output = poe_ex1.llout
    # @ error = $(output)
    # @ class = cdev
    # @ queue
    poe uname -n -s -labelio yes -stdoutmode ordered

The output (variants 1 and 2: stdout, variant 3: file poe_ex1.llout) is something like

    0:AIX hreg02a-en0
    1:AIX hreg02a-en0

Example 7.2 shows how to invoke an MPI program with a given total number of tasks. Control over how the tasks are allocated is left to LoadLeveler. Use of the HPS adapters is requested. The environment settings shown are critical for performance. The input shown is appropriate for the majority of MPI applications. It is assumed that the executable is named a.out, and that it needs an input file called input.in as argument.

Example 7.2. Usage of poe to start an MPI program

Interactive:

    $ export MP_SHARED_MEMORY=yes
    $ export MP_WAIT_MODE=poll
    $ export MP_SINGLE_THREAD=yes
    $ a.out input.in -rmpool 0 -procs 6 -euidevice sn_all -euilib us

LoadLeveler script:

    #!/bin/ksh
    # @ job_type = parallel
    # @ total_tasks = 6
    # @ blocking = unlimited
    # @ network.mpi = sn_all,,us
    # @ resources = ConsumableCpus(1) ConsumableMemory(1968 mb)
    # @ output = poe_ex2.llout
    # @ error = $(output)
    # @ environment = MEMORY_AFFINITY=MCM; \
    #   MP_SHARED_MEMORY=yes; MP_WAIT_MODE=poll; \
    #   MP_SINGLE_THREAD=yes; MP_TASK_AFFINITY=MCM
    # @ wall_clock_limit = 70,60
    # @ node_usage = shared
    # @ queue
    ./a.out input.in

Shared-memory parallel programs usually spawn a number of POSIX threads. The number of threads and their behaviour are usually defined at run time via environment variables, or by options specific to the application.
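As an illustration of such a run-time setting, the following minimal sketch fixes the thread count of an OpenMP program before starting it. It is a sketch under assumptions, not a documented HLRN recipe: the executable name ./a.out follows the examples above, NTHREADS is a hypothetical helper variable, and the commented XLSMPOPTS line assumes the IBM XL compiler runtime.

```shell
#!/bin/ksh
# Sketch: define the number of threads of an OpenMP program at run time.
# NTHREADS is a hypothetical helper variable; ./a.out is a placeholder.
NTHREADS=4

# Standardized OpenMP variable that sets the number of threads.
export OMP_NUM_THREADS=$NTHREADS

# Assumed AIX alternative for programs built with the XL compilers:
# export XLSMPOPTS="parthds=$NTHREADS"

if [ -x ./a.out ]; then
    ./a.out
else
    # Fallback when no executable is present: only report the setting.
    echo "would run ./a.out with $OMP_NUM_THREADS threads"
fi
```

The thread count should not exceed the number of CPUs requested for the job, since the threads of a shared-memory program all run on one node.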
Consult the appropriate documentation of your application to find out how to do the setup.

Most SMP applications follow the OpenMP standard. Their run-time behaviour depends on the setting of some environment variables. The OpenMP standard defines the OMP_XXXX variables for this purpose. On AIX, a set of XLSMPOPTS settings serves the same purpose. If both are set, the OpenMP variable takes precedence.

Please see the following documents for a more detailed description of POE and LoadLeveler:
2003-2008 © Norddeutscher Verbund für Hoch- und Höchstleistungsrechnen (HLRN)