What is PVM?

PVM (Parallel Virtual Machine) is a by-product of an ongoing heterogeneous network computing research project involving the authors and their institutions. The general goals of this project are to investigate issues in, and develop solutions for, heterogeneous concurrent computing. PVM is an integrated set of software tools and libraries that emulates a general-purpose, flexible, heterogeneous concurrent computing framework on interconnected computers of varied architecture. The overall objective of the PVM system is to enable such a collection of computers to be used cooperatively for concurrent or parallel computation. Detailed descriptions and discussions of the concepts, logistics, and methodologies involved in this network-based computing process are contained in the remainder of the book. Briefly, the principles upon which PVM is based include the following:

User-configured host pool : The application's computational tasks execute on a set of machines that are selected by the user for a given run of the PVM program. Both single-CPU machines and hardware multiprocessors (including shared-memory and distributed-memory computers) may be part of the host pool. The host pool may be altered by adding and deleting machines during operation (an important feature for fault tolerance).

Translucent access to hardware: Application programs either may view the hardware environment as an attributeless collection of virtual processing elements or may choose to exploit the capabilities of specific machines in the host pool by positioning certain computational tasks on the most appropriate computers.

Process-based computation: The unit of parallelism in PVM is a task (often but not always a Unix process), an independent sequential thread of control that alternates between communication and computation. No process-to-processor mapping is implied or enforced by PVM; in particular, multiple tasks may execute on a single processor.

Explicit message-passing model: Collections of computational tasks, each performing a part of an application's workload using data-, functional-, or hybrid decomposition, cooperate by explicitly sending and receiving messages to one another. Message size is limited only by the amount of available memory.

Heterogeneity support: The PVM system supports heterogeneity in terms of machines, networks, and applications. With regard to message passing, PVM permits messages containing more than one data type to be exchanged between machines having different data representations.

Multiprocessor support: PVM uses the native message-passing facilities on multiprocessors to take advantage of the underlying hardware. Vendors often supply their own optimized PVM for their systems, which can still communicate with the public PVM version.

The PVM system is composed of two parts. The first part is a daemon , called pvmd3 and sometimes abbreviated pvmd , that resides on all the computers making up the virtual machine. (An example of a daemon program is the mail program that runs in the background and handles all the incoming and outgoing electronic mail on a computer.) Pvmd3 is designed so any user with a valid login can install this daemon on a machine. When a user wishes to run a PVM application, he first creates a virtual machine by starting up PVM. The PVM application can then be started from a Unix prompt on any of the hosts. Multiple users can configure overlapping virtual machines, and each user can execute several PVM applications simultaneously.

The second part of the system is a library of PVM interface routines. It contains a functionally complete repertoire of primitives that are needed for cooperation between tasks of an application. This library contains user-callable routines for message passing, spawning processes, coordinating tasks, and modifying the virtual machine.

The PVM computing model is based on the notion that an application consists of several tasks. Each task is responsible for a part of the application's computational workload. Sometimes an application is parallelized along its functions; that is, each task performs a different function, for example, input, problem setup, solution, output, and display. This process is often called functional parallelism . A more common method of parallelizing an application is called data parallelism . In this method all the tasks are the same, but each one only knows and solves a small part of the data. This is also referred to as the SPMD (single-program multiple-data) model of computing. PVM supports either or a mixture of these methods. Depending on their functions, tasks may execute in parallel and may need to synchronize or exchange data, although this is not always the case. An exemplary diagram of the PVM computing model is shown in Figure A. and an architectural view of the PVM system, highlighting the heterogeneity of the computing platforms supported by PVM, is shown in Figure B.

The PVM system currently supports C, C++, and Fortran languages. This set of language interfaces have been included based on the observation that the predominant majority of target applications are written in C and Fortran, with an emerging trend in experimenting with object-based languages and methodologies.

The C and C++ language bindings for the PVM user interface library are implemented as functions, following the general conventions used by most C systems, including Unix-like operating systems. To elaborate, function arguments are a combination of value parameters and pointers as appropriate, and function result values indicate the outcome of the call. In addition, macro definitions are used for system constants, and global variables such as errno and pvm_errno are the mechanism for discriminating between multiple possible outcomes. Application programs written in C and C++ access PVM library functions by linking against an archival library (libpvm3.a) that is part of the standard distribution.

Fortran language bindings are implemented as subroutines rather than as functions. This approach was taken because some compilers on the supported architectures would not reliably interface Fortran functions with C functions. One immediate implication of this is that an additional argument is introduced into each PVM library call for status results to be returned to the invoking program. Also, library routines for the placement and retrieval of typed data in message buffers are unified, with an additional parameter indicating the datatype. Apart from these differences (and the standard naming prefixes - pvm_ for C, and pvmf for Fortran), a one-to-one correspondence exists between the two language bindings. Fortran interfaces to PVM are implemented as library stubs that in turn invoke the corresponding C routines, after casting and/or dereferencing arguments as appropriate. Thus, Fortran applications are required to link against the stubs library (libfpvm3.a) as well as the C library.

All PVM tasks are identified by an integer task identifier (TID) . Messages are sent to and received from tids. Since tids must be unique across the entire virtual machine, they are supplied by the local pvmd and are not user chosen. Although PVM encodes information into each TID the user is expected to treat the tids as opaque integer identifiers. PVM contains several routines that return TID values so that the user application can identify other tasks in the system.

There are applications where it is natural to think of a group of tasks . And there are cases where a user would like to identify his tasks by the numbers 0 - (p - 1), where p is the number of tasks. PVM includes the concept of user named groups. When a task joins a group, it is assigned a unique ``instance'' number in that group. Instance numbers start at 0 and count up. In keeping with the PVM philosophy, the group functions are designed to be very general and transparent to the user. For example, any PVM task can join or leave any group at any time without having to inform any other task in the affected groups. Also, groups can overlap, and tasks can broadcast messages to groups of which they are not a member.  To use any of the group functions, a program must be linked with libgpvm3.a .

The general paradigm for application programming with PVM is as follows. A user writes one or more sequential programs in C, C++, or Fortran 77 that contain embedded calls to the PVM library. Each program corresponds to a task making up the application. These programs are compiled for each architecture in the host pool, and the resulting object files are placed at a location accessible from machines in the host pool. To execute an application, a user typically starts one copy of one task (usually the ``master'' or ``initiating'' task) by hand from a machine within the host pool. This process subsequently starts other PVM tasks, eventually resulting in a collection of active tasks that then compute locally and exchange messages with each other to solve the problem. Note that while the above is a typical scenario, as many tasks as appropriate may be started manually. As mentioned earlier, tasks interact through explicit message passing, identifying each other with a system-assigned, opaque TID.

Shown in Figure C is the body of the PVM program hello, a simple example that illustrates the basic concepts of PVM programming. This program is intended to be invoked manually; after printing its task id (obtained with pvm_mytid()), it initiates a copy of another program called hello_other using the pvm_spawn() function. A successful spawn causes the program to execute a blocking receive using pvm_recv. After receiving the message, the program prints the message sent by its counterpart, as well its task id; the buffer is extracted from the message using pvm_upkstr. The final pvm_exit call dissociates the program from the PVM system.

Figure D: PVM program hello_other.c

Figure D is a listing of the ``slave'' or spawned program; its first PVM action is to obtain the task id of the ``master'' using the pvm_parent call. This program then obtains its hostname and transmits it to the master using the three-call sequence - pvm_initsend to initialize the send buffer; pvm_pkstr to place a string, in a strongly typed and architecture-independent manner, into the send buffer; and pvm_send to transmit it to the destination process specified by ptid, ``tagging'' the message with the number 1.

back

 

Scientific Computing Example

Here is an example of a scientific computing application for a Beowulf cluster. The same shows us how to solve for the value of using the PVM libraries and the C programming Language.

This Master/Worker code demonstrates the bag of tasks programming paradigm. The value of pi is approximated by:

Getting the integral of the function 4/(1+x2) and evaluating it over 0 to 1

The worker tasks are spawned, and while more intervals remain, the master sends a worker an interval and a point between 0 and 1. This point is used to compute the height of a rectangle that is as wide as the interval. Workers compute the area of the rectangle and send the area back to the master who sums up these areas. When all intervals are computed, the master sends an interval value of -1 to signal completion.

This is the source code of the master/node programs for computing the value of pi.

// file : comppi.master.c

#include <stdio.h>

#include "pvm3.h"

#define NTASKS 5

#define INTERVALS 1000

main ()

{

int mytid;

int tids[NTASKS];

float sum=0.0, area;

float width;

int i, numt, msgtype, bufid, bytes, who, int_num;

/* Enroll in PVM */

mytid = pvm_mytid();

/* spawn off NTASKS workers */

numt = pvm_spawn("comppi.worker", NULL, PvmTaskDefault, "", NTASKS,

tids);

if (numt != NTASKS) {

printf ("Error in spawning, numt= %d\n",numt);

exit(0);

}

/* compute interval width */

width = 0.0;

i = 0;

/* Multi-cast initial dummy message to workers */

msgtype = 0;

pvm_initsend(PvmDataDefault);

pvm_pkint(&i, 1, 1);

pvm_pkfloat(&width, 1, 1);

pvm_mcast(tids, NTASKS, msgtype);

width = 1.0 / INTERVALS;

/* for each interval, 1) receive area from worker

2) add area to sum

3) send worker new interval number and width */

for (i = 0; i < INTERVALS; i++) {

int_num = i+1;

bufid = pvm_recv(-1,-1);

pvm_bufinfo(bufid, &bytes, &msgtype, &who);

pvm_upkfloat(&area, 1, 1);

sum += area;

pvm_initsend(PvmDataDefault);

pvm_pkint(&int_num, 1, 1);

pvm_pkfloat(&width, 1, 1);

pvm_send(who, msgtype);

}

int_num = -1; /* Signal to workers that tasks are done */

pvm_initsend(PvmDataDefault);

pvm_pkint(&int_num, 1, 1);

pvm_pkfloat(&width, 1, 1);

/* Collect the last NTASK areas and send the completion signal */

for (i = 0; i < NTASKS; i++) {

bufid = pvm_recv(-1,-1);

pvm_bufinfo(bufid, &bytes, &msgtype, &who);

pvm_upkfloat(&area, 1, 1);

sum += area;

pvm_send(who, msgtype);

}

printf("Computed value of Pi is %8.6f\n",sum);

pvm_exit();

}

// file : comppi.slave.c

// author: C. Breshears

#include <stdio.h>

#include "pvm3.h"

#define f(x) ((float)(4.0/(1.0+x*x)))

main ()

{

int mytid, master;

float area;

float width, int_val, height;

int int_num;

/* Enroll in PVM */

mytid = pvm_mytid();

/* who is sending me work? */

master = pvm_parent();

/* receive first job from the master */

pvm_recv(-1, -1);

pvm_upkint(&int_num, 1, 1);

pvm_upkfloat(&width, 1, 1);

/* While I've not been sent the signal to quit, I'll keep processing */

while (int_num != -1) {

/* compute interval value from interval number */

int_val = int_num * width;

/* compute height of given rectangle */

height = f(int_val);

/* compute area */

area = height * width;

/* send area back to master */

pvm_initsend(PvmDataDefault);

pvm_pkfloat(&area, 1, 1);

pvm_send(master, 9);

/* Wait for next job from master */

pvm_recv(-1, -1);

pvm_upkint(&int_num, 1, 1);

pvm_upkfloat(&width, 1, 1);

}

/* all done */

pvm_exit();

}

back