CS/ECE 757 Fall 2000
Homework 2: Beowulf Bakeoff
Due: In class Tuesday 11/21/2000
Last Updated: 11/15/00
For the Beowulf Bakeoff, the class will break into three groups of four people.
Each group will build a cluster made up of six nodes and each node will contain
two processors. At the end of the project, we will connect the three clusters
together to form a 20 node, 40 processor cluster. We will pick the cluster to
use as the base for the large cluster based on the features that the group
implemented, performance, scalability, maintainability, etc. The group that
"wins" (i.e., the group that builds the cluster that we choose to use as the
base of the large cluster) gets the six Celeron processors that we started
building with as a prize.
Hand in:
Below is a list of things that need to be handed in:
- List of implemented features and description of features
- User's guide covering how to use the features of the cluster, how to scale
the cluster (i.e., how to add nodes), etc.
- Graphs of performance of cluster
Features:
A minimal cluster must have the following features:
- A master node with a disk and five diskless slave nodes
- MPI
- Use of ECE's Yellow Pages server
- NFS mount ECE home directories
You may also wish to install the X libraries, since you will need them for
the programming assignment (homework 3).
Performance Analysis:
For the project, you must run the following performance tests and produce
graphs for each. These performance tests are from MPI (if your version of MPI
did not come with these test programs, you can get them by downloading MPICH
from the MPI web page). These
tests are in "mpich/examples/perftest". Note that results from these tests
generally vary each time they are run, so you should run each test multiple
times and report the average results.
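One way to do the averaging is sketched below in Python. This is only an illustration: the function name summarize_runs and the sample numbers are hypothetical, and you would need to collect the bandwidth figures from your own runs (the sketch does not parse the test output). It reports the standard deviation alongside the mean so that noisy configurations stand out on the graph.

```python
from statistics import mean, stdev

def summarize_runs(runs_by_procs):
    """Map each processor count to (mean, sample stdev, number of runs).

    runs_by_procs: dict of {processor count: [one measurement per run]}.
    """
    summary = {}
    for procs, values in sorted(runs_by_procs.items()):
        # A sample standard deviation needs at least two measurements.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[procs] = (mean(values), spread, len(values))
    return summary

# Hypothetical numbers for illustration only; these are not measured results.
example = {2: [10.0, 12.0, 11.0], 4: [20.0, 22.0]}
for procs, (avg, spread, n) in summarize_runs(example).items():
    print(f"{procs:2d} procs: mean={avg:.2f} stdev={spread:.2f} over {n} runs")
```

Plotting the mean against the processor count, with the standard deviation as an error bar, gives one point per case.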
- Run a stress test on the cluster using the following command:
mpirun -np <# of procs> stress -start 102400 204800 3200
This measures the bandwidth of the system by sending a large
message (size from 400K to 800K, increasing by 12.5K each step) from
each processor in turn to every other processor. Vary the number of
processors from 2-12 and graph the average aggregate bandwidth for
each case.
- Run the tcomm test on the cluster using the following command:
mpirun -np <# of procs> tcomm
This measures the speed of each communication channel in the system
by having each processor send a small token (on the order of a word or
two) to its neighbors. This both measures the overhead of sending
messages and shows whether different links run at different speeds.
If all of the links are about the same speed, tcomm
will simply report the average bandwidth. However, if there are
large differences in the speed, tcomm will print out a histogram of
the speeds. Vary the number of processors and graph the average
bandwidth for each case. In the cases where tcomm finds that there
is a large difference in speed, report the differences that tcomm
finds.
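For the stress command above, the three numbers 102400, 204800, and 3200 line up with the quoted 400K to 800K range and 12.5K step if they are read as start, end, and increment counted in 4-byte units. The Python sketch below checks that arithmetic. The 4-byte unit and the inclusive end point are assumptions made for illustration, not something confirmed from the stress program itself, and the helper name sweep_sizes is hypothetical.

```python
UNIT_BYTES = 4  # assumption: the arguments count 4-byte units

def sweep_sizes(start, end, step, unit_bytes=UNIT_BYTES):
    """Return the message sizes in bytes for a start/end/step sweep.

    The end point is treated as inclusive, which is an assumption.
    """
    return [count * unit_bytes for count in range(start, end + 1, step)]

sizes = sweep_sizes(102400, 204800, 3200)
print("steps:     ", len(sizes))
print("first (KB):", sizes[0] / 1024)
print("last (KB): ", sizes[-1] / 1024)
print("step (KB): ", (sizes[1] - sizes[0]) / 1024)
```

Under these assumptions the sweep covers 33 message sizes, which is useful to know when estimating how long a run at each processor count will take.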
Feel free to also hand in graphs for any other performance tests that you
perform, as these will help us choose which cluster to use as the base. You
may wish to perform many of these, since they will give you a better idea of
the behavior and performance of your cluster, which will give you some insight
into how to write the parallel program for the next assignment.