Hyperion Cluster: Using Mathematica

Using Mathematica

Starting with version 7, Mathematica supports running parallel jobs on a cluster using SGE.

To use Mathematica with the Hyperion cluster, you need to define the environment variables CLASSPATH and LD_LIBRARY_PATH with the correct values for SGE before starting Mathematica.

csh/tcsh syntax:

setenv CLASSPATH "/usr/local/share/ge2011.11/lib/drmaa.jar:/usr/local/share/ge2011.11/lib/jgdi.jar:/usr/local/share/ge2011.11/lib/juti.jar"
setenv LD_LIBRARY_PATH "/usr/local/share/ge2011.11/lib/linux-x64"

bash syntax:

export CLASSPATH=/usr/local/share/ge2011.11/lib/drmaa.jar:/usr/local/share/ge2011.11/lib/jgdi.jar:/usr/local/share/ge2011.11/lib/juti.jar
export LD_LIBRARY_PATH=/usr/local/share/ge2011.11/lib/linux-x64
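
A mistyped path in either variable is easy to miss, so a small check along the following lines might be run before starting Mathematica. This is only a sketch: check_sge_env is a hypothetical helper name, not part of SGE or Mathematica, and it assumes LD_LIBRARY_PATH holds a single directory, as in the examples above.

```shell
#!/bin/bash
# Sketch only: check_sge_env is a hypothetical helper, not an SGE command.
# It reports CLASSPATH entries that are not files and a library
# directory that does not exist, and returns non-zero if anything is missing.
check_sge_env() {
    local classpath="$1" libdir="$2" missing=0 entry
    local IFS=':'                 # split the classpath on colons
    for entry in $classpath; do
        if [ ! -f "$entry" ]; then
            echo "missing jar: $entry"
            missing=1
        fi
    done
    # Assumption: libdir is one directory, not a colon-separated list.
    if [ ! -d "$libdir" ]; then
        echo "missing library directory: $libdir"
        missing=1
    fi
    return "$missing"
}

# Possible usage:
#   check_sge_env "$CLASSPATH" "$LD_LIBRARY_PATH" && mathematica
```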

Next, you need to create a file named .sge_request in your home directory, or edit it if it already exists, and specify a value for h_rt (run time) in it, like this:

-l "h_rt=00:30:00"

The exact value you set h_rt to depends on the estimated runtime of your job. The format for the time is HH:MM:SS, so this example specifies an h_rt of 30 minutes and is equivalent to the qsub command

qsub -l "h_rt=00:30:00"
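
If you think of your runtime in minutes, the HH:MM:SS string can be produced with a little shell arithmetic. The helper below is a minimal sketch; fmt_h_rt is a hypothetical name and expects a plain whole number of minutes without leading zeros.

```shell
#!/bin/bash
# Sketch only: fmt_h_rt is a hypothetical helper, not an SGE command.
# It converts a whole number of minutes into the HH:MM:SS form used by h_rt.
fmt_h_rt() {
    local minutes="$1"
    printf '%02d:%02d:00\n' $((minutes / 60)) $((minutes % 60))
}

fmt_h_rt 30    # prints 00:30:00
```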

The .sge_request file is needed because the Hyperion cluster requires that you specify a runtime, but because of the way Mathematica interfaces with SGE, you cannot specify the runtime from within Mathematica. For more information, see the man page for sge_request and the section on wallclock time on the Using SGE page.
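
Creating or updating the file can also be scripted. The following is a sketch under stated assumptions: set_h_rt is a hypothetical helper, and it assumes that any existing h_rt request sits on a line of its own, because the whole line is dropped and rewritten.

```shell
#!/bin/bash
# Sketch only: set_h_rt is a hypothetical helper, not an SGE command.
# It rewrites FILE so that it contains exactly one h_rt request.
# Assumption: an existing h_rt request is on its own line; a line that
# combines h_rt with other resources would be removed as a whole.
set_h_rt() {
    local file="$1" value="$2" tmp
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # Keep every line except an earlier h_rt request.
        grep -v 'h_rt=' "$file" > "$tmp" || true
    fi
    printf -- '-l "h_rt=%s"\n' "$value" >> "$tmp"
    # Copy the contents back so the original file keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Possible usage:
#   set_h_rt "$HOME/.sge_request" 00:30:00
```

Running it a second time with a different value replaces the earlier request instead of adding another line.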

Once those steps are done, you can start Mathematica with the command 'mathematica'.

In your Mathematica program, you need to add the following commands at the start to set up SGE:

Needs["ClusterIntegration`"]
kernels = LaunchKernels[SGE["hyperion.sns.ias.edu", XX]]

where XX is the number of processors you want to use. Notice that there is an unpaired backquote at the end of 'ClusterIntegration' in the 'Needs' statement. That is not a typo. That single backquote is needed.

At the end of your parallel section, you MUST have this command:

CloseKernels[]

Failure to add the CloseKernels command will leave the kernel processes running on the cluster, where they keep occupying the processors you requested after your work is done.
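
If you suspect that kernels were left behind, you can look for them from the shell. The sketch below assumes the kernels show up as ordinary SGE jobs under your username; list_my_jobs is a hypothetical wrapper around the standard qstat command.

```shell
#!/bin/bash
# Sketch only: list_my_jobs is a hypothetical wrapper, not an SGE command.
# Assumption: leftover kernels appear as ordinary SGE jobs owned by you.
list_my_jobs() {
    if command -v qstat >/dev/null 2>&1; then
        # List the jobs that belong to the current user.
        qstat -u "${USER:-$(id -un)}"
    else
        echo "qstat not found on this host"
    fi
}

# Possible usage:
#   list_my_jobs
#   qdel JOB_ID    # remove a leftover job, using an ID from the listing
```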

Here's an example Mathematica program ("notebook"):

Needs["ClusterIntegration`"]
kernels = LaunchKernels[SGE["hyperion.sns.ias.edu", 4]]
ParallelMap[FactorInteger, (10^Range[20, 30] - 1)/9]
CloseKernels[]

That's all that's needed to run a job through SGE using the Mathematica GUI.

If you'd like to submit a job to the cluster from the command line, you can put the Mathematica commands in a text file (factor.m in this example). Then you can create a job script that runs "math" with that file as input. Note that the script defines the environment variables described above and uses SGE's -V switch to pass the environment on to the job:

#!/bin/bash
#$ -V
#$ -cwd
#$ -m abe

export CLASSPATH=/usr/local/share/ge2011.11/lib/drmaa.jar:/usr/local/share/ge2011.11/lib/jgdi.jar:/usr/local/share/ge2011.11/lib/juti.jar
export LD_LIBRARY_PATH=/usr/local/share/ge2011.11/lib/linux-x64

/usr/local/share/bin/math < factor.m

You can then submit this script to the queuing system using the 'qsub' command:

qsub factor.sh
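
The input file factor.m read by the job script can be written with any text editor. As a sketch, it could also be generated from the shell with a quoted here-document, which keeps the backquote in the Needs statement from being interpreted by the shell. The function name write_factor_m is hypothetical; the file contents are the example program shown earlier.

```shell
#!/bin/bash
# Sketch only: write_factor_m is a hypothetical helper.
# It writes the example Mathematica program to a file (default: factor.m).
write_factor_m() {
    local target="${1:-factor.m}"
    # The quoted 'EOF' stops the shell from treating the backquote specially.
    cat > "$target" <<'EOF'
Needs["ClusterIntegration`"]
kernels = LaunchKernels[SGE["hyperion.sns.ias.edu", 4]]
ParallelMap[FactorInteger, (10^Range[20, 30] - 1)/9]
CloseKernels[]
EOF
}

# Possible usage:
#   write_factor_m factor.m
#   qsub factor.sh
```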