Matlab

Contents

  1. Summary and Version Information
  2. Running a MATLAB script from the command line
  3. MATLAB and HPC
  4. Built-in multithreaded functions
  5. MATLAB Parallel Computing Toolbox
  6. MATLAB Parallel Server/Distributed Computing Server
    1. Instructions on configuring Parallel Server/MDCS
    2. Instructions on using Parallel Server/MDCS
    3. Links to additional information
  7. Installing add-ons/packages/etc
  8. External resources
    1. Tutorials on using MATLAB from MathWorks

Summary and Version Information

Package: Matlab
Description: Matlab
Categories: Numerical Analysis

Version | Module tag   | Availability*                                                                    | GPU Ready
2009b   | matlab/2009b | Non-HPC Glue systems (All OSes)                                                  | Y
2010b   | matlab/2010b | Non-HPC Glue systems, Evergreen HPCC (Linux)                                     | Y
2011a   | matlab/2011a | Non-HPC Glue systems, Bswift HPCC (Linux)                                        | Y
2011b   | matlab/2011b | Non-HPC Glue systems, Evergreen HPCC, Bswift HPCC (Linux)                        | Y
2012b   | matlab/2012b | Non-HPC Glue systems, Bswift HPCC (Linux)                                        | Y
2013b   | matlab/2013b | Non-HPC Glue systems (RedHat6)                                                   | Y
2014a   | matlab/2014a | Non-HPC Glue systems (RedHat6)                                                   | Y
2014b   | matlab/2014b | Non-HPC Glue systems, Deepthought HPCC, Bswift HPCC, Deepthought2 HPCC (RedHat6) | Y
2015a   | matlab/2015a | Non-HPC Glue systems (RedHat6)                                                   | Y
2015b   | matlab/2015b | Non-HPC Glue systems, Deepthought HPCC, Deepthought2 HPCC (RedHat6)              | Y
2016a   | matlab/2016a | Non-HPC Glue systems, Deepthought HPCC, Deepthought2 HPCC (RedHat6)              | Y
2016b   | matlab/2016b | Non-HPC Glue systems, Deepthought HPCC, Deepthought2 HPCC (RedHat6)              | Y
2017a   | matlab/2017a | Non-HPC Glue systems, Deepthought HPCC, Deepthought2 HPCC (RedHat6)              | Y
2018a   | matlab/2018a | Non-HPC Glue systems (RedHat6)                                                   | Y
2018b   | matlab/2018b | Non-HPC Glue systems (RedHat6)                                                   | Y
2019a   | matlab/2019a | Non-HPC Glue systems (RedHat6)                                                   | Y
2019b   | matlab/2019b | Non-HPC Glue systems (RedHat6)                                                   | Y

Notes:
*: A package labelled as "available" on an HPC cluster can be used on the compute nodes of that cluster. Even software not listed as available on an HPC cluster is generally available on the login nodes of the cluster (assuming it is available for the appropriate OS version, e.g. RedHat Linux 6 for the two Deepthought clusters). This is because the compute nodes do not use AFS and instead have local copies of the AFS software tree, so we only install packages on them as requested. Contact us if you need a version listed as not available on one of the clusters.

In general, you need to prepare your Unix environment to be able to use this software. To do this, either:

  • tap TAPFOO
OR
  • module load MODFOO

where TAPFOO and MODFOO are one of the tags in the tap and module columns above, respectively. The tap command will print a short usage text (use -q to suppress this; this is needed in startup dot files); you can get a similar text with module help MODFOO. For more information, see the documentation on the tap and module commands.

For packages which are libraries which other codes get built against, see the section on compiling codes for more help.

Tap/module commands listed with a version of current will set up for what we consider the most current stable and tested version of the package installed on the system. The exact version is subject to change with little if any notice, and might be platform dependent. Versions labelled new represent a newer version of the package which is still being tested by users; if stability is not a primary concern you are encouraged to use it. Those with versions listed as old set up for an older version of the package; you should only use this if the newer versions are causing issues. Old versions may be dropped after a while. Again, the exact versions are subject to change with little if any notice.

In general, you can abbreviate the module tags. If no version is given, the default current version is used. For packages with compiler/MPI/etc dependencies, if a compiler module or MPI library was previously loaded, module will try to load the build of the package matching that compiler/MPI. If you specify the compiler/MPI dependency explicitly, it will attempt to load the corresponding compiler/MPI library for you if needed.
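
For example, using the module tags in the table above, you might load MATLAB from the shell as follows (the version tag shown is just an illustration; use whichever version from the table you need):

# Load the default (current) MATLAB version
module load matlab

# Or load a specific version from the table above
module load matlab/2019b

# Show the short help text for a module
module help matlab/2019b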

Running a MATLAB script from the command line

While most people use MATLAB interactively, there are times when you might wish to run a MATLAB script from the command line or from within a shell script. Usually in this situation you have a file containing MATLAB commands, one command per line, and you want to start MATLAB, run the commands in that file, and save the output to another file, all without the MATLAB GUI starting up (often the process will be running somewhere without a screen readily available to display the GUI).

WARNING
If you are running Matlab jobs on one of the Deepthought or Juggernaut high-performance computing clusters, please include a #SBATCH -L matlab directive near the top of your job script. This is because we have been having issues with HPC users depleting the campus Matlab license pool. The above directive will ask Slurm for a matlab license, which will be used to throttle the number of simultaneous Matlab jobs running on the clusters. If all the matlab users on the cluster abide by this policy, hopefully there will be no more issues with license depletion. If such an issue occurs, we will regrettably have to kill some matlab jobs (starting with those that did NOT request a license) to free up licenses. We are hoping in the next several months to obtain a truly unlimited matlab license on campus, but until then we ask that HPC users include the above directive in their matlab jobs.

This can be broken down into several distinct parts:

  1. Get MATLAB to run without the GUI, etc.
  2. Get MATLAB to start running your script, and exit when your script is done.
  3. Get the output of the MATLAB command saved to a file.

The first part is handled with the following options to be passed to the MATLAB command: -nodisplay and -nosplash. The first disables the GUI, the latter disables the MATLAB splash screen that gets displayed before the GUI starts up.

The second step is handled using the -r option, which specifies a command which MATLAB should run when it starts up. You can give it any valid MATLAB command, but typically you just want to tell it to read commands from your file, and then to exit; otherwise it will just sit at the prompt waiting for additional commands. One reason to keep it simple is that the command string has to be quoted to keep the Unix shell from interpreting it, and that can get tricky for complicated commands.

Typically, you would give an argument like matlab -r "run('./myscript.m'); exit" (and you would include the -nodisplay and -nosplash arguments before the -r if you wanted to disable the GUI as well); where myscript.m is your script file, and is located in the current working directory. The exit causes MATLAB to exit once the script completes.

The third part is handled with standard Unix file redirection.

Putting it all together, if you had a script myscript.m in the directory ~/my-matlab-stuff, and you want to run it from a shell script putting the output in myscript.out in the same directory, you could do something like

#!/bin/tcsh

module load matlab
cd ~/my-matlab-stuff
matlab -nodisplay -nosplash -r "run('./myscript.m'); exit" > ./myscript.out

MATLAB and HPC

Mathworks currently provides two products to help with parallelization:

  1. Parallel Computing Toolbox (PCT): This provides support for parallel for loops (the parfor command), as well as some CUDA support for using GPUs. However, without the MATLAB Parallel Server (formerly named Distributed Computing Server), there are limits on the number of workers that can be created, and all workers must be on the same node.
  2. MATLAB Parallel Server (known as Distributed Computing Server (MDCS) prior to 2019): This extends MATLAB desktop workflows to the cluster hardware, and allows you to submit MATLAB jobs to the cluster without having to learn anything about the cluster command line interface.

In addition, some of the built-in linear algebra and numerical functions are multithreaded as well.

WARNING
If you are running Matlab jobs on one of the Deepthought or Juggernaut high-performance computing clusters, please include a #SBATCH -L matlab directive near the top of your job script. (This is NOT needed for Matlab DCS jobs). This is because we have been having issues with HPC users depleting the campus Matlab license pool. The above directive will ask Slurm for a matlab license, which will be used to throttle the number of simultaneous Matlab jobs running on the clusters. If all the matlab users on the cluster abide by this policy, hopefully there will be no more issues with license depletion. If such an issue occurs, we will regrettably have to kill some matlab jobs (starting with those that did NOT request a license) to free up licenses. We are hoping in the next several months to obtain a truly unlimited matlab license on campus, but until then we ask that HPC users include the above directive in their matlab jobs.

Built-in multithreaded functions

A number of the Matlab built-in functions, especially linear algebra and numerical functions, are multithreaded and will automatically parallelize in that way.

This parallelization is shared memory, via threads, and so is restricted to within a single compute node. So normally your job submission scripts should explicitly specify that you want all your cores on a single node.

For example, if your matlab code is in the file myjob.m, you might use a job submission script like:

#!/bin/bash
#SBATCH -t 2:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --mem-per-cpu=1024
#SBATCH -L matlab

. ~/.profile
module load matlab

matlab -nodisplay -nosplash -r "run('myjob.m'); exit" > myjob.out

and your matlab script should contain the line

	maxNumCompThreads(12);
somewhere near the beginning. This restricts Matlab to the requested number of cores --- if it is omitted, Matlab will try to use all cores on the node.
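
If you would rather not hard-code the core count, a small sketch (not part of the original example) is to read it from the SLURM_NTASKS environment variable, which Slurm sets inside the job to match the -n value requested in the submission script:

% Sketch: set the thread count from the Slurm allocation rather than hard-coding it.
% SLURM_NTASKS is set by Slurm inside the job; fall back to 1 if it is not set.
ncores = str2double(getenv('SLURM_NTASKS'));
if isnan(ncores)
    ncores = 1;
end
maxNumCompThreads(ncores);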

MATLAB Parallel Computing Toolbox

The MATLAB Parallel Computing Toolbox allows you to parallelize your MATLAB jobs to take advantage of multiple CPUs on either your desktop or an HPC cluster. This toolbox provides parallel-optimized built-in MATLAB functions, including the parfor parallel loop command.

A simple example matlab script would be


% Allocate a pool
% We use the default pool, which will consist of all cores on your current
% node (up to 12 for MATLABs before R2014a)
parpool
% For MATLAB versions before R2013b, use "matlabpool open"


% Pre-allocate a vector
A = zeros(1,100000);
xfactor = 1/100;

% Assign values in a parallel for loop
parfor i = 1:length(A)
	A(i) = xfactor*i*sin(xfactor*i);
end

Assuming the above MATLAB script is in a file ptest1.m in the directory /lustre/payerle/matlab-tests, we can submit it with the following script to sbatch:

#!/bin/tcsh
#SBATCH -n 20
#SBATCH -N 1
#SBATCH -L matlab

module load matlab

matlab -nodisplay -nosplash \
	-r "run('/lustre/payerle/matlab-tests/ptest1.m'); exit" \
	> /lustre/payerle/matlab-tests/ptest1.out

You would probably want to add directives to specify other job submission parameters as well, such as the wall time limit and memory requirements.

NOTE: It is important that you specify a single node in all of the above, as without using Matlab Parallel Server/Distributed Computing Server the parallelization above is restricted to a single node.

MATLAB Parallel Server/Distributed Computing Server

The MATLAB Parallel Server (known as Distributed Computing Server (MDCS) before 2019) allows you to extend your MATLAB workflows from your desktop to a High Performance Computing (HPC) cluster without having to learn the details of submitting jobs to the cluster. This tool allows you to run Matlab on your desktop workstation and submit jobs from that Matlab session to the Deepthought2 or Juggernaut HPC cluster to run on its compute nodes, thereby reducing the need to directly interact with the Unix environment on the HPC clusters. Parallel Server/MDCS works with the Parallel Computing Toolbox discussed above, and extends its functionality to allow jobs spanning multiple compute nodes.

NOTE: The MATLAB Parallel Server/Distributed Computing Server is currently only available on the Deepthought2 and Juggernaut clusters. It is NOT currently available on MARCC/Bluecrab

This section is divided into several subsections:

  1. Instructions on configuring Parallel Server/MDCS
  2. Instructions on using Parallel Server/MDCS
  3. Links to additional information

Instructions on configuring MATLAB Parallel Server/Distributed Computing Server (MDCS)

Because Parallel Server/MDCS allows you to submit jobs to the UMD HPC clusters right from your desktop workstation, some configuration of Matlab on your workstation is required. This section discusses the configuration process, starting with the UMD-provided configuration scripts and profiles, followed by a more general discussion.

This configuration will need to be done once on your workstation before you can use Parallel Server/MDCS. If you are running Parallel Server/MDCS from multiple workstations, you will need to perform this configuration once per workstation. You will also need to redo the configuration if you upgrade the version of Matlab on your workstation. But otherwise, it is a one-time configuration process (although it does not hurt to redo it).

In all cases, note that you MUST be running the same release of Matlab on your desktop and on the cluster. We have over half a dozen Matlab versions available on the cluster, and generally update at least once a year. If you need a newer version than is available on the cluster, please contact us and we will work on adding the newer version to the cluster. If you have an old version that is not supported on the cluster, please upgrade the version on your workstation.

Configuring Parallel Server/MDCS using UMD configuration scripts/profiles

The UMD setup for Parallel Server/MDCS is distributed in two parts. The main part is a zip file or tarball containing the actual scripts to integrate Parallel Server/MDCS with the Slurm scheduler on the HPC clusters. Although these are mostly independent of the version of Matlab being run, there was a change in Matlab between versions R2016b and R2017a, and so there are two sets of files: one for before this change (versions 1.x) and one for after (versions 2.x). Also, some changes were made in R2019a that require the use of version 2.1.0 for R2019a and later versions (version 2.1.0 is backwards compatible with R2017a-R2018b). In each case (R2016b or before, or R2017a or later), we provide both zip and tar files --- both contain the same scripts, and you only need one; the two formats are just provided for your convenience. Windows users will likely prefer the zip files; Unix users will likely prefer the tarballs. The second part is a small file containing the profile settings --- this is dependent on the Matlab version and the cluster you wish to connect to.

A note on versioning: Over the years, there have been a number of tweaks to these scripts. The original version did not have any version number associated with it, which could lead to confusion as the settings files sometimes require a specific version of the scripts. Starting with version 1.1.0 of the scripts, we have added a text file called README.Glue.MDCS-Slurm-Integration containing some information about the scripts and a version number.

Currently we have the following versions (and release dates):

  1. Version 0.1: Released Fall 2014. Oldest version, no longer supported.
  2. Version 1.1.0: Released 30 Mar 2018. Deprecated.
  3. Version 1.2.0: Released 20 Dec 2018. Supports Matlab versions up to R2016b.
  4. Version 2.0.0: Released 20 Dec 2018. Deprecated. Supports Matlab versions R2017a through R2018b.
  5. Version 2.1.0: Released 26 Jun 2019. Supports Matlab versions R2017a and later (tarball/zipfile renamed from umd_deepthought2 to umd_mdcs).

The current versions of the script files are listed below. Note: The *.zip and *.tar.gz files have the same contents; you generally should just download the one which is most convenient for your system. Windows users will likely want the zip file, Unix users likely the tarball.

You will need to unzip/untar the contents of the above integration scripts zipfile/tarball into a directory in your Matlab path on your workstation. To find out what directories are in your Matlab user path, you can issue the command userpath from within Matlab. Typically, this will be one of:

  • My Documents\MATLAB or Documents\MATLAB on Windows systems, or
  • ~/Documents/MATLAB or $matlab/toolbox/local on Linux systems.

If you are reinstalling the scripts (e.g. you are installing a newer version of the scripts and/or upgraded Matlab version), please delete (or at least rename) all of the old versions. You should check all of the directories in the userpath above. In particular, make sure that you delete:

  • configCluster.m
  • ClusterInfo.m
  • The +profiles/+umd/+deepthought2 and +profiles\+umd\+deepthought2 directories (unless you added other profiles for other clusters, it should be safe to just delete the whole +profiles directory tree).
  • The UMD-Slurm-Integration-Scripts directory.

To confirm that you successfully deleted the old version, you can restart Matlab and make sure that the command configCluster returns an Undefined function or variable error. You should also find the Create and Manage Clusters menu entry under the Parallel menu bar and delete any old UMD cluster profiles (you should NOT delete the local or Matlab Parallel Cloud profiles if present).

Once you determine which directory in your Matlab userpath you wish to use, you should unzip/untar the zipfile/tarball into that directory. The configCluster.m should be at the top level of that directory. You will also need to place a profile settings file in this same directory. For version 2.x of the scripts, there is a directory UMD-Slurm-Integration-Scripts placed in this directory as well. If you really wish to, you can move that directory elsewhere (but then you will need to enter the path to that new directory in the configCluster script discussed below).
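
For example, on a Linux workstation whose Matlab userpath is ~/Documents/MATLAB, unpacking the version 2.x tarball might look like the following (the tarball name is illustrative; use the actual name of the file you downloaded):

# Unpack the integration scripts into the Matlab userpath directory
mkdir -p ~/Documents/MATLAB
tar -xzf umd_mdcs.tar.gz -C ~/Documents/MATLAB

# configCluster.m should now be at the top level of that directory
ls ~/Documents/MATLAB/configCluster.m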

In addition to the integration scripts in the zipfile or tarball above, you will need a profile settings file. You should choose the version that matches the release of Matlab you are running on your workstation. If the Matlab release on your workstation is newer than any releases listed below, please contact us and we will work to install the new release of Matlab on the cluster. If the version of Matlab you are using is not listed below, but a newer one is, then please upgrade your Matlab. (Depending on your browser, you probably need to do something like right click on the link and use "Save link as ..." to save these to a file):

NOTE:The settings files for Matlab releases R2014a through R2016b require 1.x versions of the integration scripts zipfile/tarball.

Once you have downloaded one or more of the settings files above, you need to place them in the same directory you unzipped/untarred the integration scripts. I.e., your deepthought2_remote_r20*.settings or juggernaut_remote_r20*.settings files must be in the same directory as the configCluster.m script.

WARNING
NOTE: There was an issue in the MDCS-Slurm integration scripts which manifested itself in the 2016a, 2016b and 2017a releases of Matlab. If you are trying to use Matlab Parallel Server/MDCS with Matlab version 2016a or newer, you will need to use a newer version of the scripts in the umd_deepthought2.tar.gz or umd_deepthought2.zip files (version 1.1.0/30 March 2018 or later).

This is an insidious issue: with the older versions of the integration scripts, a multicore job will be submitted to the HPC cluster, but Matlab will only avail itself of a single core. Thus, without the newer, fixed version, your Matlab job will run, but poorly, wasting CPU time and SUs. If in doubt, please upgrade.

You do NOT need to update the profile settings files (they are not impacted by this issue).

If you upgrade the version of Matlab on your workstation, you will need to fetch the corresponding settings file for the new Matlab version. If you are upgrading from a Matlab version before R2017a to R2017a or later, you will need to delete the integration scripts and install the new version of the integration scripts from the zip file/tarball as described above. If you are only upgrading from and to versions older than 2017a, or from and to versions 2017a or newer, then reinstalling the integration scripts from the zipfile/tarball is probably not strictly needed. However, if a newer version of the integration scripts is available, it is recommended that you update.

Once the settings file and the files from the zip file/tarball are installed, you need to configure things. The configCluster.m script, included in the zipfile/tarball, tries to make this easier. It will prompt you for various configuration parameters (e.g. how to authenticate to the cluster) and save them to make things easier when you start using Parallel Server/MDCS. You should only need to run this script once per workstation, the first time you wish to use Parallel Server/MDCS with that version of Matlab for a specific cluster. (If you upgrade the Matlab version on your workstation, you will need to run configCluster again.) However, it is safe to run it multiple times if you wish to change parameters; note, however, that any settings you made in ClusterInfo will be lost if you rerun the configCluster command.

  1. From the Matlab command prompt, type configCluster. This will print the version of the configCluster command being used, and search for all profiles matching the version of Matlab you are running. If there are multiple matching profiles (i.e. you have profiles for multiple clusters), it will prompt you as to which one to use. If only one profile is found, it assumes that you want to use it. If for some reason it cannot find any profiles, it will list all the appropriate settings files it can find and ask your help in choosing one. But usually this means you forgot to install the correct settings file.
  2. The version of configCluster for Matlab R2017a and later (version 2.x of the configCluster.m code) will then ask for the location of the Slurm integration scripts. Unless you moved the UMD-Slurm-Integration-Scripts subdirectory after unzipping or untarring the zipfile/tarball, you can just hit return for the default. If you did move the directory, give the path that you moved it to.
  3. Next, the script will prompt for your username on the cluster. Enter your username. Remember that your username is all lowercase.
  4. Next, the script will ask how you wish to authenticate. There are three options:
    • password: If you select this, the first time you use Parallel Server/MDCS in any Matlab session, Matlab will prompt you to enter your password on the cluster. (It will remember the password for the remainder of that Matlab session, so you only need to enter once per session). This is probably the simplest option, and it is recommended for beginning users.
    • identity: If you select this, you will use an RSA public key authentication. This requires additional setup on both your workstation and the cluster to use. The configCluster will ask some more questions if this option is chosen. Although additional setup is required, once done, it makes passwordless authentication between your workstation and the cluster possible, and is possibly the most convenient option for advanced users.
    • ask: If you select this, the choice between password and identity file authentication is deferred until you actually use Parallel Server/MDCS. This mimics the behavior of older versions of the integration scripts, and will prompt if you want to use an identity file whenever you use Parallel Server/MDCS. I only recommend this option if you are experimenting with setting up the identity key authentication but are not sure if you know how to do it properly, as this does not lock you in. Once you figure out how to do it, I recommend rerunning configCluster and choosing identity.
  5. If you selected to use identity file based authentication, you will need to have set up a private-public key pair and enabled SSH public-key authentication to the HPC cluster using the private key created (a sketch of creating such a key pair is shown after this list). The configCluster will ask you at this time for the path to the private key "identity file" corresponding to the public key you authorized on the HPC cluster. It will also ask you if the private key file is passphrase encrypted. Encrypting the private key is strongly recommended for security reasons --- otherwise anyone who gets access to the private key file can access the cluster as you. If you encrypted the private key, type 'y' and Matlab will prompt you for the passphrase once per Matlab session. Otherwise type 'n'.
  6. The script will then proceed to create a profile named Deepthought2 Remote MATLAB_VERSION (or Juggernaut Remote MATLAB_VERSION for the Juggernaut cluster), where MATLAB_VERSION will reflect the version of Matlab being run (e.g. R2018b), and make it the default profile. If a profile of that name existed previously, it will be overwritten. It also resets the ClusterInfo structure, setting values as appropriate based on the responses you gave.
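
If you wish to use identity file authentication, a minimal sketch of creating and authorizing a key pair from a Linux or Mac workstation is shown below; the key file name and the use of the Deepthought2 login host are examples only, and the standard ssh-keygen and ssh-copy-id tools are assumed to be available:

# Generate a new RSA key pair; choose a passphrase when prompted (recommended)
ssh-keygen -t rsa -f ~/.ssh/id_rsa_hpc

# Authorize the public key on the cluster (enter your cluster password when asked)
ssh-copy-id -i ~/.ssh/id_rsa_hpc.pub your_username@login.deepthought2.umd.edu

The private key file (~/.ssh/id_rsa_hpc in this sketch) is what you would then give to configCluster as the identity file.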

To make full use of Matlab DCS, you might also need to open some ports in your workstation firewall. The parpool and related functionality need to be able to communicate with your workstation. The range of ports required to be opened for this to work is complicated, but typically I find enabling the TCP ports 27370 through 27470 should work for most if not all Matlab versions. These ports should be allowed for the subnet containing the Deepthought2 compute nodes (10.103.128.0/19).
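
As an illustration only, on a Linux workstation running the ufw firewall, opening this port range to the Deepthought2 compute-node subnet might look like the following (Windows and other firewalls have their own equivalents; consult your firewall's documentation):

# Allow inbound TCP ports 27370-27470 from the Deepthought2 compute node subnet
sudo ufw allow from 10.103.128.0/19 to any port 27370:27470 proto tcp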

You should now be ready to use Matlab DCS. You can open Create and Manage Clusters in the Parallel dropdown and see your new profile, and you can "validate" the profile if you wish. Note that validating the profile submits several jobs to the cluster, requesting up to the number of workers you selected on the validation page (in newer Matlabs; older versions do not give this option) or, if not selected, the number of workers available to the cluster (NumWorkers) in the profile definition. As this number does not appear to limit the number of workers you can use in real production work, we normally set it to 20 so that your validation does not wait forever in the queue. But be aware that even so it can take a while for all the validation jobs to run.

Manually configuring Parallel Server/MDCS

The Division of Information Technology and the High Performance Computing group do not officially support manual configuration of Parallel Server/MDCS. Only the above scripts are officially supported. However, we provide below some basic pointers should you want to tweak things. We even limit that to Matlab versions R2017a and later (earlier versions use a different profile structure which we do not cover).

There are basically two parts to the configuration, somewhat distinct but with many interconnections. The first part is a set of scripts (mostly Matlab *.m scripts, plus a couple of Bourne shell scripts), basically everything in the umd_deepthought2.*.zip or equivalent tarball except for configCluster.m. The Matlab scripts mostly define functions that are needed by the DCS and Parallel Computing Toolbox codes for interacting with the batch scheduler, transferring files, etc. The Bourne shell scripts are used for submission to the Slurm sbatch command. We use tweaked versions of the standard Slurm integration scripts from MathWorks. We use the nonshared variant since we do not expect workstations to have access to the standard Slurm commands or a shared filesystem with the cluster. The tweaks mostly involve having the ClusterInfo structure save information about the desired authentication parameters to make things more user friendly.

The various scripts are described in MathWorks documentation, but basically include:

  • ClusterInfo.m: this defines a structure allowing one to pass additional parameters for jobs (like allocation account, walltime, etc).
  • communicatingJobWrapper.sh: this is the job script used for parallel jobs
  • independentJobWrapper.sh: this is the job script used for serial jobs
  • cancelJobFcn.m: defines function to cancel a Slurm job
  • deleteJobFcn.m: defines function to delete a Slurm job
  • getJobStateFcn.m: defines function to determine state of a Slurm job
  • communicatingSubmitFcn.m: defines function to submit a parallel job
  • independentSubmitFcn.m: defines function to submit a serial job
  • extractJobId.m: defines function to extract job ids from Slurm output
  • createSubmitScript.m: defines function to create job scripts
  • getSubmitString.m: defines function to generate the actual sbatch command and arguments
  • getCommonSubmitArgs.m: defines function to generate some of the arguments to sbatch
  • getRemoteConnection.m: defines function to get connection to the cluster

The ClusterInfo.m script must reside somewhere in your Matlab userpath. The others all go in a directory of your choosing, but which is referenced in the profile. You probably will need to take either the UMD or the Mathworks variants of the above.

The other part of the configuration is the actual cluster profile. Again, we use a generic scheduler profile because it is expected that the Slurm commands are not available on your workstation. The MathWorks documentation covers the configuration of the profile in more detail, but basically:

  • JobStorageLocation: this is the path on the local system where job data should be stored.
  • NumWorkers: I believe that this is simply the default number of workers to use when validating the profile. I generally use something like 20 to minimize the resources requested during validation (so that validation does not take forever).
  • ClusterMatlabRoot: this defines the root of the Matlab installation on the cluster. I.e., it should be the value of the environmental variable MATLAB_ROOT on the cluster after you run module load matlab/MATLAB_VERSION. Remember that you must use the same version of Matlab as is running on your workstation.
  • RequiresMathWorksHostedLicensing: should be false on our clusters.
  • LicenseNumber: leave unset on our clusters.
  • OperatingSystem: should be set to 'unix' (the OS running on the Cluster, not your workstation)
  • HasSharedFilesystem: should be false, as your workstation does not share a filesystem with the cluster
  • IntegrationScriptsLocation: this defines where the integration scripts above are located
  • AdditionalProperties: this defines a structure with additional properties for the scheduler. It must define:
    • ClusterHost: the login host for the cluster, e.g. login.deepthought2.umd.edu
    • RemoteJobStorageLocation: this is the path on the cluster where job data should be stored. You need to have write access to the directory specified. We normally use something under your lustre directory.
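
As a rough illustration only (the supported path is the configCluster script described above), the properties above could be set programmatically on a generic cluster object in MATLAB R2017a-R2018b along the following lines; all paths, host names, and the profile name below are placeholders:

% Sketch of building a generic cluster profile by hand (placeholder values throughout).
c = parallel.cluster.Generic;
c.JobStorageLocation = '/home/your_username/MatlabJobs';   % local job data directory
c.NumWorkers = 20;                                         % modest limit, mainly for validation
c.ClusterMatlabRoot = '/path/to/matlab/on/cluster';        % value of MATLAB_ROOT on the cluster
c.OperatingSystem = 'unix';
c.HasSharedFilesystem = false;
c.IntegrationScriptsLocation = '/home/your_username/Documents/MATLAB/UMD-Slurm-Integration-Scripts';
c.AdditionalProperties.ClusterHost = 'login.deepthought2.umd.edu';
c.AdditionalProperties.RemoteJobStorageLocation = '/lustre/your_username/MatlabJobs';
c.saveAsProfile('Deepthought2 Remote R2018b');             % save it as a named profile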

The following URLs contain additional information you might find useful:

Using Matlab DCS with the DT2 Cluster

The following is a quick guide to using Matlab DCS to submit jobs to the DT2 cluster. It is assumed that you have already completed the setup described above on your workstation at some point (downloading and extracting the integration scripts to the appropriate location, downloading the correct profile settings file to the appropriate location, and running configCluster). These steps should only need to be done once per user per workstation (although you will need to repeat at least getting the new profile and rerunning configCluster if/when you upgrade Matlab).

  1. First, you need to define a "cluster" to submit jobs to. This holds the information about the parallel workers, etc. For most cases, it will suffice to enter a command like:

    >> c = parcluster;

    You can choose whatever variable you like instead of c, but if so be sure to change it in the following examples as well.

  2. You will generally need to define additional parameters (e.g. WallTime, etc) needed for your job. This is done via the ClusterInfo object in Matlab and is discussed further below. The settings are "sticky", so you probably will not need to update them very often. For this simple example, the default values will work so you can skip this step.
  3. You can then create and submit jobs to be run on the remote cluster. The following is a simple example:
    >> j = c.batch(@pwd, 1, {} );
    >> j.wait
    >> j.fetchOutputs(:)
    
    ans =
    
    /a/fs-3/export/home/deepthought2/mltrain
    
    >>
    >> j.delete
    >>

    The variable j holds the "job"; you can use whatever variable you like. In this case, the "job" is created when we create a batch job on our parcluster c. For this example, we are simply running the builtin pwd command; in most cases you would instead give a string with the name of a user-defined function (e.g. the name of a "*.m" file without the ".m" extension). The 1 in the batch command means that the function is expected to return 1 output argument. The braces {} contain the list of input values for the function; in this case, pwd does not take any input arguments, so we do not provide any.

    The first time you submit a job to the HPC cluster in a particular Matlab session, a pop-up message will be displayed asking if you wish to "Use an identity file to login to login.deepthought2.umd.edu?" (or whatever the login node for the cluster is). If you answer "No", you will be prompted for your password on the HPC cluster, and this is the recommended response for new users. Answering "Yes" requires one to setup RSA public key authentication on the HPC login nodes; you will be prompted to provide the location of the identity file and asked if the file requires a passphrase. In all cases, Matlab will remember this information (your password, or the location and/or passphrase to the identity file) for the remainder of your Matlab session.

    When you issue the batch command, a job is submitted to the scheduler to run on the HPC cluster compute nodes. Depending on how busy the cluster is, the job might or might not start immediately, and even if it starts immediately, it will generally (except in overly simple test cases such as this) take a while to run. The j.wait will not return until the job is completed. You might instead wish to use the c.Jobs command to see the state of all of your jobs. Although you can submit other jobs (be sure to store the jobs in different variables) and perform other calculations while your job(s) are pending/running, you cannot examine their output until they complete.

    To examine the results of a job (after it has completed), you can use the j.fetchOutputs(:) function as shown in the example. In the above example, you can see that it returned the path to the home directory of the Matlab test login account that it was run from. If the job does not finish successfully, you probably will not be able to get anything useful from the fetchOutputs function. In such cases, you should look at the error logs (which can be lengthy) using the getDebugLog function. There are separate logs for each worker in the job, so you will need to do something like:

    >> j.Parent.getDebugLog(j.Tasks(1))

    Note: The fetchOutputs function will only return the values returned from the function you called; data which has been written to files will not be returned. For such data, you will need to manually log into the HPC cluster to retrieve the information.

    The above example is unrealistically simple. In practice, you will generally need to set some more job parameters --- although Matlab DCS hides some of the complexity of submitting jobs to an HPC cluster from the user, it cannot hide all of it. In general, the settings for your job will be obtained from the ClusterInfo object in Matlab. You can use the command ClusterInfo.state() to see all of the current settings, and in general the commands ClusterInfo.getFOO() and ClusterInfo.setFOO(VALUE) can be used to query the value of a particular setting FOO, or set such to VALUE. Notable fields are:

    • WallTime: this sets the maximum wall time for the job. If not set, the default is 15 minutes, which is probably too short for real jobs. This can be given using one of the following formats:
      • MINUTES
      • DAYS-HOURS:MINUTES
      • DAYS-HOURS
      • HOURS:MINUTES:SECONDS
    • MemUsage: this sets the memory per CPU-core/task to be reserved. This should be given as a number of MB per core.
    • ProjectName: this specifies the allocation account to which the job will be charged. Your default allocation account will be charged if none is specified.
    • QueueName: this specifies the partition the job should run on. Normally you will not wish to set this unless you wish to run on the debug or scavenger partitions.
    • UseGpu: If you wish for your job to use GPUs, you should set this to the number of GPUs to use. That will cause Slurm to schedule your job on a node with GPUs; additional work may be needed to get Matlab to actually use the GPUs.
    • GpusPerNode: If you set UseGpu, the system will by default request a single GPU per node. You can set GpusPerNode to the number of GPUs you wish to request per node. Obviously, on Deepthought2 the only value that makes sense is 2 (as 1 is the default, and at most 2 GPUs are available per node).
    • EmailAddress: if set, it will cause Slurm to send email to the address provided on all job state changes. The default is not to send any email.
    • Reservation: if set, the job will use the specified reservation.
    • UserDefinedOptions: This is a catch-all for any other options you need to provide to Slurm for your job. You should just present sbatch flags as you would on the command line. E.g., to specify that you wish to allow other jobs to run on the same node as your job, you can provide the value --share. You can provide multiple Slurm arguments in this string by just putting spaces between the arguments in the string.
    • The following example shows how to set a walltime of 4 hours and request 4 GB/core (4096 MB/core):

      >> ClusterInfo.setWallTime('4:00:00')
      >> ClusterInfo.setMemUsage('4096')
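
Putting the pieces together, a hedged end-to-end sketch of submitting a small parallel job through Parallel Server/MDCS might look like the following; the function name mysim and the resource values are placeholders for illustration:

>> c = parcluster;                          % default UMD cluster profile
>> ClusterInfo.setWallTime('4:00:00')       % 4 hour wall time limit
>> ClusterInfo.setMemUsage('2048')          % 2 GB per core
>> j = c.batch('mysim', 1, {}, 'Pool', 19); % 19 pool workers plus 1 task running mysim
>> j.wait                                   % block until the job finishes
>> out = j.fetchOutputs(:)                  % fetch the returned value(s) as a cell array
>> j.delete                                 % clean up the job's data when done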

Links with additional information related to MATLAB Parallel Server/Distributed Computing Server (MDCS)

The sections above should provide some basics to help you get started with MATLAB Parallel Server/Distributed Computing Server (MDCS), but a comprehensive discussion is well beyond the scope of this page. We provide here a few links to more information on Parallel Server/MDCS for your convenience.

In the fall of 2014, we had a tutorial on MDCS led by an instructor from MathWorks. The documentation from that is provided below --- although it is a little dated and there have been some minor changes since then, the basic concepts might still be useful.

MathWorks also has a significant amount of web documentation on Parallel Server/MDCS available at https://www.mathworks.com/help/mdce/.

Installing add-ons/packages/etc

The campus MATLAB license includes a fair number of licensed toolboxes. However, there is also a large number of free and community-provided toolboxes --- far too many for the Division of Information Technology to install them all. For the most part, any individual toolkit/toolbox/package add-on is only used by at most a handful of people, so it is more efficient for users to install these themselves.

This is relatively simple to do in the more recent MATLAB versions; from your main MATLAB screen, click on the "Add-Ons" drop-down and select "Get Add-Ons". This might take a little while to open due to the large number of add-ons available, but once open there are a number of ways to look for add-ons. If you know which add-on you want, the search bar at the top right might be the easiest way to find it. Find the add-on you desire and click on it.

Once the window for the particular add-on opens, there should be a button labeled "Install" in the upper right. Click on that, and the add-on should be installed into the appropriate location in your home directory.

You will likely need an account with Mathworks/Matlab in order to download the add-ons. You can create such an account at https://www.mathworks.com/mwaccount/register; it is advised that you register with your "@umd.edu" email address to get the full benefits of your association with the University.

External Resources

For your convenience, we provide some links to some non-University resources about MATLAB which you might find useful.

Tutorials from Mathworks

These are free tutorials from Mathworks, the company which produces MATLAB.