Differences between the new Zaratan and old Deepthought2 HPC clusters
On this page we document some of the major differences you may encounter when moving from the old Deepthought2 cluster to the new Zaratan cluster. The intent is to help users of Deepthought2 and other previous UMD clusters get up to speed on the new cluster quickly.
There are a number of changes we are making with the Zaratan cluster, and although we believe these will make for an overall better experience for all HPC users, we want to enumerate them to help users with the transition. To help you navigate, we have divided the changes into the categories below (with some changes listed in multiple categories).
- Major Differences
- Hardware Differences
- Storage Differences
- Login/Access/Accounting Differences
- Environment Differences
- Differences pertaining to the Software Library
Major Differences
This is just an overview of the differences we consider most significant, for users who do not have time to review the full list. We encourage all users of the Deepthought2 cluster who are transitioning to Zaratan to find time to look over the full list, but until then, here are what we believe are the most critical differences. (Note: we will update this list if we receive many questions on items not in it, so it might be worth returning to this page if you run into issues.)
Hardware Differences
Of course, the hardware on the Zaratan cluster is significantly newer than what was available on the old Deepthought2 cluster. The differences likely to have the most impact on users are:
- AMD CPUs: The Zaratan cluster will consist of AMD CPUs when it debuts, as opposed to the Intel CPUs used in the Deepthought2 cluster. This choice was made to maximize performance per unit cost. It might require users to change some compiler flags in order to get the best optimizations; see the compiler-flag sketch after this list.
- Increased number of cores per node: The standard compute nodes on the Zaratan cluster have dual AMD Zen3 EPYC 7763 CPUs with 64 cores per CPU, for a total of 128 cores per node, whereas the Deepthought2 standard compute nodes had dual Intel IvyBridge 2680v2 CPUs with 10 cores per CPU, for a total of 20 cores per node. So the Zaratan nodes have over 6 times the number of cores of a standard Deepthought2 compute node. This means there are possibilities for increased scaling of shared-memory parallelization schemes (e.g. OpenMP, TBB, and other thread-based paradigms). But it also means that users must take care when submitting jobs so that they do not accidentally reserve an entire node when they only need a handful of cores; a sample job script requesting only part of a node is sketched after this list. We will be changing the default exclusive-mode policy so that most jobs will default to being shared.
- Memory per core: The standard compute nodes on Zaratan will have 512 GB of RAM. While this means that jobs that do not need all of the cores on a node can use significantly more memory than on Deepthought2 (which had only 128 GB of RAM per standard compute node), the average memory per core is only 4 GB/core, as opposed to a bit over 6 GB/core on Deepthought2. Also note that the charging mechanism on Zaratan has been modified so that jobs requesting more than the average memory per CPU core will be charged additionally --- this is to prevent unfair situations such as a job requesting 1 CPU core and 500 GB of RAM, thereby effectively monopolizing the node while only being charged for 1/128 of a node (the sample job script after this list shows an explicit per-core memory request).
- GPUs: The Zaratan cluster includes 20 nodes, each with four NVIDIA A100 Tensor Core GPUs (using the Ampere architecture, supporting CUDA compute capability 8.0). These are much more powerful than the K20 GPUs available on Deepthought2. Indeed, because the new GPUs might be more powerful than many jobs require, we are looking into splitting some of the GPUs into multiple, smaller GPUs (using NVIDIA Multi-Instance GPU technology) to significantly increase the number of GPUs available on the cluster. Some of the GPUs will remain unsplit for jobs requiring the full power of the A100 GPUs --- we will be adjusting the ratio of split/unsplit GPUs to maximize their utilization as we observe the jobs being sent to them. To better utilize the GPUs, we have created a partition for GPU jobs. You will still want to specify a GPU GRES when you submit a job, and we will be creating new GPU-model-specific GRESes to allow jobs to specify exactly what GPU is required. This will allow one to distinguish between the split and full A100s, as well as any future GPU models added to the cluster. A sample GPU job script is sketched after this list.
- Non-infiniband nodes: The Zaratan cluster also includes about 19 nodes, each with dual AMD Zen3 EPYC 7502 CPUs with 32 cores per CPU, for a total of 64 cores per node, and 1 TB of RAM per node (or 16 GB/core). These nodes do not have high-speed infiniband interconnects, only dual 25 Gb/sec ethernet. As such, we are targeting these for jobs which do not require the high bandwidth and low latency of the standard (infiniband) compute nodes. This includes sequential jobs, high-throughput computing, and similar jobs which will fit onto a single node. These nodes also have large amounts of local flash storage which could be useful in some data-intensive jobs. These nodes will be accessible using the 'nonib' queue; see the note after the sample job script below.
- Interconnects: The networking interconnects between most of the nodes are in general significantly faster than those on Deepthought2. The standard compute nodes have HDR100 infiniband interconnects, running at 100 Gb/sec. The fileservers and GPU nodes have full HDR interconnects, for 200 Gb/sec. The sequential nodes do not have infiniband, so they use 10 Gb/sec ethernet.
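To make the compiler-flag point above concrete, here is a minimal sketch of building a code for the Zen3 CPUs with GCC. The file names are placeholders, and the exact compilers and modules you use may differ; consult the software library documentation for the recommended settings.

    # Hypothetical build of a C code tuned for the AMD Zen3 CPUs in the
    # standard compute nodes.  With GCC 11 or newer, -march=znver3 targets
    # Zen3 directly; -march=native picks the host CPU when compiling on the
    # same type of node you will run on.
    gcc -O3 -march=znver3 -o mycode mycode.c

    # Older GCC releases without znver3 support can fall back to znver2:
    gcc -O3 -march=znver2 -o mycode mycode.c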
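To illustrate the shared-node and memory-charging points above, here is a minimal sketch of a Slurm batch script that requests only the cores and memory it needs rather than a whole node. The job name, time limit, and program name are placeholders, and your allocation account must still be specified as usual.

    #!/bin/bash
    #SBATCH --job-name=threaded-job    # placeholder name
    #SBATCH --ntasks=1                 # one task ...
    #SBATCH --cpus-per-task=8          # ... using 8 of the 128 cores on a node
    #SBATCH --mem-per-cpu=4096         # 4 GB per core; staying at or below the
                                       # average avoids the extra memory charge
    #SBATCH --time=1:00:00             # placeholder walltime

    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # for OpenMP/threaded codes
    ./mycode

A single-node job like this which also does not need the infiniband fabric could be sent to the non-infiniband nodes instead by adding a line such as #SBATCH --partition=nonib (matching the 'nonib' queue mentioned above); check the main documentation for the exact partition name.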
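Similarly, here is a sketch of a batch script for the GPU partition. The partition name and GRES string below (gpu, a100) are assumptions for illustration; the actual partition and GRES names (including any for the MIG-split GPUs) will be given in the main Zaratan documentation.

    #!/bin/bash
    #SBATCH --job-name=gpu-job         # placeholder name
    #SBATCH --partition=gpu            # assumed name of the GPU partition
    #SBATCH --gres=gpu:a100:1          # assumed model-specific GRES requesting
                                       # one full (unsplit) A100
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=4
    #SBATCH --time=1:00:00             # placeholder walltime

    module load cuda                   # assumed module name
    ./my_gpu_code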
Storage Differences
- High Performance File System/Scratch Storage:
The high performance filesystem on Zaratan will be 2 PB in size, double what was available on Deepthought2, and will use BeeGFS as the underlying filesystem.
- SHELL (Medium Term) Storage:
Zaratan will include some 10 PB of SHELL medium-term storage. This storage space is intended for important files and data which is part of ongoing research but is not being actively used by jobs running on the cluster. It will be accessible from the login nodes, but not from the compute nodes. The SHELL storage system leverages the Auristor File System, basically an enhanced version of the AFS filesystem which has long been used at UMD. This is a global filesystem with clients available for all modern operating systems, which makes the content securely accessible (secured by Kerberos credentials) wherever it is needed. For more information, see our SHELL documentation.
- Project Directories:
On Deepthought2, users had individual work directories directly under the root of the high performance filesystem (HPFS), e.g. /lustre/job. On Zaratan, storage will generally be organized by projects. Users will have a single home directory regardless of how many projects/allocations they belong to. Users will also have a personal directory underneath their project's scratch and/or SHELL storage directory (assuming the project has such resources). Symlinks to these personal directories (scratch-$project and shell-$project) will be created in the user's home directory; an illustrative layout is sketched after this list.
NOTE: All data stored by users on the UMD-maintained HPC systems is considered to belong to the principal investigator (PI) responsible for the allocation/project. Do not store any information on these clusters which is not to be shared with your PI.
- Storage Quotas:
All storage systems on Zaratan will have enforced quotas, and these quotas will be attached to projects. Projects and allocations will receive quotas for the scratch and medium-term storage resources when the allocation/project is created, and the quotas are only valid for the duration of the respective allocation/project. Note that there are separate quotas for the scratch and SHELL storage systems.
These storage quotas are evaluated and assigned in a similar fashion to CPU time --- for allocations from the AAC, this requires the requestor to specify and justify their storage needs to the satisfaction of the committee. As with CPU time, the more that is requested, the better the justification required, and beyond a certain threshold, payment will be required.
The quotas for the project apply to the combined usage of all members of the project. Additional quotas may be applied to usage by individual members of the project --- these can be adjusted on request of the PI (as long as the combined usage remains within the project limit).
NOTE: Even though the storage systems have quotas enforced, users are still expected to adhere to University and HPC cluster policies regarding the use of storage resources. This includes moving or deleting data on the scratch filesystem which is no longer needed for active jobs/research on the cluster, and compressing data on the medium-term storage tier when it is not going to be accessed for a while. We will verify proper use of the disk resources before any requests for additional storage are considered, and may perform spot checks.
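To make the project-based layout and symlinks concrete, here is a hypothetical view of a home directory for a user jdoe in a project named foo; the symlink names follow the scratch-$project and shell-$project pattern described above, while the target paths shown are purely illustrative.

    # Hypothetical listing for user 'jdoe' in project 'foo'
    $ ls -l ~
    lrwxrwxrwx ... scratch-foo -> /path/to/scratch/foo/jdoe   # illustrative target
    lrwxrwxrwx ... shell-foo   -> /path/to/shell/foo/jdoe     # illustrative target

    # Job input/output belongs on the scratch space, e.g.:
    $ cd ~/scratch-foo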
Login/Access/Accounting Differences
With Zaratan, we are introducing a number of changes related to
accessing the system and accounting.
- Logging into the system:
To access the login nodes on Zaratan, ssh to login.zaratan.umd.edu; a brief example is sketched after this list.
- Multifactor Authentication:
Access to the Zaratan cluster will require multi-factor authentication, using the standard campus DUO MFA system. Since the standard campus VPN already performs multi-factor authentication, ssh connections to the login nodes from the VPN will not prompt you for a second factor. But if you are not on the campus VPN, when you ssh to the Zaratan login nodes you will first be prompted for your password and then prompted to enter a passcode (or a single digit for a "push"). See the main documentation about MFA logins to the cluster.
- Allocation Re-structuring:
In order to make HPC at UMD more self-sustaining financially, we are (at the behest of the provost's office) implementing some changes in the process for creating allocations. Requests for new allocations and the renewal of existing allocations are still made via the same application form to the Allocations and Advisory Committee (AAC), but the allocation levels have changed. All faculty are eligible to receive a basic allocation, with additional compute time available from the AAC up to some threshold. If your computing needs exceed this threshold, additional time can be purchased --- monies from such purchases will be used for upgrading the cluster. These are all renewable annually. All of these requests can be made via the aforementioned form; only minimal information is needed for the basic allocation, with more information needed as more compute time is requested. Note that since storage is now subject to quotas, allocations will now also include (and allocation requests will need to request and justify) storage quotas on the high performance/scratch and SHELL/medium-term storage systems. Please see the following page on allocation levels and costs for more detailed information, including pricing information that can be included in grants.
In addition, some colleges (e.g. CMNS and Engineering) may have their own pools of compute time which they have paid for and which they may award to their faculty. The allotment of these college-level pools is up to the colleges.
Allocations will now consist of a single allocation account, instead of the two-tier (standard and high-priority allocation accounts) system used on Deepthought2. The non-paid allocations from the AAC will be allotted annually; paid allocations will be allotted quarterly.
- Allocation Management:
We are introducing a ColdFront portal which faculty members can use to review and manage their allocations. Faculty members who have received an allocation on any of the DIT-maintained clusters will be able to log in to the portal and see information about their project and allocations. PIs can see current usage statistics (both for compute and storage), see which people have access to their allocation, and manage that access. Users of the cluster must still be in the campus LDAP directory and have active Glue/TerpConnect accounts.
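A brief sketch of the login sequence described above; replace username with your own directory ID.

    # ssh to the Zaratan login nodes
    ssh username@login.zaratan.umd.edu
    # Off the campus VPN you will be prompted for your password and then for a
    # DUO passcode (or a single digit to trigger a push to your device).
    # On the campus VPN, which already performs MFA, only the password prompt
    # appears.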
Differences pertaining to the Software Library
- Versioning in the Software Library:
If you do not specify a version in the module command, you will now generally get the latest installed version of the requested package which is compatible with any previously loaded packages (e.g. compiler, MPI libraries). An illustrative example is sketched after this list.
- Toolchains:
We plan to support several 'toolchains' consisting of compilers and basic libraries. We plan to update these toolchains roughly annually, and will give them versions based on the year of the update. The basic toolchain families will be:
- gnu: This toolchain consists of free/open-source packages based on the GNU Compiler Collection. It includes OpenMPI, OpenBLAS, FFTW, etc.
- intel: This toolchain is based on the legacy Intel compilers (icc, ifort, etc.) along with the Intel Math Kernel Library.
- oneapi: This toolchain is based on the new clang-based Intel compilers (icx, ifx, etc.) along with the Intel Math Kernel Library.
- aocc: This toolchain is based on the AMD optimizing compilers (AOCC clang, flang, etc.) along with the AMD-optimized math libraries.
- nvchpc: This toolchain is based on the compilers from the NVidia HPC toolkit (nvc, nvfortran, etc.) and their MPI and libraries.
Most installed packages are built with the gnu toolchain, but a basic set of libraries (BLAS, NetCDF, etc.) is provided for all of the toolchains.
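An illustrative sketch of how the version defaulting and toolchains work together; the specific versions and package names below are placeholders, not a list of what is actually installed.

    # Load a compiler from one of the toolchain families first (version shown
    # is a placeholder) ...
    module load gcc/11.3.0
    # ... then load packages without versions: each resolves to the newest
    # installed build compatible with the already-loaded compiler/MPI.
    module load openmpi
    module load fftw

    # An explicit version can still be requested:
    module load fftw/3.3.10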
Environment Differences
- Change of default shell:
The default login shell on the Deepthought2 cluster was tcsh; this was inherited from the legacy Glue/TerpConnect environment, which chose csh over the Bourne shell sh as being more user-friendly when that decision was made some 30+ years ago. Today, the bash shell is at least as user-friendly as tcsh, is more commonly used in the Linux community, and is much better suited for scripting than the csh variants. As such, we have decided to change the default login shell on Zaratan to bash. You can still change your login shell to tcsh, etc., but the default will now be bash in most cases.
If you have a Glue/TerpConnect/Deepthought2 account and your login shell was something other than csh/tcsh, and the corresponding shell exists on Zaratan, on the creation of your Zaratan account we will set your login shell to match what it was on Glue/TerpConnect/Deepthought2. Otherwise your login shell will be set to bash.
Unfortunately, we cannot distinguish between users who did not care which login shell they used (and/or did not know how to change it) and those who actually have a real desire to use the tcsh or csh shells, so if you are in the latter category, your default shell on Zaratan will not be your desired tcsh. Although this could present a good opportunity to re-evaluate your choice of shell, if you really wish to use tcsh, you can change your login shell.
- Dot files/startup files:
On Deepthought and Deepthought2, it was recommended that you not edit your .cshrc and .bashrc files directly, but instead edit .cshrc.mine and/or .bashrc.mine. This is no longer the case; feel free to create and edit .cshrc, .bashrc, etc. to customize your environment. Indeed, by default, any customizations you place in a .cshrc.mine or .bashrc.mine, etc., will not be processed on login and so will have no effect. A brief illustrative example follows.
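Since the startup files are now processed directly, customizations can go straight into .bashrc. A minimal illustrative sketch (the module and alias shown are placeholders, not recommendations):

    # ~/.bashrc on Zaratan -- edited directly; .bashrc.mine is no longer read
    case $- in
        *i*) ;;          # interactive shell: continue with the setup below
        *)   return ;;   # non-interactive shell: skip it
    esac

    module load gcc                  # placeholder: modules you routinely use
    alias sq='squeue -u $USER'       # placeholder: a convenient Slurm alias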