r: R statistical computing scripting language

Contents

  1. Overview of package
    1. General usage
  2. Availability of package by cluster
  3. System installed R packages, by cluster/R version
  4. Installing Modules
  5. Running R in batch mode
  6. Using R and MPI

Overview of package

General information about package
Package: r
Description: R statistical computing scripting language
For more information: https://www.r-project.org
Categories:
License: OpenSource (GPL)

General usage information

R is a language and environment for statistical computing and graphics. It is similar to the S language and environment, and can be considered as an open-source implementation of S. There are some important differences, but much code written for S runs unaltered under R.

This module will add the R and Rscript commands to your path.
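A quick way to confirm that the module did what is described above is to check that both commands are on your path. The sketch below is illustrative only; the helper name check_cmd is made up for this example, and it assumes you have already run "module load r" in the same shell.

```shell
# Report whether a command is on PATH; run this after "module load r".
# check_cmd is a hypothetical helper name, not something the module provides.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

for cmd in R Rscript; do
    check_cmd "$cmd"
done
```

If either command is reported as not found, the module is most likely not loaded in the current shell.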

In case you need to link against this library in your code, the following environmental variables have been defined:

You will probably wish to use these by adding the following flags to your compilation command (e.g. to CFLAGS in your Makefile):

and the following flags to your link command (e.g. LDFLAGS in your Makefile):

Available versions of the package r, by cluster

This section lists the available versions of the package r on the different clusters.

Available versions of r on the Zaratan cluster

Version | Module tags | CPU(s) optimized for | GPU ready?
4.3.2 | r/4.3.2 | icelake, x86_64, zen2 | Y
4.1.1 | r/4.1.1 | zen2 | Y

System installed R packages/modules

The system installations of R include a large number of R modules to enhance the capability of R. The table below lists those modules and their version numbers for the various installations of R which are made available with the "module load r" or similar commands.

Packages/modules for R

The following table lists the various packages/modules/extensions for R along with their version which are installed on the UMD HPC clusters, by cluster, R version, and platform. These are the packages/modules/extensions which are enabled by default when you do the "module load r".
R packages/modules/extensions enabled by default
R Version | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
Compiler | gcc/11.3.0 | gcc/11.3.0 | gcc/11.3.0 | gcc/9.4.0
CPU optimized for | zen2 | icelake | x86_64 | zen2
CUDA | nocuda | nocuda | nocuda | nocuda
abind | 1.4-5 | 1.4-5 | 1.4-5 | 1.4-5
amap | 0.8-19 | 0.8-19 | 0.8-19 | 0.8-18
animation | 2.7 | 2.7 | 2.7 | 2.7
annotate | 1.78.0 | 1.78.0 | 1.78.0 | 1.68.0
AnnotationDbi | 1.62.0 | 1.62.0 | 1.62.0 | 1.52.0
ape | 5.7-1 | 5.7-1 | 5.7-1 | 5.4-1
argparse | 2.2.2 | 2.2.2 | 2.2.2 | 2.0.3
askpass | 1.1 | 1.1 | 1.1 | 1.1
assertthat | 0.2.1 | 0.2.1 | 0.2.1 | 0.2.1
backports | 1.4.1 | 1.4.1 | 1.4.1 | 1.2.1
base | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
base64enc | 0.1-3 | 0.1-3 | 0.1-3 | 0.1-3
bayesm | 3.1-5 | 3.1-5 | 3.1-5 | 3.1-4
Bessel | 0.6-0 | 0.6-0 | 0.6-0 | 0.6-0
BH | 1.81.0-1 | 1.81.0-1 | 1.81.0-1 | 1.72.0-3
BiasedUrn | 2.0.9 | 2.0.9 | 2.0.9 | 1.07
Biobase | 2.60.0 | 2.60.0 | 2.60.0 | 2.50.0
BiocFileCache | 2.8.0 | 2.8.0 | 2.8.0 | 1.14.0
BiocGenerics | 0.46.0 | 0.46.0 | 0.46.0 | 0.40.0
BiocIO | 1.10.0 | 1.10.0 | 1.10.0 | 1.6.0
BiocManager | | | | 1.30.18
BiocParallel | 1.34.0 | 1.34.0 | 1.34.0 | 1.24.1
BiocVersion | | | | 3.14.0
biomaRt | 2.56.0 | 2.56.0 | 2.56.0 | 2.46.2
Biostrings | 2.68.0 | 2.68.0 | 2.68.0 | 2.58.0
bit | 4.0.5 | 4.0.5 | 4.0.5 | 4.0.4
bit64 | 4.0.5 | 4.0.5 | 4.0.5 | 4.0.5
bitops | 1.0-7 | 1.0-7 | 1.0-7 | 1.0-6
blob | 1.2.4 | 1.2.4 | 1.2.4 | 1.2.1
boot | 1.3-28.1 | 1.3-28.1 | 1.3-28.1 | 1.3-28
brew | 1.0-8 | 1.0-8 | 1.0-8 | 1.0-7
brio | 1.1.3 | 1.1.3 | 1.1.3 | 1.1.0
broom | 1.0.4 | 1.0.4 | 1.0.4 | 0.8.0
bslib | 0.4.2 | 0.4.2 | 0.4.2 | 0.3.1
cachem | 1.0.7 | 1.0.7 | 1.0.7 | 1.0.6
callr | 3.7.3 | 3.7.3 | 3.7.3 | 3.7.0
caTools | 1.18.2 | 1.18.2 | 1.18.2 | 1.18.2
cellranger | 1.1.0 | 1.1.0 | 1.1.0 | 1.1.0
checkmate | 2.1.0 | 2.1.0 | 2.1.0 | 2.0.0
class | 7.3-21 | 7.3-21 | 7.3-21 |
classInt | 0.4-9 | 0.4-9 | 0.4-9 |
cli | 3.6.1 | 3.6.1 | 3.6.1 | 3.0.1
clipr | 0.8.0 | 0.8.0 | 0.8.0 | 0.7.1
cluster | 2.1.4 | 2.1.4 | 2.1.4 | 2.1.0
cmdstanr | 0.5.3 | 0.5.3 | 0.5.3 | 0.5.1
coda | 0.19-4 | 0.19-4 | 0.19-4 | 0.19-4
codetools | 0.2-19 | 0.2-19 | 0.2-19 | 0.2-18
colorspace | 2.1-0 | 2.1-0 | 2.1-0 | 2.0-0
commonmark | 1.9.0 | 1.9.0 | 1.9.0 | 1.8.0
compiler | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
conflicted | 1.2.0 | 1.2.0 | 1.2.0 |
covr | 3.6.2 | 3.6.2 | 3.6.2 | 3.5.1
cowplot | 1.1.1 | 1.1.1 | 1.1.1 | 1.1.1
cpp11 | 0.4.3 | 0.4.3 | 0.4.3 | 0.4.0
crayon | 1.5.2 | 1.5.2 | 1.5.2 | 1.4.1
credentials | 1.3.2 | 1.3.2 | 1.3.2 | 1.3.2
crosstalk | 1.2.0 | 1.2.0 | 1.2.0 | 1.2.0
ctc | 1.74.0 | 1.74.0 | 1.74.0 | 1.64.0
ctmm | 1.2.0 | 1.2.0 | | 1.0.0
curl | 5.0.0 | 5.0.0 | 5.0.0 | 4.3.2
data.table | 1.14.8 | 1.14.8 | 1.14.8 | 1.14.2
datasets | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
DBI | 1.1.3 | 1.1.3 | 1.1.3 | 1.1.0
dbplyr | 2.3.2 | 2.3.2 | 2.3.2 | 2.1.1
DelayedArray | 0.26.0 | 0.26.0 | 0.26.0 | 0.16.1
deldir | 1.0-6 | 1.0-6 | 1.0-6 | 1.0-6
DEoptimR | 1.0-12 | 1.0-12 | 1.0-12 | 1.0-11
desc | 1.4.2 | 1.4.2 | 1.4.2 | 1.2.0
DESeq2 | 1.40.0 | 1.40.0 | 1.40.0 | 1.30.0
devtools | | | | 2.3.2
diffobj | 0.3.5 | 0.3.5 | 0.3.5 | 0.3.3
digest | 0.6.31 | 0.6.31 | 0.6.31 | 0.6.28
distributional | 0.3.2 | 0.3.2 | 0.3.2 | 0.3.0
doParallel | 1.0.17 | 1.0.17 | 1.0.17 | 1.0.17
doSNOW | 1.0.20 | 1.0.20 | 1.0.20 | 1.0.20
dotCall64 | 1.0-2 | 1.0-2 | 1.0-2 | 1.0-1
dplyr | 1.1.2 | 1.1.2 | 1.1.2 | 1.0.7
dqrng | 0.3.0 | 0.3.0 | 0.3.0 | 0.3.0
DT | 0.27 | 0.27 | 0.27 | 0.23
dtplyr | 1.3.1 | 1.3.1 | 1.3.1 | 1.2.1
dynamicTreeCut | 1.63-1 | 1.63-1 | 1.63-1 | 1.63-1
e1071 | 1.7-13 | 1.7-13 | 1.7-13 |
edgeR | 3.42.0 | 3.42.0 | 3.42.0 | 3.32.1
ellipsis | 0.3.2 | 0.3.2 | 0.3.2 | 0.3.2
evaluate | 0.20 | 0.20 | 0.20 | 0.14
expm | 0.999-7 | 0.999-7 | 0.999-7 | 0.999-6
fansi | 1.0.4 | 1.0.4 | 1.0.4 | 0.5.0
farver | 2.1.1 | 2.1.1 | 2.1.1 | 2.1.0
fastcluster | 1.2.3 | 1.2.3 | 1.2.3 | 1.1.25
fastmap | 1.1.1 | 1.1.1 | 1.1.1 | 1.1.0
fasttime | 1.1-0 | 1.1-0 | 1.1-0 | 1.1-0
fields | 14.1 | 14.1 | 14.1 | 13.3
filelock | 1.0.2 | 1.0.2 | 1.0.2 |
findpython | 1.0.8 | 1.0.8 | 1.0.8 | 1.0.7
fitdistrplus | 1.1-11 | 1.1-11 | 1.1-11 | 1.1-8
fmesher | 0.1.2 | 0.1.2 | |
FNN | 1.1.3.2 | 1.1.3.2 | 1.1.3.2 | 1.1.3.1
fontawesome | 0.5.1 | 0.5.1 | 0.5.1 |
forcats | 1.0.0 | 1.0.0 | 1.0.0 | 0.5.1
foreach | 1.5.2 | 1.5.2 | 1.5.2 | 1.5.2
foreign | 0.8-84 | 0.8-84 | 0.8-84 | 0.8-81
formatR | 1.14 | 1.14 | 1.14 | 1.7
Formula | 1.2-5 | 1.2-5 | 1.2-5 | 1.2-4
fs | 1.6.2 | 1.6.2 | 1.6.2 | 1.5.0
futile.logger | 1.4.3 | 1.4.3 | 1.4.3 | 1.4.3
futile.options | 1.0.1 | 1.0.1 | 1.0.1 | 1.0.1
future | 1.32.0 | 1.32.0 | 1.32.0 | 1.26.1
future.apply | 1.10.0 | 1.10.0 | 1.10.0 | 1.9.0
gargle | 1.4.0 | 1.4.0 | 1.4.0 |
genefilter | 1.82.0 | 1.82.0 | 1.82.0 | 1.72.1
geneLenDataBase | 1.36.0 | 1.36.0 | 1.36.0 |
geneplotter | 1.78.0 | 1.78.0 | 1.78.0 | 1.68.0
generics | 0.1.3 | 0.1.3 | 0.1.3 | 0.1.1
GenomeInfoDb | 1.36.0 | 1.36.0 | 1.36.0 | 1.26.2
GenomeInfoDbData | 1.2.10 | 1.2.10 | 1.2.10 | 1.2.1
GenomicAlignments | 1.36.0 | 1.36.0 | 1.36.0 | 1.30.0
GenomicFeatures | 1.52.0 | 1.52.0 | 1.52.0 |
GenomicRanges | 1.52.0 | 1.52.0 | 1.52.0 | 1.42.0
geomorph | 4.0.5 | 4.0.5 | 4.0.5 | 4.0.3
geosphere | 1.5-18 | 1.5-18 | 1.5-18 | 1.5-14
gert | 1.9.2 | 1.9.2 | 1.9.2 | 1.6.0
ggdendro | 0.1.23 | 0.1.23 | 0.1.23 | 0.1.23
ggplot2 | 3.4.2 | 3.4.2 | 3.4.2 | 3.3.5
ggrepel | 0.9.3 | 0.9.3 | 0.9.3 | 0.9.1
ggridges | 0.5.4 | 0.5.4 | 0.5.4 | 0.5.3
gh | 1.4.0 | 1.4.0 | 1.4.0 | 1.3.0
gitcreds | 0.1.2 | 0.1.2 | 0.1.2 | 0.1.1
Glimma | 2.10.0 | 2.10.0 | 2.10.0 | 2.0.0
globals | 0.16.2 | 0.16.2 | 0.16.2 | 0.15.0
glue | 1.6.2 | 1.6.2 | 1.6.2 | 1.4.2
Gmedian | 1.2.7 | 1.2.7 | 1.2.7 | 1.2.7
gmp | 0.7-1 | 0.7-1 | 0.7-1 | 0.6-5
GO.db | 3.17.0 | 3.17.0 | 3.17.0 | 3.12.1
goftest | 1.2-3 | 1.2-3 | 1.2-3 | 1.2-3
googledrive | 2.1.0 | 2.1.0 | 2.1.0 |
googlesheets4 | 1.1.0 | 1.1.0 | 1.1.0 |
GOplot | 1.0.2 | 1.0.2 | 1.0.2 | 1.0.2
goseq | 1.52.0 | 1.52.0 | 1.52.0 |
gplots | 3.1.3 | 3.1.3 | 3.1.3 | 3.1.1
graphics | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
grDevices | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
grid | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
gridExtra | 2.3 | 2.3 | 2.3 | 2.3
gsl | 2.1-8 | 2.1-8 | 2.1-8 | 2.1-6
gstat | | | | 2.0-9
gtable | 0.3.3 | 0.3.3 | 0.3.3 | 0.3.0
gtools | 3.9.4 | 3.9.4 | 3.9.4 | 3.9.2.1
haven | 2.5.2 | 2.5.2 | 2.5.2 | 2.5.0
hdf5r | 1.3.8 | 1.3.8 | 1.3.8 | 1.3.5
here | 1.0.1 | 1.0.1 | 1.0.1 | 1.0.1
highr | 0.10 | 0.10 | 0.10 | 0.9
Hmisc | 5.0-1 | 5.0-1 | 5.0-1 | 4.4-2
hms | 1.1.3 | 1.1.3 | 1.1.3 | 1.1.1
htmlTable | 2.4.1 | 2.4.1 | 2.4.1 | 2.1.0
htmltools | 0.5.5 | 0.5.5 | 0.5.5 | 0.5.1.1
htmlwidgets | 1.6.2 | 1.6.2 | 1.6.2 | 1.5.3
httpuv | 1.6.9 | 1.6.9 | 1.6.9 | 1.6.5
httr | 1.4.5 | 1.4.5 | 1.4.5 | 1.4.2
httr2 | 0.2.2 | 0.2.2 | 0.2.2 |
hwriter | 1.3.2.1 | 1.3.2.1 | 1.3.2.1 | 1.3.2.1
ica | 1.0-3 | 1.0-3 | 1.0-3 | 1.0-2
ids | 1.0.1 | 1.0.1 | 1.0.1 |
igraph | 1.4.2 | 1.4.2 | 1.4.2 | 1.2.6
impute | 1.74.0 | 1.74.0 | 1.74.0 | 1.70.0
ini | 0.3.1 | 0.3.1 | 0.3.1 | 0.3.1
INLA | 23.09.09 | 23.09.09 | 23.09.09 |
interp | 1.1-4 | 1.1-4 | 1.1-4 |
intervals | 0.15.3 | 0.15.3 | 0.15.3 | 0.15.2
IRanges | 2.34.0 | 2.34.0 | 2.34.0 | 2.24.1
IRdisplay | 1.1 | 1.1 | 1.1 | 1.1
IRkernel | 1.3.2 | 1.3.2 | 1.3.2 | 1.3
irlba | 2.3.5.1 | 2.3.5.1 | 2.3.5.1 | 2.3.5
isoband | 0.2.7 | 0.2.7 | 0.2.7 | 0.2.3
iterators | 1.0.14 | 1.0.14 | 1.0.14 | 1.0.14
jpeg | 0.1-10 | 0.1-10 | 0.1-10 | 0.1-8.1
jquerylib | 0.1.4 | 0.1.4 | 0.1.4 | 0.1.4
jsonlite | 1.8.4 | 1.8.4 | 1.8.4 | 1.7.2
KEGGREST | 1.40.0 | 1.40.0 | 1.40.0 |
kernlab | 0.9-32 | 0.9-32 | 0.9-32 | 0.9-30
KernSmooth | 2.23-20 | 2.23-20 | 2.23-20 | 2.23-18
knitr | 1.42 | 1.42 | 1.42 | 1.33
ks | 1.14.0 | 1.14.0 | 1.14.0 | 1.13.5
labeling | 0.4.2 | 0.4.2 | 0.4.2 | 0.4.2
lambda.r | 1.2.4 | 1.2.4 | 1.2.4 | 1.2.4
later | 1.3.0 | 1.3.0 | 1.3.0 | 1.3.0
lattice | 0.21-8 | 0.21-8 | 0.21-8 | 0.20-44
latticeExtra | 0.6-30 | 0.6-30 | 0.6-30 | 0.6-29
lazyeval | 0.2.2 | 0.2.2 | 0.2.2 | 0.2.2
leiden | 0.4.3 | 0.4.3 | 0.4.3 | 0.3.9
lifecycle | 1.0.3 | 1.0.3 | 1.0.3 | 1.0.1
limma | 3.56.0 | 3.56.0 | 3.56.0 | 3.52.1
listenv | 0.9.0 | 0.9.0 | 0.9.0 | 0.8.0
lme4 | 1.1-33 | 1.1-33 | | 1.1-29
lmtest | 0.9-40 | 0.9-40 | 0.9-40 | 0.9-40
locfit | 1.5-9.7 | 1.5-9.7 | 1.5-9.7 | 1.5-9.4
lubridate | 1.9.2 | 1.9.2 | 1.9.2 | 1.8.0
magick | 2.7.4 | 2.7.4 | 2.7.4 | 2.7.3
magrittr | 2.0.3 | 2.0.3 | 2.0.3 | 2.0.1
manipulate | 1.0.1 | 1.0.1 | 1.0.1 | 1.0.1
manipulateWidget | 0.11.1 | 0.11.1 | 0.11.1 | 0.11.1
maps | 3.4.1 | 3.4.1 | 3.4.1 | 3.3.0
markdown | 1.6 | 1.6 | 1.6 | 1.1
MASS | 7.3-59 | 7.3-59 | 7.3-59 | 7.3-54
Matrix | 1.5-4 | 1.5-4 | 1.5-4 | 1.3-4
MatrixGenerics | 1.12.0 | 1.12.0 | 1.12.0 | 1.2.1
matrixStats | 0.63.0 | 0.63.0 | 0.63.0 | 0.61.0
mclust | 6.0.0 | 6.0.0 | 6.0.0 | 5.4.10
memoise | 2.0.1 | 2.0.1 | 2.0.1 | 1.1.0
methods | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
mgcv | 1.8-42 | 1.8-42 | 1.8-42 | 1.8-38
mime | 0.12 | 0.12 | 0.12 | 0.11
miniUI | 0.1.1.1 | 0.1.1.1 | 0.1.1.1 | 0.1.1.1
minqa | 1.2.5 | 1.2.5 | 1.2.5 | 1.2.4
misc3d | 0.9-1 | 0.9-1 | 0.9-1 | 0.9-1
mnormt | 2.1.1 | 2.1.1 | 2.1.1 | 2.0.2
modelr | 0.1.11 | 0.1.11 | 0.1.11 | 0.1.8
multicool | 0.1-12 | 0.1-12 | 0.1-12 | 0.1-12
munsell | 0.5.0 | 0.5.0 | 0.5.0 | 0.5.0
mvtnorm | 1.1-3 | 1.1-3 | 1.1-3 | 1.1-3
ncdf4 | 1.21 | 1.21 | 1.21 | 1.17
nlme | 3.1-162 | 3.1-162 | 3.1-162 | 3.1-153
nloptr | 2.0.3 | 2.0.3 | | 2.0.3
nnet | 7.3-18 | 7.3-18 | 7.3-18 | 7.3-14
numDeriv | 2016.8-1.1 | 2016.8-1.1 | 2016.8-1.1 | 2016.8-1.1
openssl | 2.0.6 | 2.0.6 | 2.0.6 | 1.4.5
parallel | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
parallelly | 1.35.0 | 1.35.0 | 1.35.0 | 1.31.1
parsedate | 1.3.1 | 1.3.1 | |
patchwork | 1.1.2 | 1.1.2 | 1.1.2 | 1.1.1
pbapply | 1.7-0 | 1.7-0 | 1.7-0 | 1.5-0
pbdZMQ | 0.3-9 | 0.3-9 | 0.3-9 | 0.3-7
pbivnorm | 0.6.0 | 0.6.0 | 0.6.0 | 0.6.0
pbkrtest | 0.5.2 | 0.5.2 | | 0.5.1
pillar | 1.9.0 | 1.9.0 | 1.9.0 | 1.6.4
pkgbuild | 1.4.0 | 1.4.0 | 1.4.0 | 1.2.0
pkgconfig | 2.0.3 | 2.0.3 | 2.0.3 | 2.0.3
pkgload | 1.3.2 | 1.3.2 | 1.3.2 | 1.1.0
plogr | 0.2.0 | 0.2.0 | 0.2.0 | 0.2.0
plot3D | 1.4 | 1.4 | 1.4 | 1.4
plotly | 4.10.1 | 4.10.1 | 4.10.1 | 4.10.0
plyr | 1.8.8 | 1.8.8 | 1.8.8 | 1.8.6
png | 0.1-8 | 0.1-8 | 0.1-8 | 0.1-7
polyclip | 1.10-4 | 1.10-4 | 1.10-4 | 1.10-0
posterior | 1.4.1 | 1.4.1 | 1.4.1 | 1.2.1
pracma | 2.4.2 | 2.4.2 | 2.4.2 | 2.3.8
praise | 1.0.0 | 1.0.0 | 1.0.0 | 1.0.0
preprocessCore | 1.62.0 | 1.62.0 | 1.62.0 | 1.58.0
prettyunits | 1.1.1 | 1.1.1 | 1.1.1 | 1.1.1
processx | 3.8.1 | 3.8.1 | 3.8.1 | 3.5.2
progress | 1.2.2 | 1.2.2 | 1.2.2 | 1.2.2
progressr | 0.13.0 | 0.13.0 | 0.13.0 |
promises | 1.2.0.1 | 1.2.0.1 | 1.2.0.1 | 1.2.0.1
proxy | 0.4-27 | 0.4-27 | 0.4-27 |
ps | 1.7.5 | 1.7.5 | 1.7.5 | 1.5.0
psych | 2.3.3 | 2.3.3 | 2.3.3 | 2.2.5
purrr | 1.0.1 | 1.0.1 | 1.0.1 | 0.3.4
quadprog | 1.5-8 | 1.5-8 | 1.5-8 | 1.5-8
quantmod | 0.4.22 | 0.4.22 | 0.4.22 | 0.4.20
qvalue | 2.32.0 | 2.32.0 | 2.32.0 | 2.22.0
R.methodsS3 | 1.8.2 | 1.8.2 | 1.8.2 | 1.8.1
R.oo | 1.25.0 | 1.25.0 | 1.25.0 | 1.24.0
R.utils | 2.12.2 | 2.12.2 | 2.12.2 | 2.10.1
R6 | 2.5.1 | 2.5.1 | 2.5.1 | 2.5.0
ragg | 1.2.5 | 1.2.5 | 1.2.5 |
randomForest | 4.7-1.1 | 4.7-1.1 | 4.7-1.1 | 4.6-14
RANN | 2.6.1 | 2.6.1 | 2.6.1 | 2.6.1
rappdirs | 0.3.3 | 0.3.3 | 0.3.3 | 0.3.3
raster | 3.6-20 | 3.6-20 | | 3.5-15
rcmdcheck | 1.4.0 | 1.4.0 | 1.4.0 | 1.4.0
RColorBrewer | 1.1-3 | 1.1-3 | 1.1-3 | 1.1-2
Rcpp | 1.0.10 | 1.0.10 | 1.0.10 | 1.0.7
RcppAnnoy | 0.0.20 | 0.0.20 | 0.0.20 | 0.0.19
RcppArmadillo | 0.12.2.0.0 | 0.12.2.0.0 | 0.12.2.0.0 | 0.10.1.2.2
RcppEigen | 0.3.3.9.3 | 0.3.3.9.3 | 0.3.3.9.3 | 0.3.3.9.1
RcppParallel | 5.1.7 | 5.1.7 | 5.1.7 | 5.0.2
RcppProgress | 0.4.2 | 0.4.2 | 0.4.2 | 0.4.2
RcppTOML | 0.2.2 | 0.2.2 | 0.2.2 | 0.1.7
RCurl | 1.98-1.12 | 1.98-1.12 | 1.98-1.12 | 1.98-1.2
readr | 2.1.4 | 2.1.4 | 2.1.4 | 2.1.2
readxl | 1.4.2 | 1.4.2 | 1.4.2 | 1.3.1
rematch | 1.0.1 | 1.0.1 | 1.0.1 | 1.0.1
rematch2 | 2.1.2 | 2.1.2 | 2.1.2 | 2.1.2
remotes | | | | 2.4.2
repr | 1.1.6 | 1.1.6 | 1.1.6 | 1.1.4
reprex | 2.0.2 | 2.0.2 | 2.0.2 | 2.0.1
reshape2 | 1.4.4 | 1.4.4 | 1.4.4 | 1.4.3
restfulr | 0.0.15 | 0.0.15 | 0.0.15 | 0.0.15
reticulate | 1.28 | 1.28 | 1.28 | 1.25
rex | 1.2.1 | 1.2.1 | 1.2.1 | 1.2.1
rgdal | | | | 1.5-32
rgl | 1.1.3 | 1.1.3 | 1.1.3 | 0.104.16
Rhtslib | 2.0.0 | 2.0.0 | 2.0.0 | 1.99.5
rjags | 4-14 | 4-14 | 4-14 | 4-13
rjson | 0.2.21 | 0.2.21 | 0.2.21 | 0.2.21
rlang | 1.1.0 | 1.1.0 | 1.1.0 | 0.4.12
rmarkdown | 2.21 | 2.21 | 2.21 | 2.11
Rmpfr | 0.9-2 | 0.9-2 | 0.9-2 | 0.8-9
Rmpi | 0.7-1 | 0.7-1 | 0.7-1 | 0.6-9
robustbase | 0.95-1 | 0.95-1 | 0.95-1 | 0.95-0
ROCR | 1.0-11 | 1.0-11 | 1.0-11 | 1.0-11
ROTS | 1.28.0 | 1.28.0 | 1.28.0 | 1.18.0
roxygen2 | 7.2.3 | 7.2.3 | 7.2.3 | 7.1.2
rpart | 4.1.19 | 4.1.19 | 4.1.19 | 4.1-15
rprojroot | 2.0.3 | 2.0.3 | 2.0.3 | 2.0.2
RRPP | 1.3.1 | 1.3.1 | 1.3.1 | 1.2.3
Rsamtools | 2.16.0 | 2.16.0 | 2.16.0 | 2.10.0
rSPDE | 2.3.3 | 2.3.3 | |
RSpectra | 0.16-1 | 0.16-1 | 0.16-1 | 0.16-1
RSQLite | 2.3.1 | 2.3.1 | 2.3.1 | 2.2.2
rstudioapi | 0.14 | 0.14 | 0.14 | 0.13
rsvd | 1.0.5 | 1.0.5 | 1.0.5 | 1.0.5
rtracklayer | 1.60.0 | 1.60.0 | 1.60.0 | 1.54.0
Rtsne | 0.16 | 0.16 | 0.16 | 0.16
rversions | 2.1.2 | 2.1.2 | 2.1.2 | 2.1.1
rvest | 1.0.3 | 1.0.3 | 1.0.3 | 1.0.2
s2 | 1.1.2 | 1.1.2 | 1.1.2 |
S4Vectors | 0.38.0 | 0.38.0 | 0.38.0 | 0.28.1
sass | 0.4.5 | 0.4.5 | 0.4.5 | 0.4.1
scales | 1.2.1 | 1.2.1 | 1.2.1 | 1.1.1
scattermore | 0.8 | 0.8 | 0.8 | 0.8
sctransform | 0.3.5 | 0.3.5 | 0.3.5 | 0.3.3
selectr | 0.4-2 | 0.4-2 | 0.4-2 | 0.4-2
sessioninfo | 1.2.2 | 1.2.2 | 1.2.2 | 1.1.1
Seurat | 4.3.0 | 4.3.0 | 4.3.0 | 3.2.3
SeuratObject | 4.1.3 | 4.1.3 | 4.1.3 |
sf | 1.0-12 | 1.0-12 | |
shape | 1.4.6 | 1.4.6 | 1.4.6 | 1.4.6
shiny | 1.7.4 | 1.7.4 | 1.7.4 | 1.5.0
shinyjs | 2.1.0 | 2.1.0 | 2.1.0 | 2.1.0
sitmo | 2.0.2 | 2.0.2 | 2.0.2 | 2.0.2
sm | 2.2-5.7.1 | 2.2-5.7.1 | 2.2-5.7.1 | 2.2-5.6
snow | 0.4-4 | 0.4-4 | 0.4-4 | 0.4-3
sourcetools | 0.1.7-1 | 0.1.7-1 | 0.1.7-1 | 0.1.7
sp | 1.6-0 | 1.6-0 | 1.6-0 | 1.4-7
spacetime | 1.3-0 | 1.3-0 | 1.3-0 | 1.2-6
spam | 2.9-1 | 2.9-1 | 2.9-1 | 2.8-0
spatstat | 3.0-5 | 3.0-5 | 3.0-5 | 1.64-1
spatstat.data | 3.0-1 | 3.0-1 | 3.0-1 | 2.2-0
spatstat.explore | 3.1-0 | 3.1-0 | 3.1-0 |
spatstat.geom | 3.1-0 | 3.1-0 | 3.1-0 |
spatstat.linnet | 3.1-0 | 3.1-0 | 3.1-0 |
spatstat.model | 3.2-3 | 3.2-3 | 3.2-3 |
spatstat.random | 3.1-4 | 3.1-4 | 3.1-4 |
spatstat.sparse | 3.0-1 | 3.0-1 | 3.0-1 |
spatstat.utils | 3.0-2 | 3.0-2 | 3.0-2 | 2.3-1
splines | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
SQUAREM | 2021.1 | 2021.1 | 2021.1 | 2021.1
statmod | 1.5.0 | 1.5.0 | 1.5.0 | 1.4.36
stats | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
stats4 | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
stringi | 1.7.12 | 1.7.12 | 1.7.12 | 1.6.2
stringr | 1.5.0 | 1.5.0 | 1.5.0 | 1.4.0
SummarizedExperiment | 1.30.0 | 1.30.0 | 1.30.0 | 1.20.0
survival | 3.5-5 | 3.5-5 | 3.5-5 | 3.2-7
sys | 3.4.1 | 3.4.1 | 3.4.1 | 3.4
systemfonts | 1.0.4 | 1.0.4 | 1.0.4 |
tcltk | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
tensor | 1.5 | 1.5 | 1.5 | 1.5
tensorA | 0.36.2 | 0.36.2 | 0.36.2 | 0.36.2
terra | 1.7-29 | 1.7-29 | | 1.5-21
testthat | 3.1.7 | 3.1.7 | 3.1.7 | 3.0.1
textshaping | 0.3.6 | 0.3.6 | 0.3.6 |
tibble | 3.2.1 | 3.2.1 | 3.2.1 | 3.1.5
tidyr | 1.3.0 | 1.3.0 | 1.3.0 | 1.1.4
tidyselect | 1.2.0 | 1.2.0 | 1.2.0 | 1.1.1
tidyverse | 2.0.0 | 2.0.0 | 2.0.0 | 1.3.0
timechange | 0.2.0 | 0.2.0 | 0.2.0 |
tinytex | 0.45 | 0.45 | 0.45 | 0.39
tmvnsim | 1.0-2 | 1.0-2 | 1.0-2 | 1.0-2
tools | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
tseries | 0.10-53 | 0.10-53 | 0.10-53 | 0.10-48
TTR | 0.24.3 | 0.24.3 | 0.24.3 | 0.24.3
tximport | 1.28.0 | 1.28.0 | 1.28.0 | 1.24.0
tximportData | 1.28.0 | 1.28.0 | 1.28.0 | 1.24.0
tzdb | 0.3.0 | 0.3.0 | 0.3.0 | 0.2.0
units | 0.8-1 | 0.8-1 | 0.8-1 |
usethis | 2.1.6 | 2.1.6 | 2.1.6 | 2.0.0
utf8 | 1.2.3 | 1.2.3 | 1.2.3 | 1.1.4
utils | 4.3.2 | 4.3.2 | 4.3.2 | 4.1.1
uuid | 1.1-0 | 1.1-0 | 1.1-0 | 1.1-0
uwot | 0.1.14 | 0.1.14 | 0.1.14 | 0.1.11
vctrs | 0.6.2 | 0.6.2 | 0.6.2 | 0.3.8
viridis | 0.6.2 | 0.6.2 | 0.6.2 | 0.5.1
viridisLite | 0.4.1 | 0.4.1 | 0.4.1 | 0.3.0
vroom | 1.6.1 | 1.6.1 | 1.6.1 | 1.5.7
waldo | 0.4.0 | 0.4.0 | 0.4.0 | 0.2.3
webshot | 0.5.4 | 0.5.4 | 0.5.4 | 0.5.3
WGCNA | 1.72-1 | 1.72-1 | 1.72-1 | 1.69
whisker | 0.4.1 | 0.4.1 | 0.4.1 | 0.4
withr | 2.5.0 | 2.5.0 | 2.5.0 | 2.4.2
wk | 0.7.2 | 0.7.2 | 0.7.2 |
xfun | 0.39 | 0.39 | 0.39 | 0.31
xgboost | 1.7.5.1 | 1.7.5.1 | 1.7.5.1 | 1.3.2.1
XML | 3.99-0.14 | 3.99-0.14 | 3.99-0.14 | 3.99-0.5
xml2 | 1.3.3 | 1.3.3 | 1.3.3 | 1.3.2
xopen | 1.0.0 | 1.0.0 | 1.0.0 | 1.0.0
xtable | 1.8-4 | 1.8-4 | 1.8-4 | 1.8-4
xts | 0.13.1 | 0.13.1 | 0.13.1 | 0.12.1
XVector | 0.40.0 | 0.40.0 | 0.40.0 | 0.30.0
yaml | 2.3.7 | 2.3.7 | 2.3.7 | 2.2.1
zip | 2.3.0 | 2.3.0 | 2.3.0 | 2.2.0
zlibbioc | 1.46.0 | 1.46.0 | 1.46.0 | 1.36.0
zoo | 1.8-12 | 1.8-12 | 1.8-12 | 1.8-10

Installing Modules

R's capabilities can be significantly enhanced through the addition of modules. Code can then enable a module with the library command. The supported R interpreters on the system have a selection of modules preinstalled. If a module you are interested in is not in that list, you can either install a personal copy of the module for yourself, or request that it be installed system wide. We will make reasonable efforts to accommodate such requests as staffing resources allow.

Installing modules yourself

Installing your own R modules with R CMD INSTALL

The recommended method for installing your own R packages is with the R CMD INSTALL command or similar. This method is usually fairly straightforward, and generally allows you to install your own packages while still availing yourself of system installed packages. Not all packages will install in the same manner, but most will follow the procedure below:

  1. module load r/X.Y.Z to select the version of R you wish to use
  2. Create the directory to hold your R modules, if you have not already done so. The default is in the directory R underneath your home directory, but you might wish to put it elsewhere; this will have subdirectories for R version and platform added.
  3. Unless you opted for the default directory ~/R, you need to tell R what directory you are using. To do this, you must set the environmental variable R_LIBS_USER. Multiple directories can be listed; separate the paths with the colon (:) character. This needs to be set whenever you wish to use the modules in R, so you will generally want to set it in your .cshrc.mine or .Renviron files.
  4. There are two standard methods for installing a package, one from the command line, and one from inside R itself. Assuming you are putting stuff in ~/myRpkgs and installing the package foo the commands would be:
    • From the command line, you will first need to download a tarball with the source code for the package. Many packages can be found at the Comprehensive R Archive Network (CRAN). Assuming you downloaded foo.tar.gz to the current directory, you could then install it with:
      R CMD INSTALL -l ~/myRpkgs foo.tar.gz
    • From within R, the install.packages function will connect to CRAN and download and install the package all in one step, with:
      install.packages("foo", lib="~/myRpkgs", repos="http://cran.r-project.org")

If all goes well, the package is now installed in the directory you specified and should be available for use by your R scripts.
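The steps above can be sketched as a short bash session. This is a minimal sketch under stated assumptions: it uses bash syntax (tcsh users would use setenv instead of export), the variable name MY_R_LIBS is made up for this example, and foo.tar.gz is the placeholder tarball from the text rather than a real package.

```shell
# Hypothetical library directory; this matches the ~/myRpkgs example above.
MY_R_LIBS="${MY_R_LIBS:-$HOME/myRpkgs}"
mkdir -p "$MY_R_LIBS"

# Put the personal library first, keeping any directories already listed.
export R_LIBS_USER="${MY_R_LIBS}${R_LIBS_USER:+:$R_LIBS_USER}"
echo "R_LIBS_USER=$R_LIBS_USER"

# Install the placeholder tarball only if R is loaded and the file exists.
if command -v R >/dev/null 2>&1 && [ -f foo.tar.gz ]; then
    R CMD INSTALL -l "$MY_R_LIBS" foo.tar.gz
fi
```

To make the setting persistent, the export line would go in your shell initialization file, or the equivalent R_LIBS_USER=... line in .Renviron.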

Of course, not all packages install quite that easily. If you are comfortable building modules, the error messages will hopefully provide reasonable guidance on how to proceed. Otherwise, you can ask Division of Information Technology staff to install the package, though how quickly that happens depends on staff availability.

Installing your own R modules with conda

NOTE: We generally recommend that you try installing your own R packages using the previously mentioned method (R CMD INSTALL) rather than using conda, as that integrates better with the system installed R packages.

Conda is another option for installing Python and R packages, but it provides even more separation and isolation than the method above. This can be either an advantage or a liability depending on the use case. This isolation means that you are not constrained by what system staff have installed on the cluster, but it also means it is very difficult, if not impossible, to take advantage of any packages system staff have installed on the cluster. Also, since you are relying on an entire environment which you installed, the ability of HPC staff to provide assistance should there be issues is limited. Therefore, we encourage users to try the other method if feasible.

NOTE: The Anaconda distribution of Python and R is restrictively licensed. The Anaconda company states that the use of Anaconda's offerings at an organization of more than 200 employees requires a Business or Enterprise license. As the University of Maryland does not have such a license, the use of the Anaconda distribution or its repositories at UMD (outside of use in a formally scheduled curriculum based course) is not free and is not allowed. See e.g. the Anaconda blog for more information regarding Anaconda licensing.

Due to the license restrictions above, we encourage users to use the miniforge/mambaforge installer, which pulls packages from the conda-forge repository; that repository is not covered by the restrictive Anaconda license and is free to use. (You may still need to check for any license restrictions on individual packages.) This provides similar functionality to miniconda without the license restrictions.

Miniconda is a subset of Anaconda, without some extras like Anaconda Navigator, etc. By default, it pulls packages from the default Anaconda repositories, which requires licensing from Anaconda. The actual conda program which installs packages is free to use, but only if you are downloading packages from repositories which are free to use. If you insist on using miniconda, please be sure to remove the default channel (as well as any subchannels of the default channel), as these are restrictively licensed and require licenses, and replace it with the conda-forge channel, e.g.

conda config --show channels
conda config --remove channels defaults
# You might also need to remove subchannels like anaconda, main, r, pro, etc

conda config --add channels conda-forge

But we encourage you to use miniforge/mambaforge instead.
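After changing channels, it can help to confirm that nothing other than conda-forge is still configured. The sketch below is one way to do that; the helper name non_forge_channels is made up for this example, and it assumes that conda config --show channels prints one channel per line in the form "  - name". It deliberately flags every channel other than conda-forge, so anything it lists is something to review rather than necessarily something to remove.

```shell
# Read "conda config --show channels" output on stdin and print every
# channel that is not conda-forge. The "  - name" layout is an assumption.
non_forge_channels() {
    sed -n 's/^[[:space:]]*-[[:space:]]*//p' | grep -v '^conda-forge$'
}

# Only query conda if it is actually installed in this shell.
if command -v conda >/dev/null 2>&1; then
    extra=$(conda config --show channels | non_forge_channels || true)
    if [ -z "$extra" ]; then
        echo "only conda-forge is configured"
    else
        echo "review these channels:"
        echo "$extra"
    fi
fi
```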

To install R packages using conda, you should usually follow a process like the one outlined below:

  1. If you have not already done so, install a copy of mambaforge/miniforge or similar, as follows:
    1. Choose a location for your conda environments. Because conda environments potentially can get quite large, a subdirectory of your scratch directory is usually best. E.g. ~/scratch/condaroot. Create that directory if it does not already exist.
    2. Download Mambaforge from https://conda-forge.org/miniforge into your condaroot directory.
    3. Make a copy of your initialization dot files, e.g. cp ~/.bashrc ~/.bashrc.HOLD. You might wish to do similar for .cshrc and other init dot files if they exist.
    4. Install miniforge/mambaforge: cd ~/scratch/condaroot; bash Mambaforge-X.Y.Z-A-Linux-x86_64.sh -p mambaforge-X.Y.Z This will install mambaforge to the mambaforge-X.Y.Z subdirectory of the directory you ran the script from (~/scratch/condaroot). The script will ask some questions; answering yes to all should generally work.
    5. The script will have updated your ~/.bashrc and possibly some other dot files. Although you can leave things like that, conda would then be initialized in every shell whether it is needed or not (potentially causing delays in shell startup). We recommend instead that you copy the modifications made to the initialization scripts to a conda specific script (e.g. cp ~/.bashrc ~/bashrc.conda; cp ~/.bashrc.HOLD ~/.bashrc) and edit the new ~/bashrc.conda to remove the lines before the # >>> conda initialize >>> line. You can then source this file whenever you wish to use conda.
  2. Initialize conda if needed. If you followed the instructions to remove the conda initialization from your initialization dot files, you should initialize conda with something like source ~/bashrc.conda.
  3. If you do not already have a conda environment you are working in, create one. You should generally have a separate environment for each project and/or major software package you intend to use. To create an environment, do something like: cd ~/scratch/condaroot; conda create -n myenv This will create a conda environment myenv; we recommend you choose a more meaningful name.
  4. Activate your conda environment. E.g. conda activate myenv, replacing myenv with the name of the environment to activate.
  5. If this is a new miniforge environment, you will need to install R. Do conda install r.
  6. Install your R packages. The exact procedure may vary a bit depending on the package, but usually it is something like conda install PACKAGE_NAME where PACKAGE_NAME is the name of the package, usually starting with "r-"
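Steps 3 through 6 above can be sketched as follows. This is a dry run that only prints the commands rather than running them, so it is safe to try anywhere. The function name conda_name_for_r_pkg and the environment name myenv are illustrative, and the "r-" plus lower-cased name mapping is the usual conda-forge convention for CRAN packages, not a guarantee; packages from other sources may be named differently.

```shell
# Map an R package name to its usual conda-forge name: "r-" followed by
# the lower-cased package name (a naming convention, not a guarantee).
conda_name_for_r_pkg() {
    printf 'r-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Print (rather than run) the commands for a hypothetical environment "myenv".
echo "conda create -n myenv"
echo "conda activate myenv"
echo "conda install r"
for pkg in Rcpp data.table; do
    echo "conda install $(conda_name_for_r_pkg "$pkg")"
done
```

Running conda search with the generated name before installing is a reasonable way to confirm that the package exists under that name.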

Running R in batch mode

Although R's interactive mode is nice for certain things, when you are doing production runs with tried and true scripts, it is usually easier to use R's batch interface. This is especially useful when submitting jobs to an HPC cluster.

If you have some R code in a file test.R and you wish to run it from the command line (or equivalently, from a shell script), you can simply use the Rscript command. E.g.

Rscript --no-save --no-restore test.R

The --no-save and --no-restore prevent the saving of the workspace at the end of the session and the restoring of saved objects at startup. These are typically what you want when running in batch mode. Older versions of R used the R CMD BATCH instead of the Rscript command; the main difference with the former is that it optionally takes the name of an output file. Both should work with currently installed versions of R.
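The difference between the two commands can be seen with a one-line script. The sketch below is illustrative: the contents of test.R and the output file name test_output.txt are made up for this example, and the R commands run only if R is available in the current shell.

```shell
# Write a one-line R script so the example is self-contained.
cat > test.R <<'EOF'
cat("hello from R\n")
EOF

if command -v Rscript >/dev/null 2>&1; then
    # Rscript sends the script's output to standard output.
    Rscript --no-save --no-restore test.R

    # R CMD BATCH writes to a file instead: test.Rout by default, or the
    # file named in the optional second argument.
    R CMD BATCH --no-save --no-restore test.R test_output.txt
    cat test_output.txt
else
    echo "Rscript not found; load the R module first"
fi
```

In a Slurm job, the Rscript output would normally end up in the job's output file, while R CMD BATCH keeps it in a separate file.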

For use on one of the HPC clusters, you will generally need to include the above in a job script, like:

#!/bin/bash
#Request 5 hours (Slurm reads 5:00 as 5 minutes, so give hours:minutes:seconds)
#SBATCH -t 5:00:00
#Request 4 GiB per CPU-core
#SBATCH --mem-per-cpu=4096
#Request 1 core
#SBATCH -n 1

#Get our profile (and define module command)
. ~/.profile

#Load required modules
module load R/3.3.2

cd MY_WORK_DIRECTORY

#Make sure OpenMP is not "on"
OMP_NUM_THREADS=1
export OMP_NUM_THREADS

Rscript --no-save --no-restore my_R_code.R

Using R and MPI

Users of the high-performance computing (HPC) clusters will likely be interested in running R code that spans multiple processors, often over multiple nodes. This is generally done using MPI. There are a number of R packages that deal with MPI, including Rmpi and snow.

Most users seem to prefer the snow package, which is presumably higher level and therefore easier to use than Rmpi. There are assorted guides to using R with the snow package on the web.

Below are just a few tips gleaned from these pages, etc. that users at UMD might find helpful.

  1. For best results, use the same version of compiler and MPI as used for building R and its MPI packages. The MPI libraries and compiler used for the different versions of R are listed in the version table at the top of this page. It is best to module load the compiler first (not needed for gcc/4.6.1) and then the OpenMPI library.
  2. We have also had reports of weird errors occurring when using Rmpi (and the packages depending on it) with Infiniband: segfaults and other seemingly random errors when setting up connections. This appears to be related to complications with the use of pinned memory and forking within the R interpreter (see e.g. the CRMDA blog and the OpenMPI developers mailing list archives regarding this issue). As such, we strongly recommend that R users who wish to use MPI disable Infiniband in their mpirun command by adding the arguments --mca btl tcp,self as shown in the example below.
  3. When using snow or one of its derivatives (e.g. doSNOW), you should launch your code with something like
    #!/bin/bash
    #Request 5 hours (Slurm reads 5:00 as 5 minutes, so give hours:minutes:seconds)
    #SBATCH -t 5:00:00
    #Request 4 GiB per CPU-core
    #SBATCH --mem-per-cpu=4096
    #Request 40 cores
    #SBATCH -n 40
    
    #Get our profile (and define module command)
    . ~/.profile
    
    #Load required modules
    module load gcc/4.9.3
    module load openmpi/1.8.6
    module load R/3.3.2
    
    cd MY_WORK_DIRECTORY
    
    #Make sure OpenMP is not "on"
    OMP_NUM_THREADS=1
    export OMP_NUM_THREADS
    
    #NOTE THE -np 1 below!!!!
    #The --mca btl tcp,self arguments restricts communications to
    #tcp instead of infiniband.  We have seen issues with Rmpi and infiniband
    mpirun -np 1 --mca btl tcp,self R CMD BATCH --no-save --no-restore my_R_code.R
    

    NOTE the use of -np 1 in the above. Although that looks suspicious (telling mpirun to start only one MPI task when we asked for 40 cores), it is actually correct for most uses of the snow (and derivative) libraries. This is because snow will typically spawn its own workers. If you request more than 1 MPI task to be launched via OpenMPI, or omit the -np 1 altogether (which effectively asks mpirun to launch the number of tasks given in the #SBATCH -n line, 40 in this case), you will end up running e.g. 40 copies of the same code, each of which will try to spawn about 40 workers via snow, resulting in a mess (at best very sluggish performance, and more likely weird errors).

  4. Most snow based R code will at some point invoke the makeCluster function. This takes a parameter specifying the size of the "cluster" to create. Typically, one wants this size to be one less than the number of cores requested from Slurm. This is because the process running the R code which spawns the workers is already consuming one CPU core, so if you try to spawn a number of workers equal to the number of cores requested of Slurm, there will be one core oversubscribed, which causes issues. I typically see an error about there being an insufficient number of "slots" available, and typically the R script just hangs (doing nothing, but not dying until the job is killed for exceeding its walltime, and thereby wasting a lot of SUs). Typically, it is better to do something like:
    cl<-makeCluster(mpi.universe.size()-1, type="MPI")
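The same "one less than the number of cores" arithmetic can be checked on the shell side of the job script before R starts. This is only a sketch: the function name workers_for_tasks is made up for this example, and it assumes Slurm has set SLURM_NTASKS from the #SBATCH -n line, falling back to 1 outside a job.

```shell
# Number of snow workers to spawn for a given Slurm task count: one core is
# already used by the R process that spawns the workers.
workers_for_tasks() {
    n=$(( $1 - 1 ))
    if [ "$n" -lt 0 ]; then
        n=0
    fi
    echo "$n"
}

# SLURM_NTASKS is set by Slurm inside a job when -n is given; default to 1
# so the arithmetic still works when run outside a job.
ntasks="${SLURM_NTASKS:-1}"
nworkers=$(workers_for_tasks "$ntasks")
echo "tasks=$ntasks workers=$nworkers"
```

With #SBATCH -n 40 this reports 39 workers, which matches what mpi.universe.size()-1 is intended to give inside R.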
