...
At the time of writing, the most up-to-date version installed on the cluster is 4.4.2. To load it, run
Code Block |
---|
module load r/4.4.2 |
Previously, you could make R 4.4.2 autoload on login with module initadd; this no longer works and can be ignored:
Code Block |
---|
module initadd r/4.4.2 |
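If you still want R available automatically at login, a common alternative (a generic shell approach, not cluster-specific guidance) is to append the module load line to your ~/.bashrc:
Code Block |
---|
# Load R automatically in new login shells (assumes bash and the module command)
echo "module load r/4.4.2" >> ~/.bashrc |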
Interactive R use with slurm
...
To run an interactive R session with 126 cores on the "general" partition, run the following:
Code Block |
---|
srun -N 1 -n 126 --partition=general --pty bash |
To request a full 128-core EPYC node instead, add a node constraint:
Code Block |
---|
srun -N 1 -n 128 -p general --constraint='epyc128' --pty bash |
You can lower the number of cores if the job waits a long time to be assigned a node.
Once you are in an interactive session, you can load one of the R modules and start working with it interactively.
...
To list available versions of R, type
Code Block |
---|
module avail r |
As noted above, the most up-to-date version installed at the time of writing is 4.4.2. To load it and work interactively, run
Code Block |
---|
module purge
module load r/4.4.2
R
... # interactive commands with R go here
exit |
...
Info |
---|
A text editor (such as |
Some packages depend on external system libraries and are harder to install locally. For example, sf is a package for working with spatial (GIS) data; it depends on geos, gdal, and proj. For these packages, we recommend either using a container or asking for a global installation.
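For packages without such system dependencies, a local installation into a personal library is usually sufficient. A minimal sketch (the ~/rlibs path matches the convention used elsewhere on this page; the package name is only an example):
Code Block |
---|
> dir.create("~/rlibs", showWarnings = FALSE)  # create the personal library once
> .libPaths("~/rlibs")                         # use it for this session
> install.packages("ggplot2", lib = "~/rlibs", repo = "https://cloud.r-project.org/") |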
...
Global package install
Please submit a ticket listing the packages you would like installed and the R version, and the administrators will install them for you.
Submitting jobs
Serial
...
We do not recommend the R CMD BATCH option; use Rscript instead, either through srun or in a normal bash submission script.
Assume that you have a script called helloworld.R
with these contents:
...
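For illustration, here is a minimal sketch of such a script together with a serial submission script that uses Rscript as recommended above. The contents, partition, core count, and R version are assumptions that mirror the other examples on this page:
Code Block |
---|
# helloworld.R -- assumed contents for illustration
cat("Hello world from", Sys.info()["nodename"], "\n") |
Code Block |
---|
#!/bin/bash
#SBATCH -p general
#SBATCH -n 1
source /etc/profile.d/modules.sh
module purge
module load r/4.4.2
Rscript helloworld.R |
Submit the script with sbatch, as shown for the MPI example below.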
To install Rmpi and snow locally against openmpi/4.1.4 and r/4.2.2:
Code Block |
---|
module load r/4.2.2
module load openmpi/4.1.4
R
> .libPaths("~/rlibs")  # assuming you are installing your packages in the ~/rlibs folder
> install.packages("Rmpi", lib = "~/rlibs", repo = "https://cloud.r-project.org/", configure.args = "--with-mpi=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/4.1.4/")
> install.packages("snow", lib = "~/rlibs", repo = "https://cloud.r-project.org/") |
...
OpenMPI/5.0.2 and r/4.4.0:
Code Block |
---|
module load gdal/3.8.4 cuda/11.6 openmpi/5.0.2 r/4.4.0
R
> .libPaths("~/rlibs")
> install.packages("Rmpi", lib = "~/rlibs", repo = "https://cloud.r-project.org/", configure.args = c("--with-Rmpi-include=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.2/include", "--with-Rmpi-libpath=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.2/lib", "--with-Rmpi-type=OPENMPI", "--with-mpi=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.2")) |
OpenMPI/5.0.5 and r/4.4.1:
Code Block |
---|
module load gdal/3.9.2 openmpi/5.0.5 r/4.4.1
R
> .libPaths("~/rlibs")
> install.packages("Rmpi", lib = "~/rlibs", type = "source", repo = "https://cloud.r-project.org/", configure.args = c("--with-Rmpi-include=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.5/include", "--with-Rmpi-libpath=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.5/lib", "--with-Rmpi-type=OPENMPI", "--with-mpi=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.5")) |
To submit an MPI slurm job, we created the submit-mpi.slurm file (see code below). It is important to load the module associated with the MPI implementation you used to install Rmpi.
Code Block |
---|
#!/bin/bash
#SBATCH -p general
#SBATCH -n 30
source /etc/profile.d/modules.sh
module purge
module load r/4.2.2 openmpi/4.1.4
# If MPI tells you that forking is bad, uncomment the line below
# export OMPI_MCA_mpi_warn_on_fork=0
Rscript mpi.R |
Now create the mpi.R
script:
Code Block | ||
---|---|---|
| ||
library(parallel)
.libPaths("~/rlibs")
hello_world <- function() {
## Print the hostname and MPI worker rank.
paste(Sys.info()["nodename"],Rmpi::mpi.comm.rank(), sep = ":")
}
cl <- makeCluster(as.numeric(Sys.getenv("SLURM_NTASKS")), type = "MPI")  # one worker per slurm task
clusterCall(cl, hello_world)
stopCluster(cl) |
Run the script with:
Code Block |
---|
sbatch submit-mpi.slurm |
In your slurm output you will see a message from each of the MPI workers.
Read R's built-in "parallel" package documentation for tips on parallel programming in R: https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf
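As a quick taste of that API, here is a generic single-node sketch (an illustration, not cluster-specific) using the fork-based mclapply; it assumes cores were requested with --cpus-per-task so that SLURM_CPUS_PER_TASK is set:
Code Block |
---|
library(parallel)
# Use the cores slurm allocated to this task; fall back to 1 if unset.
n_cores <- as.numeric(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1"))
# Square 100 numbers in parallel across the allocated cores.
results <- mclapply(1:100, function(i) i^2, mc.cores = n_cores) |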
RCurl with sftp functionality
Code Block |
---|
module load libiconv/1.17 udunits gdal/3.6.0 r/4.2.2
source /gpfs/sharedfs1/admin/hpc2.0/apps/gdal/3.6.0/spack/share/spack/setup-env.sh
spack load gdal
module load libcurl/8.6.0
R
> .libPaths("~/rlibs")
> install.packages("RCurl", lib = "~/rlibs", repo = "https://cloud.r-project.org/")
> library(RCurl)
>
> curlVersion()$protocols
[1] "dict" "file" "ftp" "ftps" "gopher" "gophers" "http"
[8] "https" "imap" "imaps" "mqtt" "pop3" "pop3s" "rtsp"
[15] "scp" "sftp" "smb" "smbs" "smtp" "smtps" "telnet"
[22] "tftp" |
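With sftp in the protocol list, RCurl can transfer files over sftp. A minimal sketch (the host, path, and credentials below are placeholders, not real endpoints):
Code Block |
---|
> library(RCurl)
> # Download a remote file over sftp and save it locally.
> txt <- getURL("sftp://host.example.edu/home/user/data.txt", userpwd = "user:password")
> writeLines(txt, "data.txt") |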
SF R package
After building the gdal dependency tree from source, the sf R package has trouble finding the sqlite3 and proj paths set by the modules loaded on the HPC.
To bypass the issue, certain configure flags need to be set in the install.packages command used to install sf.
sf has replaced the now-deprecated rgdal package and is recommended going forward.
To install sf under a local HPC directory, load the following modules and use the following R commands:
Code Block |
---|
module load udunits gdal/3.8.4 r/4.3.2
R
> .libPaths("~/rlibs")
> install.packages("sf", lib = "~/rlibs", type = "source", configure.args = c("--with-sqlite3-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/sqlite/3.45.2/lib", "--with-proj-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/proj/9.4.0/lib64"), repo = "https://cloud.r-project.org/") |
The above install.packages command should be successful.
Here are the steps using the gdal/3.9.2 version:
Code Block |
---|
module load gsl/2.8 udunits/2.2.28-gcc14.2 cuda/11.6 freetype/2.12.1 gdal/3.9.2 r/4.4.1
R
> .libPaths("~/rlibs")
> install.packages("sf", lib = "~/rlibs", type = "source", configure.args=c("--with-sqlite3-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/sqlite/3.45.2/lib", "--with-proj-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/proj/9.4.1/lib64"), repo = "https://cloud.r-project.org/") |
Once installed, sf should run normally and the configure flags above would no longer need to be used.
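To verify that sf linked against the intended system libraries, you can check the versions it was built with and read the sample dataset bundled with the package:
Code Block |
---|
> .libPaths("~/rlibs")
> library(sf)
> sf_extSoftVersion()  # reports the GEOS, GDAL, and PROJ versions sf was built against
> nc <- st_read(system.file("shape/nc.shp", package = "sf"))  # bundled example shapefile |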
R-INLA R package
The R-INLA R package also depends on GDAL.
The R-INLA package can be installed locally under a user’s account with the following steps:
#1
If a conda base environment is activated, deactivate it first so that the R-INLA installation does not conflict with conda:
Code Block |
---|
(base) [netidhere@node ~]$ conda deactivate |
#2
Perform the following module loads for the older gdal/3.8.4 version:
Code Block |
---|
[netidhere@node ~]$ module load gsl/2.7 cuda/11.6 udunits freetype/2.12.1 gdal/3.8.4 r/4.4.0 |
Perform the following module loads for the latest gdal/3.9.2 version:
Code Block |
---|
[netidhere@node ~]$ module load gsl/2.8 udunits/2.2.28-gcc14.2 cuda/11.6 freetype/2.12.1 gdal/3.9.2 r/4.4.1 |
#3
After gdal is loaded, R can be called to install a local version of INLA.
R-INLA is not in the CRAN repository, so it needs the remotes functionality, from either devtools or the standalone remotes package, to install successfully.
If devtools is not installed locally, install it first:
Code Block |
---|
> .libPaths("~/rlibs")
> install.packages("devtools", lib = "~/rlibs", type = "source", repo = "https://cloud.r-project.org/") |
devtools is a very large package and can take a long time to install.
If devtools crashes or fails to install its dependencies, the remotes package can be installed directly instead:
Code Block |
---|
> install.packages("remotes", lib = "~/rlibs", type = "source", repo = "https://cloud.r-project.org/") |
#4
Install SF R package if not already installed:
For gdal/3.8.4:
Code Block |
---|
> .libPaths("~/rlibs")
> install.packages("sf", lib = "~/rlibs", type = "source", configure.args = c("--with-sqlite3-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/sqlite/3.45.2/lib", "--with-proj-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/proj/9.4.0/lib64"), repo = "https://cloud.r-project.org/") |
For gdal/3.9.2:
Code Block |
---|
> .libPaths("~/rlibs")
> install.packages("sf", lib = "~/rlibs", type = "source", configure.args = c("--with-sqlite3-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/sqlite/3.45.2/lib", "--with-proj-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/proj/9.4.1/lib64"), repo = "https://cloud.r-project.org/") |
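The remaining step, installing INLA itself, is not shown above. As a hedged sketch based on the R-INLA project's documented installation route (the repository URL is the official R-INLA download site, not something cluster-specific):
Code Block |
---|
> .libPaths("~/rlibs")
> # Stable build from the official R-INLA repository:
> install.packages("INLA", lib = "~/rlibs", repos = c(getOption("repos"), INLA = "https://inla.r-inla-download.org/R/stable"), dependencies = TRUE)
> # Or pin a specific build with the remotes package (replace <version> as needed):
> # remotes::install_version("INLA", version = "<version>", repos = c(getOption("repos"), INLA = "https://inla.r-inla-download.org/R/testing")) |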
...