...

At the time of writing, the most up-to-date version installed on the cluster is 4.4.2. To load it, run

Code Block
  module load r/4.4.2

To make R 4.4.2 autoload on login (note: this no longer works, ignore):

Code Block
  module initadd r/4.4.2
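
As an alternative (a generic suggestion, assuming a bash login shell, not a command prescribed by this page), the module load line can be added to your ~/.bashrc so it runs at every login:

Code Block
  # Hypothetical ~/.bashrc addition; assumes a bash login shell
  module load r/4.4.2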

Interactive R use with slurm

...

To run an interactive R session with 126 cores using the "general" partition, you will want to do the following:

Code Block
srun -N 1 -n 126 --partition=general --pty bash

You can lower the number of cores if the job waits a long time to be assigned a node.

Once you are in an interactive session, you can load one of the R modules and start working with it interactively.

...

To list available versions of R, type

Code Block
srun -N 1 -n 128 -p general --constraint='epyc128' --pty bash
module avail r

At the time of writing, the most up-to-date version installed on the cluster is 4.4.2. To load it, run

Code Block
  module purge
  module load r/4.4.2
  R
... # your interactive R commands go here
exit

...

Info

A text editor (such as nano, vim, emacs, etc.) could have been used to create the .Rprofile file as well.
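
For reference, a minimal .Rprofile along those lines, assuming the local library folder ~/rlibs used elsewhere on this page, could contain just:

Code Block
  ## Minimal sketch of ~/.Rprofile: prepend the local library to the search path
  .libPaths("~/rlibs")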

Some packages depend on other system libraries and are harder to install locally. For example, sf is a package for working with spatial (GIS) data; it depends on geos, gdal, and proj. For such packages, we recommend either using a container or asking for a global installation.

Global package install

Please submit a ticket with the packages you would like installed and the R version, and the administrators will install them for you.

Submitting jobs

Serial

...

(We do not recommend the R CMD BATCH option; use Rscript instead, either with srun or in a normal bash submission script.)

Assume that you have a script called helloworld.R with these contents:

...
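
The script contents are elided above; a minimal helloworld.R, and a serial submission script to go with it, might look like the following sketch (the script contents, resource requests, and R version here are placeholder assumptions):

Code Block
  # helloworld.R -- hypothetical minimal contents
  message("Hello, world from R!")

Code Block
  #!/bin/bash
  #SBATCH -p general
  #SBATCH -n 1

  source /etc/profile.d/modules.sh
  module purge
  module load r/4.4.2

  Rscript helloworld.R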

To run R in parallel with MPI, first install the Rmpi and snow packages locally (shown here for r/4.2.2 with openmpi/4.1.4):

Code Block
module load r/4.2.2
module load openmpi/4.1.4
R
.libPaths("~/rlibs") # assuming you are installing your 
                     # packages at the ~/rlibs folder
install.packages("Rmpi", lib = "~/rlibs", repo = "https://cloud.r-project.org/",
                 configure.args = "--with-mpi=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/4.1.4/")
install.packages("snow", lib = "~/rlibs", repo = "https://cloud.r-project.org/")

For newer OpenMPI and R combinations, point the configure arguments at the matching MPI installation.

OpenMPI/5.0.2 and r/4.4.0:

Code Block
module load gdal/3.8.4 cuda/11.6 r/4.4.0
R
> .libPaths("~/rlibs")
> install.packages("Rmpi", lib = "~/rlibs", repo = "https://cloud.r-project.org/", configure.args = c("--with-Rmpi-include=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.2/include", "--with-Rmpi-libpath=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.2/lib", "--with-Rmpi-type=OPENMPI", "--with-mpi=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.2"))

OpenMPI/5.0.5 and r/4.4.1:

Code Block
module load gdal/3.9.2 r/4.4.1
R
> .libPaths("~/rlibs")
> install.packages("Rmpi", lib = "~/rlibs", type = "source", repo = "https://cloud.r-project.org/", configure.args = c("--with-Rmpi-include=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.5/include", "--with-Rmpi-libpath=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.5/lib", "--with-Rmpi-type=OPENMPI", "--with-mpi=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/5.0.5"))

To submit an MPI slurm job, we created the submit-mpi.slurm file (see code below). It is important to load the module associated with the MPI implementation you used to install Rmpi.

Code Block
#!/bin/bash
#SBATCH -p general
#SBATCH -n 30

source /etc/profile.d/modules.sh
module purge
module load r/4.2.2 openmpi/4.1.4

# If MPI tells you that forking is bad uncomment the line below 
# export OMPI_MCA_mpi_warn_on_fork=0

Rscript mpi.R

Now create the mpi.R script:

Code Block
library(parallel)

.libPaths("~/rlibs")

hello_world <- function() {
    ## Print the hostname and MPI worker rank.
    paste(Sys.info()["nodename"],Rmpi::mpi.comm.rank(), sep = ":")
}

cl <- makeCluster(as.numeric(Sys.getenv("SLURM_NTASKS")), type = "MPI") # one MPI worker per Slurm task
clusterCall(cl, hello_world)
stopCluster(cl)

Run the script with:

Code Block
sbatch submit-mpi.slurm

In your slurm output you will see a message from each of the MPI workers.

Read R's built-in "parallel" package documentation for tips on parallel programming in R: https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf
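
For jobs that stay on a single node, the same parallel package also offers fork-based helpers that avoid MPI entirely. A minimal sketch (assuming your job sets SLURM_CPUS_PER_TASK; the workload is a placeholder):

Code Block
library(parallel)

## Size the worker pool from the CPUs Slurm allocated to this task,
## falling back to 1 if the variable is unset.
ncores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", "1"))

## Square 100 numbers in parallel across forked workers.
results <- mclapply(1:100, function(i) i^2, mc.cores = ncores)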

RCurl with sftp functionality

Code Block
module load libiconv/1.17 udunits gdal/3.6.0 r/4.2.2

source /gpfs/sharedfs1/admin/hpc2.0/apps/gdal/3.6.0/spack/share/spack/setup-env.sh

spack load gdal

module load libcurl/8.6.0

R

> .libPaths("~/rlibs")

> install.packages("RCurl", lib = "~/rlibs", repo = "https://cloud.r-project.org/")

> library(RCurl)
>
> curlVersion()$protocols
 [1] "dict"    "file"    "ftp"     "ftps"    "gopher"  "gophers" "http"
 [8] "https"   "imap"    "imaps"   "mqtt"    "pop3"    "pop3s"   "rtsp"
[15] "scp"     "sftp"    "smb"     "smbs"    "smtp"    "smtps"   "telnet"
[22] "tftp"

SF R package

Because the gdal dependency tree was built from source, the SF R package has issues picking up the sqlite3 and proj paths set by the modules loaded on the HPC.

To bypass the issue, certain configure flags need to be set within the R install.packages command that is used to install the SF package.

SF has replaced the now-deprecated rgdal package and is recommended going forward.

To install the SF R package under a local HPC directory, load the following modules and run the following R commands:

Code Block
module load udunits gdal/3.8.4 r/4.3.2

R
> .libPaths("~/rlibs")
> install.packages("sf", lib = "~/rlibs", type = "source", configure.args = c("--with-sqlite3-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/sqlite/3.45.2/lib", "--with-proj-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/proj/9.4.0/lib64"), repo = "https://cloud.r-project.org/")

The above install.packages command should be successful.

Here are the steps using the gdal/3.9.2 version:

Code Block
module load gsl/2.8 udunits/2.2.28-gcc14.2 cuda/11.6 freetype/2.12.1 gdal/3.9.2 r/4.4.1
R
> .libPaths("~/rlibs")
> install.packages("sf", lib = "~/rlibs", type = "source", configure.args=c("--with-sqlite3-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/sqlite/3.45.2/lib", "--with-proj-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/proj/9.4.1/lib64"), repo = "https://cloud.r-project.org/")

Once installed, sf should run normally, and the configure flags above are no longer needed.
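
To confirm which libraries sf actually linked against, its built-in sf_extSoftVersion() helper can be used (nothing here is specific to this cluster):

Code Block
> library(sf)
> # Reports the GEOS, GDAL, and PROJ versions sf was built against
> sf_extSoftVersion()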

R-INLA R package

The R-INLA R package also depends on GDAL.

The R-INLA package can be installed locally under a user’s account with the following steps:

#1

If a conda base environment is activated, deactivate it first so that installing R-INLA does not conflict with conda:

Code Block
(base) [netidhere@node ~]$ conda deactivate

#2

Perform the following module loads for the older gdal/3.8.4 version:

Code Block
[netidhere@node ~]$ module load gsl/2.7 cuda/11.6 udunits freetype/2.12.1 gdal/3.8.4 r/4.4.0

Perform the following module loads for the latest gdal/3.9.2 version:

Code Block
[netidhere@node ~]$ module load gsl/2.8 udunits/2.2.28-gcc14.2 cuda/11.6 freetype/2.12.1 gdal/3.9.2 r/4.4.1

#3

After gdal is loaded, R can be started to install a local copy of INLA.

Because R-INLA is not in the CRAN repository, it needs the remotes functionality, from either devtools or the standalone remotes package, to install successfully.

If devtools is not installed locally, install it first before installing R-INLA:

Code Block
> .libPaths("~/rlibs")
> install.packages("devtools", lib = "~/rlibs", type = "source", repo = "https://cloud.r-project.org/")

Devtools is a very large package and can take a long time to install.

If devtools crashes and fails to install its dependencies, the remotes R package can be installed directly instead:

Code Block
> install.packages("remotes", lib = "~/rlibs", type = "source", repo = "https://cloud.r-project.org/")

#4

Install SF R package if not already installed:

For gdal/3.8.4:

Code Block
> .libPaths("~/rlibs")
> install.packages("devtoolssf", lib = "~/rlibs", type = "source", repo = "https://cloud.r-project.org/")


For gdal/3.9.2:

Code Block
> .libPaths("~/rlibs")
> install.packages("sf", lib = "~/rlibs", type = "source", configure.args=c("--with-sqlite3-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/sqlite/3.45.2/lib", "--with-proj-lib=/gpfs/sharedfs1/admin/hpc2.0/apps/proj/9.4.01/lib64"), repo = "https://cloud.r-project.org/")

...