Data Storage Guide

The HPC team is updating this page. Check back for new information.

HPC Storage (short term)

The Storrs HPC cluster offers several local high-performance data storage options for use during job execution and for short-term storage of job results. None of the storage options listed below is permanent, and none should be used for long-term archival of data. See the next section for permanent data storage options that offer greater resiliency.

Scratch

  • Path: /scratch

  • Default quota: 1 TB per PI, shared with associated users

  • Relative performance: Fastest

  • Persistence: None; data is deleted after 60 days

  • Backed up: No

  • Purpose: Fast parallel storage for use during computation

Home

  • Path: ~

  • Default quota: 300 GB

  • Relative performance: Fast

  • Persistence: Yes

  • Backed up: Yes; daily snapshots, stored for 30 days

  • Purpose: Personal storage, available on every node

Group

  • Path: /shared

  • Default quota: 1 TB per PI, shared with associated users; expandable by request

  • Relative performance: Fast

  • Persistence: Yes

  • Backed up: Yes; daily snapshots, stored for 30 days

  • Purpose: Long-term group storage for collaborative work (by request)

Notes

  • Data deletion inside the /scratch folder is based on file modification time.

  • Scratch data is transient and is purged after 60 days. Once data is no longer needed for computation, it should be immediately transferred to /shared. Do not use the scratch file system (/scratch) for long-term storage.

  • Certain directories, namely /home and /shared, are only mounted on demand by autofs. Shell commands such as ls may fail on these directories while they are unmounted. They are mounted automatically when you access a file under the directory or cd into the directory structure.

  • HPC no longer uses /archive folders. Rather, aged data stored on /shared will be moved to a long-term tier on the backend. This process is automatic and invisible to you.

  • If you need more space, you can create compressed archives (“tarballs”) of large folders using a command similar to tar -zcvf compressedFileName.tar.gz folderToCompress. You can then list the files within the tarball using tar -tzvf compressedFileName.tar.gz. (See the tar man page with the man tar command.)
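The tarball workflow from the last note can be sketched end to end. This is a minimal example; the folder name results_demo is a placeholder standing in for one of your own large folders.

```shell
# Placeholder folder standing in for a real results directory
mkdir -p results_demo
echo "sample output" > results_demo/output.txt

# -z: gzip compression, -c: create, -v: verbose, -f: archive file name
tar -zcvf results_demo.tar.gz results_demo

# List the archive's contents without extracting (-t: list)
tar -tzvf results_demo.tar.gz
```

Once the tarball is verified, the original uncompressed folder can be removed to reclaim space.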

Long Term Data Storage

Once data is no longer needed for computation, it should be transferred off /scratch to a permanent data storage location under /shared. Do not use the scratch file system (/scratch) for long-term storage; it is optimized for fast parallel access from many nodes, and its space is too scarce and too expensive for long-term storage.
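Moving finished results off scratch can be as simple as a copy followed by a verification before deleting the scratch copy. The paths below are local placeholders standing in for /scratch and /shared directories; substitute your own.

```shell
# Placeholder directories standing in for /scratch/<user>/run42 and /shared/<group>
mkdir -p scratch_demo/run42 shared_demo
echo "result" > scratch_demo/run42/out.dat

# Copy the run off "scratch"; -a preserves permissions and timestamps
cp -a scratch_demo/run42 shared_demo/

# Verify the copy matches before removing the scratch copy
diff -r scratch_demo/run42 shared_demo/run42 && rm -rf scratch_demo/run42
```

Verifying with diff before deleting guards against an interrupted or partial copy.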

If you need to back up data to an external location, you can transfer it in two ways. The faster way uses standard Unix utilities (such as cp and tar) run on the HPC nodes and is suitable for small transfers. For larger data sets, the best option is the Globus service. Globus performance depends on system traffic and network conditions; however, Globus transfers persist after you disconnect, which makes it more suitable for large transfers.
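For the small-transfer case, a common Unix-utility pattern is streaming a folder through tar so it arrives unpacked at the destination. The example below simulates this locally with placeholder src and dest directories; over a network you would insert an ssh invocation between the two tar commands (e.g. ... | ssh user@host "tar -C /backups -xf -"), with the host and paths replaced by your own.

```shell
# Placeholder source and destination directories
mkdir -p src/project dest
echo "data" > src/project/file.txt

# Stream the folder through tar: the first tar packs, the second unpacks.
# -C changes directory first, so only the relative path "project" is archived.
tar -C src -cf - project | tar -C dest -xf -
```

The pipe avoids writing an intermediate archive file, which saves space on quota-limited file systems.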

For information on how to best organize your backups, see our page on Backing Up Your Data.