Data Storage Guide
The HPC team is updating this page. Check back for new information.
HPC Storage (short term)
The Storrs HPC cluster has several local high-performance data storage options available for use during job execution and for the short-term storage of job results. None of the cluster storage options listed in the table below should be considered permanent, and none should be used for long-term archival of data. See the next section for permanent data storage options that offer greater resiliency.
Name | Path | Default Quota | Relative Performance | Persistence | Backed up? | Purpose |
---|---|---|---|---|---|---|
Scratch | | 1TB per PI, shared with associated users | Fastest | None, deleted after 60 days | No | Fast parallel storage for use during computation |
Home | | 300GB | Fast | Yes | Snapshots, daily and stored for 30 days | Personal storage, available on every node |
Group | | 1TB per PI, shared with associated users; expandable by request | Fast | Yes | Snapshots, daily and stored for 30 days | By request. Long-term group storage for collaborative work |
Notes
Data deletion inside the /scratch folder is based on file modification time. Scratch data is transient and is purged after 60 days. Once data is no longer needed for computation, it should be transferred immediately to /shared. Do not use the scratch file system (/scratch) for long-term storage.
Certain directories, namely /home and /shared, are only mounted on demand by autofs. Shell commands like ls may fail on these directories until they are mounted. They are mounted only when an attempt is made to access a file under the directory, or when cd is used to enter the directory structure.
HPC no longer uses /archive folders. Rather, aged data stored on /shared will be moved to a long-term tier on the backend. This process is automatic and invisible to you.
If you need more space, you can create compressed archives (i.e. "tarballs") of large folders using a command similar to
tar -zcvf compressedFileName.tar.gz folderToCompress
You can then list the files within the tarball using
tar -tzvf compressedFileName.tar.gz
See the tar man page, available with the man tar command.
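The tar commands above can be tried end to end on a throwaway folder. The following is a minimal sketch, not an official procedure: the temporary working directory and the sample file names (result.txt, job.log) are made up for illustration, and the final comparison step is an extra precaution rather than something this guide requires.

```shell
# Hypothetical working area; on the cluster this would be a folder you own.
workdir="$(mktemp -d)"
mkdir -p "$workdir/folderToCompress/run1"
echo "sample output" > "$workdir/folderToCompress/run1/result.txt"
echo "sample log" > "$workdir/folderToCompress/job.log"

# Create a gzip-compressed archive; -C keeps the stored paths relative to $workdir.
tar -zcf "$workdir/compressedFileName.tar.gz" -C "$workdir" folderToCompress

# List the archive contents and filter for a file name of interest.
matches="$(tar -tzf "$workdir/compressedFileName.tar.gz" | grep 'result.txt')"
echo "$matches"

# Extract one member into a separate folder and confirm it is identical
# to the original before removing any uncompressed data.
mkdir -p "$workdir/restore"
tar -zxf "$workdir/compressedFileName.tar.gz" -C "$workdir/restore" \
    folderToCompress/run1/result.txt
cmp "$workdir/folderToCompress/run1/result.txt" \
    "$workdir/restore/folderToCompress/run1/result.txt" && echo "archive member matches original"
```

Checking an extracted copy against the original before deleting the uncompressed folder is a cautious habit, since a truncated archive is otherwise easy to miss.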
Long Term Data Storage
Once data is no longer needed for computation, it should be transferred off of /scratch to a permanent data storage location under /shared. Do not use the scratch file system (/scratch) for long-term storage; it is optimized for fast parallel access from multiple computers, and is too scarce and too expensive for long-term storage.
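Because scratch data is purged based on file modification time, it can help to find files that are approaching the 60-day limit and copy them out before they disappear. The sketch below is one possible approach, not a cluster-provided tool. The two temporary directories stand in for your own folders under /scratch and /shared, the 45-day threshold is an arbitrary safety margin, and the touch -d syntax assumes GNU coreutils.

```shell
# Hypothetical stand-ins; replace with your own folders under /scratch and /shared.
scratch_dir="$(mktemp -d)"
dest_dir="$(mktemp -d)"

# Sample files: one freshly modified, one backdated 50 days (GNU touch syntax).
touch "$scratch_dir/recent.dat"
touch -d "50 days ago" "$scratch_dir/aging.dat"

# -mtime +45 matches files last modified more than 45 days ago.
# The 45-day threshold is an assumption chosen to leave time before the purge.
at_risk="$(find "$scratch_dir" -type f -mtime +45)"
echo "$at_risk"

# Copy the aging files to the destination, preserving timestamps with -p.
find "$scratch_dir" -type f -mtime +45 -exec cp -p {} "$dest_dir/" \;
```

Note that this copy flattens the directory layout, so files with the same name in different subfolders would overwrite each other; for nested results, archiving the folder with tar first may be a better fit.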
If you need to back up data to an external location, you can transfer it in two ways. The faster way uses standard Unix utilities (such as cp and tar) run on the HPC nodes, and is suitable for small transfers. The best way to transfer larger data sets is the Globus service. Globus performance depends on system traffic and network performance; however, Globus transfers continue after you disconnect, which makes the service more suitable for large transfers.
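For the small-transfer case, a backup with standard utilities might look like the sketch below. It assumes the backup destination is reachable as an ordinary directory path, which may not match your setup. The temporary directories and the results.tar.gz name are placeholders, and the checksum step relies on sha256sum from GNU coreutils being available.

```shell
# Hypothetical stand-ins for a results folder and a mounted backup location.
src_dir="$(mktemp -d)"
backup_dir="$(mktemp -d)"
echo "final results" > "$src_dir/summary.txt"

# Bundle the folder into a single archive so only one file has to be copied.
archive="$(mktemp -d)/results.tar.gz"
tar -zcf "$archive" -C "$src_dir" .

# Copy the archive to the backup location.
cp "$archive" "$backup_dir/results.tar.gz"

# Compare checksums of the source and the copy to confirm the transfer.
src_sum="$(sha256sum "$archive" | cut -d' ' -f1)"
dst_sum="$(sha256sum "$backup_dir/results.tar.gz" | cut -d' ' -f1)"
if [ "$src_sum" = "$dst_sum" ]; then
    status="verified"
else
    status="mismatch"
fi
echo "$status"
```

A copy made this way stops if your session ends, which is one reason the approach suits small transfers only.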
For information on how to best organize your backups, see our page on Backing Up Your Data.