Storage Services in the Protected Environment
The Center for High Performance Computing (CHPC) offers four types of encrypted storage within the Protected Environment (PE) based on your project's needs: home directories, project space, scratch file systems, and an archive storage system.
See the Data Transfer Services page for information on moving data to and from the CHPC PE storage.
| Please remember that you should always have additional copies of any critical data on independent storage systems. While storage systems built with data resiliency mechanisms (such as RAID and erasure coding mentioned in the offerings listed below, or other similar technologies) tolerate multiple component failures, they do not offer any protection against large-scale hardware failures, software failures leading to corruption, or the accidental deletion or overwriting of data. Please take the necessary steps to protect your data to the level you deem necessary. |
Home Directories
The CHPC provides everyone in the Protected Environment (PE) with a free 50GB home directory. This space is backed up; for details on the backup schedule, see 3.1 File Storage Policies.
The CHPC does not offer larger home directories in the PE. Instead, users should make use of project spaces to store data.
The 50 GB cap on every home directory is enforced with a two-level quota: a soft quota of 50 GB, which gives you a maximum of seven days to bring your home directory back under 50 GB, and a hard quota of 75 GB, which blocks all writes to your home directory until it is back under the 50 GB cap.
| When over quota, you will not be able to start a FastX or Open OnDemand session. An SSH session can be used to connect to the CHPC and clean up your home directory. To find which files are taking up space, use the command ncdu. |
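As a rough illustration of the quota levels and the cleanup step described above, the sketch below classifies a usage figure against the 50 GB soft and 75 GB hard limits and lists the largest entries in a directory with du, which may be handy if ncdu is not available. The function names quota_state and largest_entries are hypothetical. Counting 1 GB as 1024 * 1024 KB and the exact behavior at the boundaries are assumptions, not a description of how the CHPC quota is actually implemented.

```shell
# Hypothetical helpers. Limits come from the text above; treating
# 1 GB as 1024 * 1024 KB is an assumption about how usage is counted.
SOFT_KB=$((50 * 1024 * 1024))   # 50 GB soft quota, in KB
HARD_KB=$((75 * 1024 * 1024))   # 75 GB hard quota, in KB

# quota_state USED_KB: print "ok", "soft" (grace period), or "hard" (writes blocked).
quota_state() {
    if [ "$1" -ge "$HARD_KB" ]; then
        echo "hard"
    elif [ "$1" -gt "$SOFT_KB" ]; then
        echo "soft"
    else
        echo "ok"
    fi
}

# largest_entries DIR [N]: print the N largest items directly under DIR (sizes in KB).
largest_entries() {
    find "$1" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn | head -n "${2:-10}"
}

# Example usage from an SSH session:
#   quota_state "$(du -sk "$HOME" | cut -f1)"
#   largest_entries "$HOME" 10
```

Because largest_entries only looks one level deep, it can be re-run on whichever subdirectory turns out to be the largest.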
Project Storage
The CHPC provides project space for groups needing to store project-specific research data that is sensitive in nature. There are different tiers of storage offerings*, depending upon the needs of your project:
- 250 GB, provided for free. If backups are required, there is a one-time charge of $75
- By the TB, without automatic backups, at a rate of $150/TB
- By the TB, with automatic backups, at a rate of $450/TB
*A single purchase of storage is good for 5 years.
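To make the pricing arithmetic concrete, here is a minimal sketch that computes the one-time cost of a paid allocation from the per-TB rates listed above. The helper names are hypothetical, it handles whole terabytes only, and it does not cover the free 250 GB tier or its $75 backup charge.

```shell
# Rates from the list above (USD per TB for one 5-year purchase).
# The helper names are hypothetical.
RATE_NO_BACKUP=150
RATE_WITH_BACKUP=450
TERM_YEARS=5

# project_storage_cost TB [backup]: print the one-time cost in USD.
project_storage_cost() {
    if [ "${2:-}" = "backup" ]; then
        echo $(( $1 * $RATE_WITH_BACKUP ))
    else
        echo $(( $1 * $RATE_NO_BACKUP ))
    fi
}

# annualized_cost TOTAL_USD: spread a one-time cost over the 5-year term.
annualized_cost() {
    echo $(( $1 / $TERM_YEARS ))
}

# Example usage:
#   project_storage_cost 4 backup   # prints 1800
#   annualized_cost 1800            # prints 360
```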
For details on the current backup policy of the PE project space, see 3.1 File Storage Policies.
Access to project space is controlled such that only people that are part of the project are allowed access to the space. Only the project PI, or their designated delegates, can add or remove persons from their CHPC-hosted projects.
| For IRB-governed projects, the persons given access must also be listed as study personnel on the IRB record. |
Project space is intended only for storing data and data outputs, not for handling I/O from computational jobs. Jobs that use data stored in project space should work from the scratch space instead for the duration of the job. Methods for utilizing the scratch space are described here.
Purchasing Project Storage
If your project is already hosted in the CHPC PE, you can request additional storage (and backups) by filling out the storage request form in Portal. When submitting the request, please indicate the project this is in reference to.
If your project is new to the CHPC and does not yet have any project storage, please fill out a new project request form in Portal and let us know what your storage requirements are in that request.
Scratch File Systems
The CHPC provides a high-performance scratch space that is freely available to everyone with accounts in the Protected Environment (PE). There are two scratch file systems available:
- /scratch/general/pe-nfs1, a 280 TB NFS system accessible from all PE resources
  - There is a per-user quota of 100 TB on this scratch file system
- /scratch/general/pevast, a 100 TB flash-based file system available from all PE resources
  - There is a per-user quota of 10 TB on this scratch file system
| Scratch space is not intended for long-term file storage and, as such, files in scratch spaces are deleted automatically after a period of inactivity. The scratch file systems are not backed up, so please plan accordingly. |
It is recommended to use the scratch space for the duration of all computational jobs. Data should be transferred from the project to scratch spaces when running jobs, as the scratch systems are designed for better performance and this prevents project spaces from becoming overwhelmed.
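The staging pattern recommended above might look like the following sketch: copy inputs from project space to a scratch work area, run the job there, then copy the results back. The function name stage_and_run and the input/, output/, and results/ directory layout are assumptions made for illustration, and both paths in the usage comment are placeholders.

```shell
# stage_and_run PROJECT_DIR SCRATCH_DIR COMMAND [ARGS...]
# Copies PROJECT_DIR/input to SCRATCH_DIR, runs COMMAND from SCRATCH_DIR,
# then copies SCRATCH_DIR/output back to PROJECT_DIR/results.
# The input/, output/ and results/ layout is an assumption for illustration.
# The body runs in a subshell so its variables do not leak into the caller.
stage_and_run() (
    project_dir=$1
    scratch_dir=$2
    shift 2

    mkdir -p "$scratch_dir/output" || exit 1
    cp -R "$project_dir/input" "$scratch_dir/" || exit 1

    # Run the job from scratch so its I/O stays off the project space.
    ( cd "$scratch_dir" && "$@" ) || exit 1

    mkdir -p "$project_dir/results" || exit 1
    cp -R "$scratch_dir/output/." "$project_dir/results/"
)

# Example inside a job script. The project path, the per-user subdirectory
# under the scratch mount, and my_analysis are all placeholders:
#   stage_and_run /path/to/project "/scratch/general/pe-nfs1/$USER/$SLURM_JOB_ID" my_analysis
```

Since scratch files are deleted after a period of inactivity, copying results back as the last step of the job keeps the only long-lived copy in project space.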
If you have questions about using the scratch file systems or IO-intensive jobs, please contact the CHPC at helpdesk@chpc.utah.edu.
Temporary File Systems
/scratch/local
/scratch/local is beneficial due to lower latency in I/O processing. Each node on the cluster has a local disk mounted at /scratch/local that can be used for storing intermediate files during calculation for the duration of a job.
The CHPC prevents writing in the top-level /scratch/local directory. Instead, when you submit a job, our systems automatically create a directory on the node at /scratch/local/$USER/$SLURM_JOB_ID. Only the job owner can access this directory.
At the end of the job, /scratch/local/$USER/$SLURM_JOB_ID is automatically removed. Files in /scratch/local/$USER/$SLURM_JOB_ID that are required after job completion should be moved to another file system (such as home, group, or scratch space) before the end of the job.
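Because the per-job directory is removed when the job ends, results need to be copied out while the job is still running. The sketch below shows one way to do that. The function names and the LOCAL_SCRATCH_ROOT override are hypothetical; only the /scratch/local/$USER/$SLURM_JOB_ID path comes from the description above.

```shell
# local_job_dir: print this job's node-local directory, as described above.
# LOCAL_SCRATCH_ROOT is a hypothetical override for trying the sketch elsewhere.
local_job_dir() {
    echo "${LOCAL_SCRATCH_ROOT:-/scratch/local}/$USER/$SLURM_JOB_ID"
}

# save_local_results DEST: copy everything in the job's local directory to DEST.
save_local_results() {
    mkdir -p "$1" && cp -R "$(local_job_dir)/." "$1/"
}

# In a job script, one option is to run the copy whenever the script exits,
# including after a failed step (the destination is a placeholder):
#   trap 'save_local_results /path/to/dest' EXIT
# An EXIT trap does not run if the job is killed outright, so an explicit
# call at the end of the script is the more predictable choice.
```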
| /scratch/local is software-encrypted. |
/tmp and /var/tmp
Linux defines temporary file systems at /tmp and /var/tmp. CHPC cluster nodes set up these temporary file systems as a RAM disk with limited capacity.
The CHPC recommends not using /tmp or /var/tmp as your TMPDIR. Instead, set the environment variable TMPDIR to point to /scratch/local, using your per-job directory /scratch/local/$USER/$SLURM_JOB_ID because the top-level directory is not writable. Local disk drives (i.e., /scratch/local) range from 40 to 500 GB depending on the node, which is much more than the default /tmp or /var/tmp size.
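A minimal sketch of the TMPDIR recommendation follows. Since writing to the top level of /scratch/local is not permitted, it points TMPDIR at the per-job directory described earlier. The function name use_local_tmpdir and the LOCAL_SCRATCH_ROOT override are hypothetical.

```shell
# use_local_tmpdir: point TMPDIR at the per-job directory under /scratch/local
# when it exists and is writable; otherwise leave TMPDIR unchanged and fail.
# LOCAL_SCRATCH_ROOT is a hypothetical override for trying this outside a job.
use_local_tmpdir() {
    job_tmp="${LOCAL_SCRATCH_ROOT:-/scratch/local}/$USER/$SLURM_JOB_ID"
    if [ -d "$job_tmp" ] && [ -w "$job_tmp" ]; then
        TMPDIR=$job_tmp
        export TMPDIR
    else
        echo "warning: $job_tmp not usable; TMPDIR left as '${TMPDIR:-unset}'" >&2
        return 1
    fi
}

# Example in a job script. Tools that honor TMPDIR (such as mktemp) then
# create their temporary files on the node-local disk:
#   use_local_tmpdir && scratch_file=$(mktemp)
```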
Archive Storage: Elm
The CHPC offers an archive storage solution based on Ceph, a distributed object storage suite originally developed at UC Santa Cruz. A more detailed description of this storage offering is available.
This space is a standalone entity that is not mounted on other CHPC PE resources. One key feature of the archive system is that users can manage their data archive directly. Ceph presents the storage as an S3 endpoint, which allows the archive storage solution to be accessed via applications that use Amazon's S3 API, such as s3cmd and rclone.
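As an illustration of managing an archive directly over the S3 API, the sketch below wraps a few basic s3cmd operations. The wrapper names, bucket, and prefix are placeholders, and the sketch assumes s3cmd has already been configured (for example with `s3cmd --configure`) with the endpoint and credentials for your archive allocation, which are not given here.

```shell
# Hypothetical wrappers around s3cmd. Bucket and prefix names are placeholders,
# and s3cmd is assumed to be configured already for the archive endpoint.

# elm_upload FILE BUCKET PREFIX: upload one file to s3://BUCKET/PREFIX/
elm_upload() {
    s3cmd put "$1" "s3://$2/$3/"
}

# elm_list BUCKET [PREFIX]: list objects under s3://BUCKET/PREFIX
elm_list() {
    s3cmd ls "s3://$1/${2:-}"
}

# elm_download BUCKET KEY DEST: fetch s3://BUCKET/KEY into the local path DEST
elm_download() {
    s3cmd get "s3://$1/$2" "$3"
}

# Example usage:
#   elm_upload results.tar my-bucket project-a
#   elm_list my-bucket project-a
```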
The CHPC can provide your group archive storage at a rate of $150/TB. Archive storage is a single purchase that is good for 5 years.
Elm is currently the backend storage used for CHPC-provided automatic backups (e.g., backed-up project or home space); as such, groups looking for additional data resiliency that already have spaces backed up by the CHPC may want to look for other options.
User-Driven Backup Options
It is always recommended to have multiple copies of critical data held across multiple independent systems. The University of Utah provides additional methods of storing HIPAA-protected data. Campus-level options for a backup location include Box and Microsoft OneDrive.
| There is a UIT Knowledge Base article with information on the suitability of the campus level options for different types of data (public/sensitive/restricted). Please follow these university guidelines to determine a suitable location for your data. |
Another backup option is the CHPC archive storage system, Elm. If you choose to use a method for storing sensitive or protected data outside of the CHPC or other University-approved locations, it is your responsibility to ensure that the data is stored in an appropriate location.
If you are considering a user-driven backup option for your data, CHPC staff are available for consultation at helpdesk@chpc.utah.edu.
There are a number of tools that can be used to transfer data for backup. Rclone is the tool best suited for file transfers to object storage file systems. Other tools include fpsync, a parallel version of rsync suited for transfers between typical Linux "POSIX-like" file systems, and Globus, best suited for transfers to and from resources outside of the CHPC.
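For a user-driven backup to object storage, a copy followed by a verification pass is one reasonable pattern. The sketch below assumes an rclone remote has already been set up with `rclone config`. The function name, the remote name "elm", and the bucket in the usage comment are placeholders.

```shell
# backup_with_rclone SRC DEST
# DEST is an rclone destination of the form remote:bucket/path. The remote
# must already exist in your rclone configuration.
backup_with_rclone() {
    # Copy new and changed files; files already on DEST are left in place.
    rclone copy "$1" "$2" || return 1
    # Verify that every source file is present and matches on DEST.
    rclone check --one-way "$1" "$2"
}

# Example (remote name and bucket are placeholders):
#   backup_with_rclone /path/to/project elm:my-bucket/project-backup
# To preview a transfer without copying anything:
#   rclone copy --dry-run /path/to/project elm:my-bucket/project-backup
```

The check uses --one-way because rclone copy never deletes anything on the destination, so files from earlier backups that remain there should not count as mismatches.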
Additional Information
For more information on CHPC data policies, visit the File Storage Policies page.