The HPC team is updating this page. Check back for new information.
Quarterly Planned Downtime
Storrs HPC performs cluster maintenance on a regular schedule throughout the calendar year. On the days listed below, the cluster is scheduled for downtime so that this work can be done. Downtime begins at 7:00 am and is planned to last 8 hours, although the actual duration may vary.
Maintenance tasks include, but are not limited to:
· Firmware updates
· Kernel updates
· Network maintenance
· Major application reconfiguration
· Data Center operations
· Major hardware repair, reconfiguration, or installation
Detailed messages will be sent via the STORRS-HPC_L mailing list 14 days and 7 days before each downtime. Please add this sender to your email whitelist so that these messages are not marked as spam. If you need to be added to the list, please send a request to hpc@uconn.edu.
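As a small illustration of the notice schedule described above, the following sketch computes the two reminder dates for a given downtime day. The function name `notice_dates` and the sample date are hypothetical and used only for illustration; the actual send dates are determined by the HPC team.

```python
from datetime import date, timedelta


def notice_dates(downtime_day: date) -> tuple[date, date]:
    """Return the dates 14 days and 7 days before a downtime day."""
    return (
        downtime_day - timedelta(days=14),
        downtime_day - timedelta(days=7),
    )


# Hypothetical downtime day, chosen only for illustration.
first_notice, second_notice = notice_dates(date(2025, 2, 18))
print(first_notice.isoformat(), second_notice.isoformat())
```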
Regularly scheduled maintenance days are listed below:
Scheduled Maintenance Day |
---|
Third Tuesday of February |
Third Tuesday of May |
Third Tuesday of August |
Third Tuesday of November |
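To find the calendar dates for a particular year, the rule in the table above can be sketched in Python as follows. This is a minimal example that uses only the standard library; the function names are illustrative and not part of any Storrs HPC tooling, and the announced dates on the mailing list take precedence if they differ.

```python
import calendar
from datetime import date

# Months listed in the table above: February, May, August, November.
MAINTENANCE_MONTHS = (2, 5, 8, 11)


def third_tuesday(year: int, month: int) -> date:
    """Return the third Tuesday of the given month."""
    first_weekday = date(year, month, 1).weekday()  # Monday == 0
    # Number of days from the 1st of the month to its first Tuesday (0 to 6).
    offset = (calendar.TUESDAY - first_weekday) % 7
    # The third Tuesday falls two weeks after the first one.
    return date(year, month, 1 + offset + 14)


def quarterly_maintenance_days(year: int) -> list[date]:
    """Return the four regularly scheduled maintenance days for a year."""
    return [third_tuesday(year, month) for month in MAINTENANCE_MONTHS]


for day in quarterly_maintenance_days(2025):
    print(day.isoformat())
```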
Weekly Planned Maintenance
Weekly maintenance does not require downtime. Jobs typically continue to run and the scheduler remains available, so no advance notice is provided except under special circumstances. Tasks include, but are not limited to:
· OS security updates
· Minor configuration changes
· Minor hardware repairs
The following tasks are performed on a regular basis:
· Login nodes reboot and update every Wednesday at 6:00 am EDT/EST.
· OSG nodes reboot and update every other Thursday at 10:30 am EDT/EST.
· 25% of the compute and GPU nodes are scheduled for reboots and updates each Tuesday and each Thursday; a scheduled node reboots once its running jobs finish. With a different 25% scheduled each time, every node is effectively scheduled to reboot once every 2 weeks.
Example calendar:
Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
---|---|---|---|---|---|---|
 | | 1/4 cluster scheduled reboots | Login Nodes reboot | 1/4 cluster scheduled reboots; OSG nodes reboot | | |
 | | 1/4 cluster scheduled reboots | Login Nodes reboot | 1/4 cluster scheduled reboots | | |
Repeat forever |
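The two-week pattern in the example calendar can be expressed as a small lookup, which may help when checking whether a given day falls on a scheduled reboot. This is only a sketch: the `cycle_start` date (the Sunday that begins the first week of a cycle) is an assumption that must be supplied, since this page does not state which calendar week the cycle starts on, and the names `CYCLE` and `events_on` are hypothetical.

```python
from datetime import date, timedelta

# Events keyed by (week of the cycle, weekday), with Monday == 0,
# copied from the example calendar above.
CYCLE = {
    (0, 1): ["1/4 cluster scheduled reboots"],                      # Tuesday, week 1
    (0, 2): ["Login nodes reboot"],                                 # Wednesday, week 1
    (0, 3): ["1/4 cluster scheduled reboots", "OSG nodes reboot"],  # Thursday, week 1
    (1, 1): ["1/4 cluster scheduled reboots"],                      # Tuesday, week 2
    (1, 2): ["Login nodes reboot"],                                 # Wednesday, week 2
    (1, 3): ["1/4 cluster scheduled reboots"],                      # Thursday, week 2
}


def events_on(day: date, cycle_start: date) -> list[str]:
    """Return the weekly-maintenance events scheduled on `day`.

    `cycle_start` is assumed to be a Sunday that begins week 1 of a cycle.
    """
    days_into_cycle = (day - cycle_start).days % 14
    week = days_into_cycle // 7
    return CYCLE.get((week, day.weekday()), [])


# Hypothetical cycle start (a Sunday), chosen only for illustration.
start = date(2025, 1, 5)
for n in range(14):
    current = start + timedelta(days=n)
    print(current.isoformat(), current.strftime("%A"), events_on(current, start))
```

Because the offset is taken modulo 14, the same lookup works for any date before or after the assumed start, which mirrors the "repeat forever" row of the calendar.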