Service Alert: Scheduled Alpha Cluster Downtime 3/23/26 **COMPLETED**

Posted 14 days ago by Cesar Arias

  • Pinned Topic
Cesar Arias Admin

:rotating_light: Status Update: Alpha cluster maintenance has been extended. The team is completing additional networking changes necessary for the future Beta rollout. We’ll post another update as soon as the cluster is back online.


Scheduled Maintenance: Alpha Cluster Full System Shutdown

Nature of Work: Power event to complete remaining power work from 3/10.

Impact Level: Full Cluster Shutdown (Compute, Login, and Storage Offline).


Maintenance Schedule

  • Starts: Monday, March 23 @ 8:00 AM

  • Ends: Monday, March 23 @ 6:00 PM


Job Management

To ensure a smooth transition and protect your active research, we are using Slurm’s automated scheduling tools:

  • Reservations: A system-wide maintenance reservation will be in place for the duration of the event. If you submit a job whose requested time limit would run past 8:00 AM on 3/23, the scheduler will automatically hold it in a pending state.

  • Resume Policy: Once the cluster is powered back on and the system is verified, all pending and held jobs will automatically resume their position in the queue. Any jobs still running at the start of the window will be requeued and will restart automatically from the beginning.
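
To estimate whether a job can still run before the window, a rough sketch of the reservation rule above is shown here. It assumes the job starts the moment it is submitted (queue wait is ignored), that times are cluster-local, and that a job ending exactly at 8:00 AM is allowed; the function names are illustrative and are not part of Slurm.

```python
from datetime import datetime, timedelta

# Start of the maintenance window from this announcement (cluster-local time;
# the year comes from the post title, 3/23/26).
MAINTENANCE_START = datetime(2026, 3, 23, 8, 0)


def fits_before_maintenance(now: datetime, walltime: timedelta,
                            start: datetime = MAINTENANCE_START) -> bool:
    """True if a job starting at `now` with time limit `walltime` ends by `start`.

    Assumption: a job ending exactly at `start` is treated as fitting; the
    scheduler's handling of that boundary case may differ.
    """
    return now + walltime <= start


def max_walltime(now: datetime, start: datetime = MAINTENANCE_START) -> str:
    """Longest time limit that still ends by `start`, formatted as D-HH:MM:SS."""
    remaining = max(start - now, timedelta(0))
    total = int(remaining.total_seconds())
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    submit_time = datetime(2026, 3, 22, 20, 0)  # example: Sunday 8:00 PM
    print(fits_before_maintenance(submit_time, timedelta(hours=13)))
    print(max_walltime(submit_time))
```

The string returned by `max_walltime` is in the `days-hours:minutes:seconds` form that `sbatch --time` accepts, so a job submitted with a limit at or below that value should be eligible to start before the reservation, while a longer request would stay pending until the cluster returns.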


User Impact

The entire Alpha cluster, including login nodes, compute nodes, and storage, will be powered down for the duration of this work. Access will be fully restricted. Please ensure all interactive work is saved and closed well before the 8:00 AM deadline on Monday, 3/23.
