Institutional allocations and job scheduling/preemption

Created by Robert Harrison, Modified on Thu, 17 Apr at 8:11 AM by Robert Harrison

Changes to be made 2/10/24: (1) The number of nodes in the priority partition will be reduced from 12 to 6, both to give non-priority users more access and because most jobs use only 1 node. (2) Fair-share scheduling will be enabled again on all partitions for more equitable access.


Each member institution of the Empire AI consortium has an equal allocation of time, but there are many possible ways of accomplishing this. Since Alpha was loaned as a gift to the consortium, we have some additional flexibility and are using it as an opportunity to explore scheduling approaches for adoption on future systems.


The current approach is that each week users from one institution have priority access to 6 of the 13 compute nodes (the other 7 nodes remain accessible to all). 

  • Jobs from the priority institution will be started on the dedicated nodes ahead of jobs from other institutions.
  • A running job from a non-priority institution will be preempted (cancelled and requeued) if a priority job needs its resources.  This happens after a delay of a few minutes.
  • Jobs from non-priority institutions will be started when resources are available, and are scheduled using a fair-share algorithm.
  • If you don't want a preempted job to be requeued, please submit it with the --no-requeue SLURM option.
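Because a preempted job is cancelled and requeued, it starts again from the beginning unless it saves its own progress. The following is a minimal sketch of a requeue-aware job, not a site-provided tool. It assumes the job runs as a Python script, that SLURM sets the SLURM_RESTART_COUNT environment variable on a requeued job (as described in the sbatch documentation), and that a small progress file on shared storage is an acceptable checkpoint. The file name and all function names are hypothetical.

```python
import os
from pathlib import Path


def restart_count(env=os.environ) -> int:
    """Times SLURM has restarted or requeued this job (0 on the first run)."""
    return int(env.get("SLURM_RESTART_COUNT", "0"))


def load_start_step(checkpoint: Path) -> int:
    """Return the step to resume from, or 0 if no checkpoint file exists."""
    if checkpoint.exists():
        return int(checkpoint.read_text().strip())
    return 0


def save_step(checkpoint: Path, step: int) -> None:
    """Record the next step to run; write to a temp file, then rename it."""
    tmp = checkpoint.with_suffix(".tmp")
    tmp.write_text(str(step))
    tmp.replace(checkpoint)  # rename so a half-written file is never read


def run(checkpoint: Path, total_steps: int, env=os.environ) -> int:
    """Run the remaining steps and return the step the run started from."""
    start = load_start_step(checkpoint)
    if restart_count(env) > 0:
        print(f"Requeued job: resuming at step {start}")
    for step in range(start, total_steps):
        # One unit of real work (hypothetical) would go here.
        save_step(checkpoint, step + 1)
    return start


if __name__ == "__main__":
    run(Path("progress.txt"), total_steps=100)
```

Saving progress after every unit of work keeps the amount of lost computation small if the job is preempted during a non-priority week. Jobs submitted with --no-requeue are not restarted, so they do not need this pattern.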


In addition to ensuring that each institution can access its share of the resources, this schedule aims to avoid long wait times for large jobs from the priority institution while also minimizing underutilized resources.


The priority institution is selected at 00:00 on Mondays, in round-robin order, from the following list (alphabetic order):


Institution   Week starting   Week starting   Week starting   Week starting
CUNY          12/16/24        01/27/25        03/10/25        04/28/25
NYU           12/23/24        02/03/25        03/17/25        05/05/25
Columbia      12/30/24        02/10/25        03/24/25        05/12/25
Cornell       01/06/25        02/17/25        03/31/25        05/19/25
RPI           01/13/25        02/24/25        04/07/25        05/26/25
SUNY          01/20/25        03/03/25        04/14/25        04/21/25



Note: Due to an extended outage the week of 04/14/25 (SUNY priority), the following week (starting 4/21/25) will also be a SUNY priority week.
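For scripting, the rotation can be expressed as a short calculation. This is an illustrative sketch only, not an official tool. It assumes the rotation order and first week shown in the schedule, a strict weekly Monday changeover, and the one-week shift caused by the repeated SUNY week. Results for dates after the last published week (starting 05/26/25) are extrapolation and may not match future schedules. The function and constant names are hypothetical.

```python
from datetime import date

# Rotation order as listed in the schedule.
ROTATION = ["CUNY", "NYU", "Columbia", "Cornell", "RPI", "SUNY"]
FIRST_WEEK = date(2024, 12, 16)    # first Monday in the schedule (CUNY)
REPEATED_WEEK = date(2025, 4, 21)  # extra SUNY week after the outage


def priority_institution(day: date) -> str:
    """Return the institution with priority access on the given day."""
    if day < FIRST_WEEK:
        raise ValueError("date is before the first published priority week")
    weeks = (day - FIRST_WEEK).days // 7
    if day >= REPEATED_WEEK:
        # The repeated SUNY week pushes every later week back by one slot.
        weeks -= 1
    return ROTATION[weeks % len(ROTATION)]


if __name__ == "__main__":
    print(priority_institution(date.today()))
```

A job wrapper could call this before submission to decide whether to expect preemption during the current week.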


