Hi all,
There are lots of queued jobs on jb01. The node allocator is not assigning nodes to them because of the "rules": it has to hold back enough nodes to guarantee Rmin for every queue. Given the number of remaining nodes and the fact that jobs are queued in multiple queues, it allocates a share of the nodes to each queue. However, every queued job is larger than the allocation for its queue, so the allocator does not actually assign/move nodes to the queues, because the jobs would not be able to use them.
I wanted to highlight this to help everyone understand why jobs are not running.
Adjusting the R* values so they are more appropriate for the number of queues and 120 nodes will help. Smaller jobs will run if they are submitted. The node allocator log shows how many nodes are available for each queue, which gives an idea of the largest job size that could run in the current scenario.
Here's a snapshot from the last cycle:
12/22/2016 03:11:17 Nodes assigned to each queue
12/22/2016 03:11:17 bf1: 2
12/22/2016 03:11:17 bf2: 4
12/22/2016 03:11:17 bf3: 8
12/22/2016 03:11:17 bf4: 17
12/22/2016 03:11:17 fs1: 2
12/22/2016 03:11:17 fs2: 4
12/22/2016 03:11:17 fs3: 8
12/22/2016 03:11:17 fs4: 17
12/22/2016 03:11:17 ps1: 2
12/22/2016 03:11:17 ps2: 4
12/22/2016 03:11:17 ps3: 9
12/22/2016 03:11:17 ps4: 19
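If it helps, here is a rough Python sketch for checking a job size against a snapshot like the one above. It assumes every log line has the same "date time queue: count" layout shown in the snapshot, and that a job can start only if it needs no more nodes than its queue has been given. The names parse_assignments and can_run are made up for illustration; they are not part of PBS or the node allocator.

```python
import re

# Matches lines such as "12/22/2016 03:11:17 bf4: 17".
# The layout is taken from the snapshot above; other log lines are skipped.
LINE_RE = re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} (\w+): (\d+)$")


def parse_assignments(lines):
    """Return {queue_name: nodes} for the lines that look like queue counts."""
    assigned = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            assigned[match.group(1)] = int(match.group(2))
    return assigned


def can_run(assigned, queue, job_nodes):
    """Assumption: a job fits only if it needs no more nodes than its queue holds."""
    return job_nodes <= assigned.get(queue, 0)


snapshot = """\
12/22/2016 03:11:17 Nodes assigned to each queue
12/22/2016 03:11:17 bf1: 2
12/22/2016 03:11:17 bf2: 4
12/22/2016 03:11:17 bf3: 8
12/22/2016 03:11:17 bf4: 17
12/22/2016 03:11:17 fs1: 2
12/22/2016 03:11:17 fs2: 4
12/22/2016 03:11:17 fs3: 8
12/22/2016 03:11:17 fs4: 17
12/22/2016 03:11:17 ps1: 2
12/22/2016 03:11:17 ps2: 4
12/22/2016 03:11:17 ps3: 9
12/22/2016 03:11:17 ps4: 19
"""

assigned = parse_assignments(snapshot.splitlines())

# Largest job (in nodes) that could currently start in each queue.
for queue, nodes in sorted(assigned.items()):
    print(f"{queue}: up to {nodes} nodes")

# Total nodes spread across the queues in this snapshot (96).
print("total assigned:", sum(assigned.values()))

# Example check: would a 16-node job fit in ps3 right now?
print("16-node job in ps3:", can_run(assigned, "ps3", 16))
```

The same functions could be fed lines from the real log instead of the pasted snapshot, as long as the format matches.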
The node allocator log files are in /var/spool/PBS/node_allocator_log
I hope this helps