In the past, we generated these graphs by running commands from scripts and parsing their output.
The most recent versions of the cluster management software expose some of this information (and more with each release) through a RESTful interface that returns JSON-formatted results.
In addition, JavaScript graphing libraries are improving in usefulness and usability; among these are d3.js, the JavaScript InfoVis Toolkit, Chart.js, Google Charts, and others.
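As a minimal sketch of that workflow, the TypeScript below fetches JSON usage samples from a hypothetical endpoint and reshapes them into the data/labels layout a Chart.js line chart expects. The URL and the field names (`timestamp`, `coresInUse`, `coresAllocated`) are placeholders for illustration, not the actual API of our cluster management software.

```typescript
// Sketch: pull JSON usage data from a (hypothetical) REST endpoint and shape it
// for a Chart.js line chart. Endpoint path and field names are assumptions.

interface UsageSample {
  timestamp: string;       // e.g. "2014-06-01T00:00:00Z"
  coresInUse: number;
  coresAllocated: number;
}

async function buildUsageChartConfig(usageUrl: string) {
  const response = await fetch(usageUrl);
  if (!response.ok) {
    throw new Error(`usage query failed: ${response.status}`);
  }
  const samples: UsageSample[] = await response.json();

  // Reshape the samples into the labels/datasets layout Chart.js line charts use.
  return {
    type: "line",
    data: {
      labels: samples.map((s) => s.timestamp),
      datasets: [
        { label: "Cores in use", data: samples.map((s) => s.coresInUse) },
        { label: "Cores allocated", data: samples.map((s) => s.coresAllocated) },
      ],
    },
  };
}

// In a browser page this config could be handed to `new Chart(canvasContext, config)`;
// here we just log it so the sketch runs standalone (Node 18+ provides a global fetch).
buildUsageChartConfig("https://flux-stats.example/api/usage?account=example_flux")
  .then((config) => console.log(JSON.stringify(config, null, 2)))
  .catch((err) => console.error(err));
```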
Our current usage graphs (an example is shown in Figure 1) do not distinguish between the different types of Flux products (regular nodes, larger memory nodes, GPU nodes, FOE nodes, etc.), and they do not separate utilization by Flux project account or by Flux user account.
Figure 1: The current Flux usage graphs do not differentiate between Flux projects, do not offer different time scales, and are generally of limited use.

We would like to build:
- an overview page that improves on the current usage graphs
- a way to see daily, weekly, monthly, and yearly detail
- a way to see Flux products (as above) individually and stacked together
- a place for this to live (locally? MiServer? AWS? Google Sites?)
- a web site akin to http://flux-stats/?account-flux(|m|g) that provides per-Flux-project reports (one possible data shape is sketched after this list) including:
- allocated cores over time
- running cores (by user) over time
- current resources in use versus total allocated:
- x running / x allocated cores
- y running / y allocated GB RAM
- the current queue, represented as two tables:
- running jobs: job owner, # cores in use, # GB RAM in use, times (start, running, total), job name, job ID
- queued jobs: job owner, # cores req’d, # GB RAM req’d, time req’d, job name, job ID
- some heuristic advice (see the sketch after this list) along the lines of:
- if you had X more cores, then Y more jobs would start
- if you had A more GB of RAM, then B more jobs would start
- “you would save money by switching the allocations in project G from standard Flux to larger memory Flux”, etc.
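To make the per-project report items above concrete, here is one possible shape for the report JSON, written as TypeScript interfaces. Every name here (`ProjectReport`, `QueueEntry`, and the individual fields) is an assumption for illustration; the real fields would come from whatever the cluster management software's REST interface exposes.

```typescript
// Hypothetical data shape for a per-Flux-project report. Names are illustrative only.

interface TimePoint {
  timestamp: string;        // ISO 8601
  cores: number;
}

interface QueueEntry {
  owner: string;
  cores: number;            // in use for running jobs, requested for queued jobs
  ramGB: number;
  startTime?: string;       // present only for running jobs
  runningTime?: string;     // present only for running jobs
  requestedTime: string;
  jobName: string;
  jobId: string;
}

interface ProjectReport {
  account: string;                                   // e.g. "example_flux"
  allocatedCores: TimePoint[];                       // allocated cores over time
  runningCoresByUser: Record<string, TimePoint[]>;   // running cores per user over time
  current: {
    runningCores: number;
    allocatedCores: number;
    runningRamGB: number;
    allocatedRamGB: number;
  };
  runningJobs: QueueEntry[];
  queuedJobs: QueueEntry[];
}
```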
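And a rough sketch of how the "if you had X more cores, Y more jobs would start" heuristic could be computed from the queued-jobs table, reusing the hypothetical QueueEntry type above. A real scheduler also weighs priority, walltime, and backfill, so this greedy count is only a first-order estimate.

```typescript
// Walk the queued jobs in order and count how many would fit into the extra
// cores and RAM. This is a simplification of real scheduler behavior.

function jobsThatWouldStart(
  queuedJobs: QueueEntry[],   // QueueEntry as sketched above
  extraCores: number,
  extraRamGB: number
): number {
  let coresLeft = extraCores;
  let ramLeft = extraRamGB;
  let started = 0;

  for (const job of queuedJobs) {
    if (job.cores <= coresLeft && job.ramGB <= ramLeft) {
      coresLeft -= job.cores;
      ramLeft -= job.ramGB;
      started += 1;
    }
  }
  return started;
}

// Example: with 32 more cores and 128 more GB of RAM, how many queued jobs start?
// const extraJobs = jobsThatWouldStart(report.queuedJobs, 32, 128);
```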
Email us at coe-hpc-jobs@umich.edu if you are interested.