Monday, June 23, 2014

Flux Publications Search

Associate Vice President Dr. Eric Michielssen is compiling a report of work supported by Advanced Research Computing (ARC), which in turn provides Flux:
[We] are asking investigators to submit a list of journal and conference publications from the past 18 months that involved the use of the Flux computing cluster. Please submit your publications (and any associated images, if available) by emailing Communications Specialist Dan Meisler (dmeisler@umich.edu). Don’t worry about formatting — cutting and pasting from another document is fine. Also, while we encourage researchers to acknowledge Flux in publications (see http://bit.ly/citing-flux), specific mention of Flux in your papers is not necessary in responding to this request.
Request for ARC-supported publications.

If Flux was useful in your work, please help support us by passing your publications on to ARC as described above.

Thanks!
Flux Support Staff

Thursday, June 19, 2014

XSEDE Proposal Period June 15 - July 15

The next XSEDE proposal period is currently open. XSEDE is a set of national HPC resources that researchers can get time on at no cost; ARC has a page about XSEDE with more information.

Those who are interested and want to learn more can contact hpc-support@umich.edu, where you will reach Brock, the XSEDE Campus Champion at Michigan. You can also watch this video about writing a good proposal.

There are three types of XSEDE Allocations:

  1. Startup: up to 200,000 hours. These don't require a formal proposal and are awarded on an ongoing basis.
  2. Research (XRAC): any number of hours, but a formal proposal is required. These are due and awarded four times a year. The current submission period is June 15th - July 15th, and awards will start Oct 1st.
  3. Education: these are treated like Startup allocations but are meant to support course work. If you would like to use HPC in your course work, contact us and we can get XSEDE time for your class at no cost.

Wednesday, June 18, 2014

Brief Nyx/Flux Scheduler Maintenance, June 22, 22:00 EDT

The scheduler for Nyx/Flux will be offline for approximately 30 minutes starting at 10pm EDT Sunday, June 22.

** Running jobs will not be affected **
** Queued jobs will not be lost **

However, during this time the "q" commands (qsub, qdel, qpeek, etc.) will be unavailable.

During the maintenance, HPC engineers will install additional memory in nyx.engin.umich.edu, the primary scheduler for the cluster. The result should be improved performance and reliability.
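
Once the scheduler is back, you can confirm that your queued jobs survived by listing them again. A minimal check (standard Torque/Moab commands as available on the login nodes; substitute a real job ID where shown):

qstat -u $USER      # list your jobs; queued jobs should reappear with their original job IDs
checkjob <jobid>    # optional: detailed Moab status for a single job, if checkjob is available to you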

Tuesday, June 10, 2014

Nyx/Flux Summer 2014 Outage

Nyx, Flux, and their storage systems (/home, /home2, /nobackup, and /scratch) will be unavailable starting at 6pm Saturday, August 23rd, returning to service on Friday, August 29th.

During this time, the following updates are planned:
* /nobackup access will be removed from Flux, but will still be available to Nyx accounts submitting jobs to the 'cac' queue.  Flux users currently using /nobackup should transition their workflows from /nobackup to /scratch/{accountname_flux}/ (see the example copy command after this list)
* The technology-preview GPUs that were available to any Flux account will be removed from service.  Users with GPU-related jobs should transition to the new fluxg resource
* Software updates, including the operating system (minor updates provided by Red Hat), the job scheduling system, Globus Connect, and the Lustre-based /scratch filesystem
* Hardware maintenance and upgrades on some infrastructure hosts and network
* Flux InfiniBand network routing/architecture improvements
* ITS Facilities will be doing annual power maintenance on the Modular Data Center, and ITS Storage will be performing updates to the HPC Value Storage volumes
* Working with UMnet to move Flux to a 100 gigabit backbone connection if available at that time
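
For the /nobackup transition in the first item above, a one-time copy along these lines should work (a sketch only; 'youraccount_flux' and the 'uniqname' directory are placeholders for your own allocation and directory names):

# copy data from /nobackup to your Flux scratch space, preserving permissions and timestamps
rsync -av /nobackup/uniqname/ /scratch/youraccount_flux/uniqname/

# after verifying the copy, update your job scripts to point at the new path
# before the old data becomes unreachable from Flux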

We will post status updates on our Twitter feed, which can also be found on the CAEN HPC website.

Thursday, June 5, 2014

Flux Adds GPU-Attached Nodes

We're happy to announce the availability of the Flux GPU machines (fluxg). Currently, there are five 16-core machines, each with 8 NVIDIA K20X GPUs. Flux GPU allocations differ from other Flux allocations only in that the atomic unit of an allocation includes 2 processors, 8GB of memory, and 1 GPU.  For more information, see the ARC website.

Purchasing a fluxg allocation works like any other allocation: send a ticket to flux-support@umich.edu stating how many GPUs you would like.  The cost per GPU can be found on the same ARC webpage.

Assuming you have access to a Flux GPU account, you would select it with the following directives (using the LSA GPU Flux account as an example):

#PBS -A lsa_fluxg
#PBS -l qos=flux
#PBS -q fluxg
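
Putting those directives into a complete job script might look like the following. This is a minimal sketch: the resource request follows the 2-processor / 8GB / 1-GPU allocation unit described above, and the job name, walltime, module name, and program are placeholders to adapt to your own work.

#!/bin/bash
#PBS -N gpu_example
#PBS -A lsa_fluxg
#PBS -l qos=flux
#PBS -q fluxg
#PBS -l nodes=1:ppn=2:gpus=1,pmem=4gb,walltime=2:00:00
#PBS -j oe

# Load a CUDA module so the CUDA libraries and nvcc are on the path.
# Run "module avail cuda" to see which versions are installed.
module load cuda

# Run from the directory the job was submitted from.
cd $PBS_O_WORKDIR
./my_gpu_program
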
To get started, here are some links: accessing the CUDA libraries with Modules, using GPUs, NVIDIA's CUDA page, using Matlab with GPUs, and the Theano Python library.

As always, email Flux Support with any questions.

One HP SL270 with 8 Nvidia K20X Cards

Monday, June 2, 2014

Flux Flop Rate Summer 2014

Every so often we get asked, "How fast is Flux?"  We do have an idea, but for technical and historical reasons there is no Top500 run for Flux. Based on the most recent list (November 2013), Flux would easily fall within the top 200 machines in the world, with an Rpeak of 302 TFlop/s.

Computer performance is normally measured in flops, a measure of how many floating-point operations (adds, multiplies, etc.) a system can perform per second. In scientific computing we are normally interested in double-precision numbers. In general, if you use single precision (floats), performance and available memory roughly double.  This isn't true in all cases; e.g., see the NVIDIA Tesla K10 (GK104, 4,580 SP GFlops, 190 DP GFlops).
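
As a quick sanity check of the numbers in the table below, the peak double-precision rate for a purchase is just cores x clock (GHz) x DP flops per cycle. For example, for flux1 (2,052 cores at 2.67 GHz, 4 DP flops per cycle):

echo "2052 * 2.67 * 4" | bc
# 21915.36  -> the 21,915 GFlops shown for flux1 below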

So how fast is each part of Flux:

Purchase        node ct.   cores/node   cores   clock (GHz)   DP flops/cycle   DP GFlops
flux1              171         12        2052       2.67             4            21,915
flux2              169         12        2028       2.67             4            21,659
flux3              168         12        2016       2.67             4            21,531
flux4              124         16        1984       2.6              8            41,267
flux5              124         16        1984       2.6              8            41,267
flux6              144         20        2880       2.8              8            64,512
Private 2           60         20        1200       2.8              8            26,880
Private 3           12         20         240       2.8              8             5,376
Private-phi          1          8           -          -             -             8,088
Private 1          136         16        2176       2.6              8            45,261
fluxm1               5         40         200       2.27             4             1,816
fluxm2               5         32         160       2.4              8             3,072
flux-g (k20x)*       5          8           -          -             -            52,400
flux Phi*            1          8           -          -             -             8,088
TOTAL                                   16,920                                   302,644

* entering service; not included in the totals
(For the accelerator rows, the per-node count refers to accelerator cards rather than CPU cores.)

Highlights:
  • Anything marked with an asterisk (*) is entering service, and is not yet available
  • The accelerator rows (GPUs or Phis) are Private-phi, flux-g, and flux Phi
  • The 40 K20X GPUs in fluxg are faster than flux1 and flux2 combined, at 9% of the cost
  • Machines marked Private are part of FOE
  • Machines flux4 or newer support AVX instructions, which double the performance of vectorized codes (see the quick per-core comparison below)
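
To see the AVX effect per core, compare the peak per-core rate of a flux3 core (2.67 GHz, 4 DP flops/cycle) with a flux4 core (2.6 GHz, 8 DP flops/cycle):

echo "2.67 * 4" | bc    # flux3: 10.68 peak DP GFlops per core
echo "2.6 * 8" | bc     # flux4: 20.8 peak DP GFlops per core

Even with the slightly lower clock, the flux4 core roughly doubles the peak rate.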