Saturday, January 10, 2015

Flux Adds New 20-Core Nodes

Flux has been expanded to include 126 new nodes. These are IBM* NeXtScale-based systems. Each chassis holds 12 nx360 M4 nodes.

Details are:

  • 2 x 10-core 2.8 GHz Intel Xeon E5-2680v2
  • 90+ GB of 1866 MHz DDR3 RAM
  • FDR** InfiniBand
A clever eye may notice that 126 nodes does not divide evenly into chassis that hold 12 nodes each. Under FoE we operate some additional nodes, bringing the total to 156.

*IBM has since sold their x86 line to Lenovo.
** While the servers have FDR adapters, the fabric they connect to is still QDR based and will perform as such.

Using InfiniBand with the MATLAB Parallel Computing Toolbox

In High Performance Computing (HPC) a number of network types are commonly used. Among these are Ethernet, the common network found on all computer equipment, and InfiniBand, a specialty high-performance, low-latency interconnect common on commodity clusters. There are also several proprietary types and a few other less common options, but I will focus on Ethernet and InfiniBand.

Ethernet, and really its companion protocol TCP, is the most commonly supported MPI network. Almost all computer platforms support it, and it can be as simple as using your home network switch. It is ubiquitous and easy to support. Networks like InfiniBand, on the other hand, require special drivers and less common hardware, but the effort is normally worth it.

The MATLAB Parallel Computing Toolbox provides a collection of functions that allow MATLAB users to spread larger problems across multiple compute nodes. Many may not realize that MathWorks chose to implement this toolbox on top of the standard MPI routines. For ease of use, MathWorks ships MATLAB with the MPICH MPI library, and the version they ship only supports Ethernet for communication between nodes.
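For context, a multi-node PCT session looks something like the sketch below; the cluster profile name 'flux' and the pool size are placeholders to replace with your own. All communication between the workers travels over whatever MPI library, and therefore whatever network, the toolbox is configured to use.

    % Minimal sketch of a multi-node pool; 'flux' and the pool size are placeholders.
    pool = parpool('flux', 64);
    spmd
        % Each worker reports its rank; HOSTNAME may be empty on some systems.
        fprintf('Worker %d of %d on %s\n', labindex, numlabs, getenv('HOSTNAME'));
    end
    delete(pool);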

As noted, Ethernet is about the slowest common network used in parallel applications. The question is how much this impacts performance.

Mmmmm Data:

The data were generated on 12 nodes of Xeon X5650, 144 cores in total. The code was the stock MATLAB paralleldemo_backslash_bench(1.25) from MATLAB R2013b. You can find my M-code at Gist.
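If you want to reproduce a similar run, it boils down to something like the sketch below; the profile name and pool size are placeholders, and the demo is assumed to pick up the open pool.

    % Sketch of the benchmark run; 'flux' and the pool size are placeholders.
    pool = parpool('flux', 144);
    paralleldemo_backslash_bench(1.25);   % stock demo shipped with MATLAB R2013b
    delete(pool);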

The data show two trends. The first is independent of the network type: many parallel algorithms do not scale unless the amount of data each core has to work on is sufficiently large, and in this case, especially over Ethernet, peak performance is never reached. The second trend is that the network really matters: without InfiniBand, at many problem sizes over half of the performance of the nodes is lost.

How do you get MATLAB to use InfiniBand?

MathWorks does not ship an MPI library with the Parallel Computing Toolbox that can use InfiniBand by default. This is reasonable; I would be curious how large the average PCT cluster is, and how big the jobs run on the toolbox are. Lucky for us, MathWorks provides a way to introduce your own MPI library. Let me be the first to proclaim:
Thank you MathWorks for adding mpiLibConf.m as a feature. -- Brock Palen
In the above test we used Intel MPI for the InfiniBand runs and MPICH for the Ethernet runs. The choice of MPI is important. The MPI standard enforces a shared API but not a shared ABI, so the MPI library you substitute needs to match the one MATLAB is compiled against. Lucky for us they used MPICH, so any MPICH-compatible derivative should work: MVAPICH, Intel MPI, etc.
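As a concrete illustration, a minimal mpiLibConf.m might look like the sketch below; the Intel MPI path is an assumption and must point at your site's MPICH-ABI-compatible library, and the file needs to be visible on the worker path ahead of the shipped default.

    function [primaryLib, extras] = mpiLibConf
    % Minimal sketch of a custom mpiLibConf.m; the path below is an
    % assumption; point it at your MPICH-ABI-compatible MPI library.
    impiLib = '/opt/intel/impi/intel64/lib';     % hypothetical install location
    primaryLib = fullfile(impiLib, 'libmpi.so');
    extras = {};                                 % any additional shared libraries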

If you are using the MATLAB Parallel Computing Toolbox on more than one node, and your cluster has a network other than Ethernet/TCP (there are non-TCP Ethernet networks that perform very well), I highly encourage you to put in the effort to ensure you use that network.

For Flux users we have this set up, but you have to do some setup of your own before you see the benefit. Please visit the ARC MATLAB documentation, or send us a question at hpc-support@umich.edu.

Friday, January 2, 2015

Q1 XSEDE Research Proposal Deadline

The next XSEDE Research proposal deadline is January 15th. If you are looking to get more work done, or to scale to new levels, read more on the ARC site.