Computers are normally measured in Flops, a measure of how many floating-point operations (adds, multiplies, etc.) a system can perform per second. In scientific computing we are normally interested in double-precision (DP) numbers. In general, if you use single precision (floats) instead, both performance and effective memory capacity roughly double. This isn't true in all cases, though; e.g., the Nvidia Tesla K10 (GK104) delivers 4,580 SP GFlops but only 190 DP GFlops.
So how fast is each part of Flux:
Purchase | Node ct. | Cores/node | Cores | Clock (GHz) | DP flops/cycle | DP GFlops |
--- | --- | --- | --- | --- | --- | --- |
flux1 | 171 | 12 | 2052 | 2.67 | 4 | 21,915 |
flux2 | 169 | 12 | 2028 | 2.67 | 4 | 21,659 |
flux3 | 168 | 12 | 2016 | 2.67 | 4 | 21,531 |
flux4 | 124 | 16 | 1984 | 2.6 | 8 | 41,267 |
flux5 | 124 | 16 | 1984 | 2.6 | 8 | 41,267 |
flux6 | 144 | 20 | 2880 | 2.8 | 8 | 64,512 |
Private 2 | 60 | 20 | 1200 | 2.8 | 8 | 26,880 |
Private 3 | 12 | 20 | 240 | 2.8 | 8 | 5,376 |
Private-phi | 1 | 8 | | | | 8,088 |
Private 1 | 136 | 16 | 2176 | 2.6 | 8 | 45,261 |
fluxm1 | 5 | 40 | 200 | 2.27 | 4 | 1,816 |
fluxm2 | 5 | 32 | 160 | 2.4 | 8 | 3,072 |
flux-g (K20x) | 5 | 8 | | | | 52,400 |
flux Phi | 1 | 8 | | | | 8,088 |
Total | | | 16920 | | | 302,644 |
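The DP GFlops column is just total cores × clock (GHz) × DP flops per clock cycle. As a quick sanity check of that arithmetic, here is a minimal C sketch using a few rows from the table above; the per-cycle flop counts reflect SSE-era (4) versus AVX-era (8) Xeons, and nothing here is a benchmark, just the formula.

```c
/* Peak DP GFlops = cores * clock (GHz) * DP flops per cycle.
 * The values below are copied from the table above. */
#include <stdio.h>

struct purchase {
    const char *name;
    int cores;              /* total CPU cores */
    double clock_ghz;       /* core clock in GHz */
    int dp_flops_per_cycle; /* DP floating-point ops per core per cycle */
};

int main(void) {
    struct purchase p[] = {
        {"flux1", 2052, 2.67, 4}, /* pre-AVX (SSE): 4 DP flops/cycle */
        {"flux4", 1984, 2.60, 8}, /* AVX-capable: 8 DP flops/cycle */
        {"flux6", 2880, 2.80, 8},
    };
    for (size_t i = 0; i < sizeof p / sizeof p[0]; i++) {
        double gflops = p[i].cores * p[i].clock_ghz * p[i].dp_flops_per_cycle;
        printf("%-6s peak %.0f DP GFlops\n", p[i].name, gflops);
    }
    return 0;
}
```

Running this reproduces the 21,915, 41,267, and 64,512 GFlops figures in the table.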
Highlights:
- Anything in italics is entering service and is not yet available
- The highlighted elements are accelerators (GPUs or Phis)
- The 40 K20x GPUs in FluxG are faster than Flux1 and Flux2 combined (52,400 vs. 43,574 DP GFlops), at 9% of the cost
- Machines marked Private are part of FOE
- Machines flux4 or newer support the AVX instruction set, which doubles the performance of vectorized codes (see the sketch below)
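To make the last point concrete, here is a minimal sketch of the kind of loop a compiler can auto-vectorize; the function name is illustrative, and you would compile with something like `gcc -O3 -mavx`. With AVX the hardware processes four doubles per vector instruction instead of two with SSE, which is where the doubling for vectorized code comes from.

```c
/* A simple daxpy-style loop: with AVX (256-bit registers) the compiler can
 * operate on four doubles per instruction instead of two with SSE (128-bit),
 * roughly doubling throughput for code like this. */
#include <stddef.h>

void daxpy(size_t n, double a, const double *restrict x, double *restrict y) {
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i]; /* no loop-carried dependence, so it vectorizes */
}
```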