Computers are normally measured in Flops, a measure of how many floating-point operations (adds, multiplies, etc.) a system can perform per second. In scientific computing we are normally interested in double-precision (DP) numbers. In general, if you use single precision (floats) instead, both performance and effective memory capacity roughly double. This isn't true in all cases, though; e.g., the Nvidia Tesla K10 (GK104) delivers 4,580 SP GFlops but only 190 DP GFlops.
So how fast is each part of Flux:
Purchase | Node ct. | Cores/node | Cores | Clock (GHz) | DP flops/cycle | DP GFlops |
--- | --- | --- | --- | --- | --- | --- |
flux1 | 171 | 12 | 2052 | 2.67 | 4 | 21,915 |
flux2 | 169 | 12 | 2028 | 2.67 | 4 | 21,659 |
flux3 | 168 | 12 | 2016 | 2.67 | 4 | 21,531 |
flux4 | 124 | 16 | 1984 | 2.6 | 8 | 41,267 |
flux5 | 124 | 16 | 1984 | 2.6 | 8 | 41,267 |
flux6 | 144 | 20 | 2880 | 2.8 | 8 | 64,512 |
Private 2 | 60 | 20 | 1200 | 2.8 | 8 | 26,880 |
Private 3 | 12 | 20 | 240 | 2.8 | 8 | 5,376 |
Private-phi | 1 | 8 | | | | 8,088 |
Private 1 | 136 | 16 | 2176 | 2.6 | 8 | 45,261 |
fluxm1 | 5 | 40 | 200 | 2.27 | 4 | 1,816 |
fluxm2 | 5 | 32 | 160 | 2.4 | 8 | 3,072 |
flux-g (K20x) | 5 | 8 | | | | 52,400 |
flux Phi | 1 | 8 | | | | 8,088 |
Total | | | 16920 | | | 302,644 |
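The DP GFlops column is just total cores × clock (GHz) × DP flops per clock cycle. As a quick sanity check of that arithmetic, here is a minimal C sketch using a few rows from the table above; the per-cycle flop counts reflect SSE-era (4) versus AVX-era (8) Xeons, and nothing here is a benchmark, just the formula.

```c
/* Peak DP GFlops = cores * clock (GHz) * DP flops per cycle.
 * The values below are copied from the table above. */
#include <stdio.h>

struct purchase {
    const char *name;
    int cores;              /* total CPU cores */
    double clock_ghz;       /* core clock in GHz */
    int dp_flops_per_cycle; /* DP floating-point ops per core per cycle */
};

int main(void) {
    struct purchase p[] = {
        {"flux1", 2052, 2.67, 4}, /* pre-AVX (SSE): 4 DP flops/cycle */
        {"flux4", 1984, 2.60, 8}, /* AVX-capable: 8 DP flops/cycle */
        {"flux6", 2880, 2.80, 8},
    };
    for (size_t i = 0; i < sizeof p / sizeof p[0]; i++) {
        double gflops = p[i].cores * p[i].clock_ghz * p[i].dp_flops_per_cycle;
        printf("%-6s peak %.0f DP GFlops\n", p[i].name, gflops);
    }
    return 0;
}
```

Running this reproduces the 21,915, 41,267, and 64,512 GFlops figures in the table.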
Highlights:
- Anything in italics is entering service and is not yet available
- The highlighted elements are accelerators (GPUs or Phis)
- The 40 K20x GPUs in FluxG are faster than Flux1 and Flux2 combined (52,400 vs. 43,574 DP GFlops), at 9% of the cost
- Machines marked Private are part of FOE
- Machines flux4 or newer support the AVX instruction set, which doubles the performance of vectorized codes (see the sketch below)
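To make the last point concrete, here is a minimal sketch of the kind of loop a compiler can auto-vectorize; the function name is illustrative, and you would compile with something like `gcc -O3 -mavx`. With AVX the hardware processes four doubles per vector instruction instead of two with SSE, which is where the doubling for vectorized code comes from.

```c
/* A simple daxpy-style loop: with AVX (256-bit registers) the compiler can
 * operate on four doubles per instruction instead of two with SSE (128-bit),
 * roughly doubling throughput for code like this. */
#include <stddef.h>

void daxpy(size_t n, double a, const double *restrict x, double *restrict y) {
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i]; /* no loop-carried dependence, so it vectorizes */
}
```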