On Thursday, December 5th, 2013, Steve Elliott and KD Singh of Amazon will be at the Hilton Garden Inn Ann Arbor (1401 Briarwood Circle, Ann Arbor, MI 48108).
Steve and KD will be talking about both compute and storage services for researchers. This will be a technical discussion and presentation including live demos of Gluster and StarCluster, among other technologies, possibly including Elastic MapReduce or Redshift (AWS' data warehouse service).
To register, please email Steve Elliott at: elliotts@amazon.com.
Tuesday, November 26, 2013
Thursday, November 14, 2013
Flux for Research Administrators
About This Talk
This talk was given at the 2013 University of Michigan CyberInfrastructure Days conference.
Administrative and related support activities are needed for researchers to successfully plan for and use Flux in their projects.
This presentation describes Flux and how the use of Flux by researchers is planned for, acquired, and managed.
The information presented here is intended to help you to better support the proposal or other planning process and manage or track Flux use.
What is Flux in Terms of Hardware?
Flux is a rate-based service that provides a Linux-based High Performance Computing (HPC) system to the University of Michigan community.
It is a fast system. Its CPUs, internal network, and storage are all fast in their own right and are designed to be fast together.
It is a large system on campus. Flux consists of 12,260 cores.
Flux Services and Costs
| Service | Monthly rate |
|---|---|
| standard Flux | $11.72/core |
| larger memory Flux | $23.82/core |
| Flux Operating Env. | $113.25/node |
| GPU Flux | $107.10/GPU |
Planning to Use Flux
Planning for using Flux is done by estimating usage needs and considering the limits or availability of funding.
Using Flux is more flexible than purchasing hardware. Allocations can be adjusted up or down or kept the same over the duration of a project.
There are two approaches to planning for the use of Flux:
- Determine the amount of Flux resources your research will need and create a budget to meet that demand.
- Determine how much Flux time and cores you can afford on a given budget.
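As a rough, hedged illustration of how the two approaches translate into arithmetic (the $11.72/core rate comes from the table above; the budget, core count, and project length are hypothetical), a short sketch in Python:

```python
# Back-of-the-envelope Flux budgeting. The $11.72/core monthly rate is the
# published standard Flux rate; the budget, core count, and project length
# below are hypothetical examples.
STANDARD_RATE = 11.72  # dollars per core per month

def allocation_cost(cores, months, rate=STANDARD_RATE):
    """Approach 1: total cost of holding `cores` for `months`."""
    return cores * rate * months

def affordable_cores(budget, months, rate=STANDARD_RATE):
    """Approach 2: largest whole-core allocation a budget sustains for `months`."""
    return int(budget // (rate * months))

print(allocation_cost(32, 12))      # 32 cores for a year: $4500.48
print(affordable_cores(10000, 12))  # $10,000 over a year buys 71 cores
```

Either way, because allocations can be resized from month to month, the estimate only needs to be a starting point.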
Understanding Flux Allocations, Accounts, and Projects is Important
A Flux project is a collection of Flux user accounts that are associated with one or more Flux allocations.
A Flux project can have as many allocations as you wish.
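As a minimal sketch of that structure (all names and core counts below are made up for illustration), a project simply groups user accounts with one or more allocations:

```python
# Hypothetical illustration of the project / account / allocation structure
# described above; the names and core counts are invented.
from dataclasses import dataclass, field
from typing import List

@dataclass
class FluxAllocation:
    name: str       # e.g. "example_flux"
    service: str    # "standard", "larger memory", "GPU", ...
    cores: int

@dataclass
class FluxProject:
    name: str
    user_accounts: List[str] = field(default_factory=list)          # uniqnames
    allocations: List[FluxAllocation] = field(default_factory=list)  # one or more

project = FluxProject(
    name="example-project",
    user_accounts=["uniqname1", "uniqname2"],
    allocations=[
        FluxAllocation("example_flux", "standard", cores=32),
        FluxAllocation("example_fluxm", "larger memory", cores=8),
    ],
)
```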
Instructions for Research and Other Administrators During the Planning Process
Administrators should confirm, as necessary, that the grant writer has done what he or she needs to do.
Grant writers need to make sure their computing needs are suitable for the use of Flux, estimate the Flux resources that are required for the project, describe Flux in the proposal, and prepare the information needed to complete the Flux PAF Supplement form.
The administrator sends the completed Flux PAF Supplement to coe-flux-paf-review@umich.edu, and attaches the returned and approved Flux PAF Supplement to the PAF packet.
The Flux PAF Supplement
The completion and the review of the Flux PAF Supplement are important steps in the Flux planning process.
Being able to fill out the Flux PAF Supplement is a good self-check for having completed a good planning process.
The review of the Flux PAF Supplement allows the Flux operators to do some system planning. In some cases you may be asked for some clarification.
Using Flux
A Flux User Account and a Flux Allocation are needed to use Flux.
A Flux user account is a Linux login ID and password (the same as your U-M uniqname and UMICH.EDU password).
Flux user accounts and allocations may be requested using email. (See http://arc.research.umich.edu/flux/managing-a-flux-project/)
Monitoring and Tracking Flux Allocations
Historical usage data for Flux allocations is available in MReports (http://mreports.umich.edu/mreports/pages/Flux.aspx).
Instructions for accessing data in MReports are available online (http://arc.research.umich.edu/flux/managing-a-flux-project/check-my-flux-allocation/).
Billing is done monthly by ITS.
Flux allocations can be started and ended (on month boundaries). Multiple allocations may be created.
More Information is Available
Email hpc-support@umich.edu.
Look at CAEN's High Performance Computing website: http://caen.engin.umich.edu/hpc/overview.
Look at ARC's Flux website: http://arc.research.umich.edu/flux/
Wednesday, November 13, 2013
Flux: The State of the Cluster
Last Year
What is Flux in Terms of Hardware?
Flux is a rate-based service that provides a Linux-based High Performance Computing (HPC) system to the University of Michigan community.
It is a fast system. Its CPUs, internal network, and storage are all fast in their own right and are designed to be fast together.
It is a large system on campus. Flux consists of 12,260 cores.
Flux was moved to the Modular Data Center from the MACC
Moving Flux to the MDC from the MACC resulted directly in the rate decrease and an accompanying change in service level.
Before the move Flux had generator-backed electrical power and could run for days during a utility power outage.
After the move Flux has battery-backed electrical power and can run for 5 minutes during a utility power outage.
The rate for all of the Flux services was reduced on October 1, 2013
| Service | Old monthly rate | New monthly rate |
|---|---|---|
| standard Flux | $18.00/core | $11.72/core |
| larger memory Flux | $24.35/core | $23.82/core |
| Flux Operating Env. | $267.00/node | $113.25/node |
| GPU Flux | n/a | $107.10/GPU |
Flux has the newest GPUs from NVIDIA - the K20x
Flux has 40 K20x GPUs connected to 5 compute nodes.
Each GPU allocation comes with 2 compute cores and 8GB of CPU RAM.
| Specification | Value |
|---|---|
| Number and type of GPU | one Kepler GK110 |
| Peak double precision floating point perf. | 1.31 Tflops |
| Peak single precision floating point perf. | 3.95 Tflops |
| Memory bandwidth (ECC off) | 250 GB/sec |
| Memory size (GDDR5) | 6 GB |
| CUDA cores | 2688 |
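As a sanity check, the peak figures follow from the core counts and clock. Note that the 732 MHz clock and the 896 FP64 units are not in the table above; they are taken from NVIDIA's published K20X specifications, so treat them as assumptions:

```python
# Rough check of the K20X peak figures. The CUDA core count is from the
# table above; the 732 MHz clock and 896 FP64 units are assumed from
# NVIDIA's published K20X specifications.
CUDA_CORES = 2688
FP64_UNITS = 896          # assumed: 14 SMX x 64 FP64 units each
CLOCK_GHZ = 0.732         # assumed core clock
FLOPS_PER_CYCLE = 2       # one fused multiply-add per unit per cycle

sp_tflops = CUDA_CORES * FLOPS_PER_CYCLE * CLOCK_GHZ / 1000
dp_tflops = FP64_UNITS * FLOPS_PER_CYCLE * CLOCK_GHZ / 1000
print("single precision: ~%.2f Tflops" % sp_tflops)  # ~3.94 (3.95 listed)
print("double precision: ~%.2f Tflops" % dp_tflops)  # ~1.31
```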
Flux has Intel Phis as a technology preview
Flux has 8 Intel 5110P Phi co-processors connected to one compute node.
As a technology preview, there is no cost to use the Phis.
| Specification | Value |
|---|---|
| Number and type of processor | one 5110P |
| Processor clock | 1.053 GHz |
| Memory bandwidth (ECC off) | 320 GB/sec |
| Memory size (GDDR5) | 8 GB |
| Number of cores | 60 |
Flux has Hadoop as a technology preview
Flux has a Hadoop environment that offers 16TB of HDFS storage, soon expanding to more than 100TB.
The Hadoop environment is based on Apache Hadoop version 1.1.2.
The Hadoop environment is a technology preview and has no charge associated with it. For more information on using Hadoop on Flux, email hpc-support@umich.edu.
| Component | Version |
|---|---|
| Hive | v0.9.0 |
| HBase | v0.94.7 |
| Sqoop | v1.4.3 |
| Pig | v0.11.1 |
| R + rhdfs + rmr2 | v3.0.1 |
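To give a feel for how the environment gets used, here is a minimal Hadoop Streaming word count in Python. The mapper/reducer pattern is standard Hadoop Streaming; the streaming jar path and HDFS paths in the comment are assumptions about the Flux setup, not documented values.

```python
#!/usr/bin/env python
# Minimal Hadoop Streaming word count: run with "map" or "reduce" as the
# only argument. A submission might look like the following, but the jar
# and HDFS paths are assumptions about the Flux Hadoop setup:
#
#   hadoop jar /path/to/hadoop-streaming.jar \
#     -input /user/uniqname/input -output /user/uniqname/output \
#     -mapper "wordcount.py map" -reducer "wordcount.py reduce" \
#     -file wordcount.py
import sys

def mapper():
    # Emit "<word>\t1" for every word on stdin.
    for line in sys.stdin:
        for word in line.split():
            print("%s\t1" % word)

def reducer():
    # Input arrives sorted by key, so counts for a word are contiguous.
    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rsplit("\t", 1)
        if word != current and current is not None:
            print("%s\t%d" % (current, count))
            count = 0
        current = word
        count += int(n)
    if current is not None:
        print("%s\t%d" % (current, count))

if __name__ == "__main__":
    reducer() if len(sys.argv) > 1 and sys.argv[1] == "reduce" else mapper()
```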
Next Year
The initial hardware will be replaced
Flux has a three-year hardware replacement cycle; we are in the process of replacing the initial 2,000 cores.
The new nodes are likely to use Intel's 10-core Xeon CPUs, resulting in 20 cores per node.
We are planning on keeping the 4GB RAM per core ratio (about 80GB per 20-core node). Memory usage over the last three years has a profile that supports this direction.
Flux may offer an option without software
ARC is hoping to have a Flux product offering that does not include the availability, and thus cost, of most commercial software.
The "software-free" version of Flux will include
The "software-free" version of Flux will include
- the Intel compilers
- the Allinea debuggers and code profilers
- MathWorks MATLAB®
- other no- or low-cost software
A clearer software usage policy will be published
With changes in how software on Flux is presented will come guidance on appropriate use of the Flux software library.
In approximate terms, the Flux software library is
- licensed for academic research and education by faculty, students, and staff of the University of Michigan.
- not licensed for commercial work, work that yields proprietary or restricted results, or for people who are not at the University of Michigan.
Flux on Demand may be available
ARC continues to work on developing a Flux-on-Demand service.
We hope to have some variant of this available sometime in the Winter semester.
A High-Throughput Computing service will be piloted
ARC, CAEN, and ITS are working on a High-Throughput Computing service based on HTCondor (http://research.cs.wisc.edu/htcondor/).
This will allow for large quantities (1000s) of serial jobs to be run on either Windows or Linux.
ARC does not expect there to be any charge to the researchers for this.
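As a hedged sketch of what "large quantities of serial jobs" could look like in practice (the executable name, paths, and how the pilot service will actually accept submissions are assumptions; only the HTCondor submit-description syntax itself is standard):

```python
# Sketch: queue 1000 independent serial runs with HTCondor. The program
# name and paths are hypothetical; whether the pilot service exposes
# condor_submit directly is an assumption.
import os
import subprocess
import textwrap

N_JOBS = 1000

submit_description = textwrap.dedent("""\
    # %d serial runs of a hypothetical program, distinguished only by
    # the $(Process) number passed to it as an argument.
    universe   = vanilla
    executable = my_serial_program
    arguments  = $(Process)
    output     = out/run.$(Process).out
    error      = out/run.$(Process).err
    log        = sweep.log
    queue %d
    """ % (N_JOBS, N_JOBS))

os.makedirs("out", exist_ok=True)
with open("sweep.sub", "w") as f:
    f.write(submit_description)

subprocess.check_call(["condor_submit", "sweep.sub"])
```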
Advanced Research Computing at the University of Michigan
Wednesday, November 6, 2013
Nyx/Flux Winter 2013-14 Outage
Nyx, Flux, and their storage systems (/home, /home2, /nobackup, and /scratch) will be unavailable starting at 6 a.m. on Thursday, January 2nd, and will return to service on Saturday, January 4th.
During this time, CAEN will be making the following updates:
* The OS and system software will be upgraded. These should be minor updates provided by Red Hat.
* Scheduling software updates, including the resource manager (PBS/Torque), job scheduler (Moab), and associated software
* PBS-generated mails related to job data will now be from hpc-support@umich.edu, rather than the current cac-support@umich.edu
* Transitioning some compute nodes to a more reliable machine room
* Software updates to the high speed storage systems (/nobackup and /scratch)
* The College of Engineering AFS cell (/afs/engin.umich.edu) will be retired. Jobs using the Modules system should have no issue, but any PBS scripts which directly reference /afs/engin.umich.edu/ will be impacted.
* Migrating /home from a retiring filesystem to Value Storage
We will post status updates on our Twitter feed (https://twitter.com/UMCoECAC), which can also be found on the CAEN HPC website at http://cac.engin.umich.edu .