Sunday, August 2, 2015

XSEDE15 Updates

We recently returned from the XSEDE 15 representing Michigan and learning about the new resources and features coming online at XSEDE.  What follows are our notes;  there will be a live stream webinar August 6th 10:30am for one hour.  If you have questions please attend:

Webinar: ARC-TS XSEDE[15] Faculty and Student Update
Location: http://univofmichigan.adobeconnect.com/flux/
Time: August 6th 10:30am-11:30am
Login:  (Select Guest, use uniquename)

Champions Program Update

Michigan currently participates in the Campus Champions program via the staff at ARC-TS.  There are two newer programs that faculty and students might take interest in:

Domain Champions

Domain Champions are XSEDE participants like Campus Champions but sorted by field.  These Champions are available nationally to help researchers in their fields even if they do not use XSEDE resources:

Domain Champion Institution
Data Analysis Rob Kooper University of Illinois
Finance Mao Ye University of Illinois
Molecular Dynamics Tom Cheatham University of Utah
Genomics Brian Couger Oklahoma State University
Digital Humanities Virginia Kuhn University of Southern California
Digital Humanities Michael Simeone Arizona State University
Chemistry and Material Science Sudhakar Pamidighantam Indiana University

Student Champion

The Student Champions program is a way for graduate students (preferred but not required) to get more plugged into supporting researchers in research computing.  Michigan does not currently have any student champions.  If you are interested contact ARC-TS at hpc-support@umich.edu.

New Clusters and Clouds

Many of the new XSEDE resources coming online or already available are adding virtualization capability. This ability is sometimes called cloud but can have subtle differences depending what resources you are using.  If you have questions about using any of the XSEDE resources contact ARC-TS at hpc-support@umich.edu.

NSF and XSEDE have recognized that data plays a much larger role than in the past.  Many of the resources have added persistent storage options (file space that isn't purged) as well as database hosting and other resources normally not found on HPC clusters.

Wrangler 

Wrangler is a new data focused computer and is in production.  Notable features are:
  • iRODS Service Available and persistent storage options
  • Can host long running reservations for databases and other services if needed.
  • 600TB of Flash storage directly attached. This storage can change its identity to provide different service types (GPFS, Object, HDFS, etc.).  Sustains over 4.5TB/minute terasort benchmark.

Comet

Comet is a very large traditional HPC system recently in production.  It provides over 2 petaflops of compute mostly in the form of 47,000+ cpu cores.  Notable features are:
  • Host Virtual Clusters, these are customized cluster images when researchers need to make modifications that are not possible in the traditional batch hosting environment. 
  • 36 nodes with 2x Nvidia k80 GPUs (4 total GPU dies / node)
  • SSD in each local node for fast local IO.
  • 4 nodes with 1.5TB

Bridges

Bridges is a large cluster that will support more interactive work, virtual machines, and database hosting along with traditional batch HPC processing.  Bridges is not yet in production, some notable features are:
  • Nodes with 128GB, 3TB, and 12TB of RAM
  • Reservations for long running database, web server and other services
  • Planned support for Docker containers

Jetstream

Jetstream is a cloud platform for science.  It is OpenStack based and will give researchers great control over their exact computing environment.  Jetstream is not yet in production, notable features are:
  • Libraries of VM's will be created and hosted in Atmosphere, researchers will be able to contribute their own images, or use other images already configured for their needs. 
  • Split across two national sites geographically distant

Chameleon

Chameleon is an experimental environment for large-scale cloud research. Chameleon will allow researchers to not only reconfigure the images as virtual machines but as bare metal.  Chameleon is now in production, some notable features are:
  • Geographically separated OpenStack private cloud
  • Not allocated by XSEDE but allocated in a similar way

CloudLab

CloudLab is a unique environment where researchers can deploy their own cloud to do research about clouds or on clouds.  It is in production, some notable features are:
  • Able to prototype entire cloud stacks under researcher control, or bare metal
  • Geographically distributed across three sites
  • Support multiple network types (ethernet, infiniband)
  • Supports multiple CPU types (Intel/X86, ARM64)

XSEDE 2.0

XSEDE was a 5 year proposal we are wrapping up year 4.  The XSEDE proposal doesn't actually provide any of the compute resources these are their own awards and are allocated only by the XSEDE process.  A new solicitation was extended for another 5 years and a response is currently under review by NSF.  The next generation of XSEDE aims to be even more inclusive and focus more on data intensive computing.