Argonne Leadership Computing Facility

AI-Testbed

About the ALCF AI-Testbed

The AI-Testbed at the Argonne Leadership Computing Facility (ALCF) provides an infrastructure of next-generation AI-accelerator machines. It aims to help evaluate the usability and performance of machine-learning-based high-performance computing applications running on these accelerators. The goal is to better understand how to integrate these accelerators with existing and upcoming supercomputers at the facility to accelerate scientific insights.

Activities:

  • Maintain a range of hardware and software environments for AI-accelerators
  • Provide a platform to benchmark applications, programming models and ML frameworks
  • Support science application teams to port and evaluate their applications
  • Coordinate with vendors on product development

Currently, the AI-Testbed consists of Cerebras, SambaNova and Graphcore systems. We are working closely with other vendors to obtain more testbed systems.

Testbed Applications

Here are some projects that are currently running on the AI-Testbed.

Deep Learning for Cancer

Predicting cancer type and drug response using histopathology images from the National Cancer Institute’s Patient-Derived Models Repository. Image: Rick Stevens, Argonne National Laboratory

SARS-CoV-2 Spike Dynamics

AI-Driven Multiscale Simulations Illuminate Mechanisms of SARS-CoV-2 Spike Dynamics

Massive Star Simulations

A team of researchers from the University of California, Santa Barbara used ALCF-generated simulations to study the global structure of the gaseous outer layers of massive stars. Credit: Lars Bildsten, Kavli Institute for Theoretical Physics

Systems

Currently, the AI-Testbed has Cerebras, SambaNova and Graphcore systems set up and running. We are also working closely with other vendors, including Groq.

Cerebras (CS-1)

The CS-1 is a wafer-scale deep learning accelerator. Processing, memory, and communication in the CS-1 reside on the Cerebras Wafer-Scale Engine (WSE).

Graphcore

The Colossus GC2 Intelligent Processing Unit (IPU) was designed to provide state-of-the-art performance for training and inference workloads.

SambaNova

SambaNova Systems aims to develop and accelerate AI applications at scale with a Reconfigurable Dataflow Architecture™ (RDA).

Groq (Available in 2021)

The Groq Tensor Streaming Processor (TSP) provides a processing core and memory building block that achieves 250 TFLOPS in FP16 and 1 PetaOp/s in INT8.

To Be Announced

Access and Support

Accessing the ALCF AI-Testbed

To access the ALCF AI-Testbed, you need a CELS LDAP account, which you can request at https://accounts.cels.anl.gov.

You will first need to log in to one of the MCS or CELS machines and then ssh to the appropriate system.
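
The two-step login above can also be expressed as a single OpenSSH command using the `-J` (ProxyJump) option. The following is a minimal sketch, not an official procedure: it assumes the login machines permit jump-host forwarding, and the user name and host names (`alice`, `login.example.org`, `testbed.example.org`) are hypothetical placeholders to be replaced with your own CELS account and the hosts you have been given.

```python
import shlex


def build_two_hop_ssh(user, login_host, target_host):
    """Return an ssh argument list that reaches target_host via login_host.

    All three arguments are placeholders supplied by the caller; this
    function does not know any real ALCF host names.
    """
    # -J (ProxyJump) makes OpenSSH connect to the login host first and
    # then tunnel the session on to the target system.
    return ["ssh", "-J", f"{user}@{login_host}", f"{user}@{target_host}"]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    argv = build_two_hop_ssh("alice", "login.example.org", "testbed.example.org")
    # Print the command so it can be reviewed before it is run.
    print(shlex.join(argv))
```

The function only builds and prints the command, so nothing is contacted until you run the printed line yourself. Logging in to the MCS or CELS machine and then running ssh by hand, as described above, achieves the same result.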


Support

Please contact ai@alcf.anl.gov for questions or feedback.