Webinar: Neocortex CS-2 Overview

Presented on Tuesday, March 29, 2022, 2:20 – 3:30 pm (ET), by Dr. Natalia Vassilieva from Cerebras.

This webinar gives an overview of the recent upgrade to Neocortex, an NSF-funded AI supercomputer deployed at PSC that now features two Cerebras CS-2 systems. It aims to help researchers better understand the benefits of the new servers and the changes to the system.

The webinar recording can be found on the Neocortex portal.

For more information about Neocortex, explore the Neocortex project page. For questions about this webinar, please email neocortex@psc.edu.

 

Webinar Table of Contents
Welcome
Code of conduct
CS-2 overview
Cerebras Wafer-Scale Engine 2
Cerebras CS-1 and CS-2: Cluster-scale performance in a single system
The Cerebras software platform
Execution mode on CS-1 for DNNs
Execution modes on CS-2 for DNNs
Comparing execution modes
CS-2 advantages for pipelined mode
Can fit larger models. How much larger?
Can fit larger inputs. How much larger?
Faster training. How much faster?
CS-2 and weight streaming advantages
Wafer memory management
No layer partitioning
Summary
Q&A Session

 

Q&A

How do we request additional disk storage on the new CS-2 machine, and how do we identify whether the system is a CS-1 or a CS-2?

Neocortex is now CS-2 only. The storage is on the SDFlex front-end, as before.

Does CS-2 enable significantly shorter allocation wait times (due to the availability of more cores, etc.)?

If the same-sized problem can be decomposed onto more processing elements, it will run faster. However, the larger system may also allow larger models to be run that could not be run before. We don’t yet know how usage will change, so we cannot predict the effect on wait times with any certainty.

So the ability to stream weights is due to new software and more cores, not fundamental changes to the hardware?

Yes, that is right. The software stack handles how the model is mapped, and the availability of more cores and bandwidth allows us to do this with bigger models.

Are the weights/gradients synchronized in the multi-replica setting per batch (i.e., all-reduce)?

Yes, that is right.

Not sure if I understand correctly, but for multi-replica, you need to aggregate gradients and update weights iteratively, correct? If so, how often?

In a single-replica setting, updates happen every step (one pass through a batch). In a multi-replica setting, one batch is distributed across all the replicas, and each replica processes its samples sequentially.

This question has been answered live: [43:03]
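As a rough illustration of the multi-replica answers above, the sketch below splits one batch across replicas, lets each replica process its samples sequentially, combines the per-replica gradients with a stand-in for all-reduce, and updates the weight once per batch. It is plain Python on a toy one-weight linear model, not Cerebras code; every function name here is hypothetical, and the round-robin sharding is an assumption made only for the example.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input x, target y)


def sample_gradient(w: float, sample: Sample) -> float:
    """Gradient of the squared error (w*x - y)**2 with respect to w."""
    x, y = sample
    return 2.0 * (w * x - y) * x


def replica_gradient_sum(w: float, shard: List[Sample]) -> float:
    """One replica walks through its shard sample by sample."""
    total = 0.0
    for sample in shard:
        total += sample_gradient(w, sample)
    return total


def all_reduce_sum(values: List[float]) -> float:
    """Stand-in for an all-reduce: combine the per-replica sums into one."""
    return sum(values)


def split_batch(batch: List[Sample], num_replicas: int) -> List[List[Sample]]:
    """Deal the batch out round-robin so each replica gets a shard (assumed scheme)."""
    return [batch[i::num_replicas] for i in range(num_replicas)]


def train_step(w: float, batch: List[Sample], num_replicas: int, lr: float) -> float:
    """One step: shard the batch, reduce the gradients, update w once."""
    shards = split_batch(batch, num_replicas)
    partial_sums = [replica_gradient_sum(w, shard) for shard in shards]
    mean_grad = all_reduce_sum(partial_sums) / len(batch)
    return w - lr * mean_grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_single = train_step(0.0, batch, num_replicas=1, lr=0.01)
w_multi = train_step(0.0, batch, num_replicas=2, lr=0.01)
```

In this toy setting the single-replica and two-replica steps produce the same updated weight, because the reduced gradient covers the same batch either way; only the division of work differs.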

How many weights does the U-Net have here?

Around 31 million weights.

We mentioned 3D volumes here. Are we going to support more operations on these data types (video, dynamic images, etc.)?

This question has been answered live: [45:01]

Is the weight streaming mode available with PyTorch code? Can I just import my model, and ask the CS-2 to run in weight-streaming mode?

This question has been answered live: [45:51]

Why proportional to batch size? You are streaming the data in also, right?

This question has been answered live: [47:25]

How fast can weights stream onto the CS-2 chip from the external memory?

This question has been answered live: [48:25]

Is there a demo codebase and documentation we can get to utilize CS-2s?

This question has been answered live: [49:50]

Is there a way to request that certain types of models (computer vision-related) be included in the releases? I have a specific model in mind that could benefit from weight streaming.

This question has been answered live: [51:28]

Are you considering interfacing CS-2 to a quantum computer for hybrid quantum-classical processing for algorithms like Variational Quantum Eigensolver to find the ground energy state of small molecules?

This question has been answered live: [52:07]

If the model works in pipelined mode, is it likely to work with weight streaming? So I can check if all the operations are supported by the CS compiler

About the instructor

Dr. Vassilieva is the Director of Product, Machine Learning at Cerebras Systems, an innovative computer systems company dedicated to accelerating deep learning. Natalia’s main interests and expertise are in machine learning, artificial intelligence, analytics, and application-driven software-hardware optimization and co-design. Prior to Cerebras, Dr. Vassilieva was affiliated with Hewlett Packard Labs, where she led the Software and AI group from 2015 to 2019 and served as the head of HP Labs Russia from 2011 to 2015. From 2012 to 2015, Natalia also served as a part-time Associate Professor at St. Petersburg State University and a part-time lecturer at the Computer Science Center, St. Petersburg, Russia. Before joining HP Labs in 2007, Natalia worked as a Software Engineer for different IT companies in Russia from 1999 to 2007. Natalia holds a Ph.D. in Computer Science from St. Petersburg State University.

Contact us

Email us at neocortex@psc.edu

This material is based upon work supported by the National Science Foundation under Grant Number 2005597. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.