Unlocking the power of exascale supercomputers
Apr. 27, 2023.
Exascale supercomputers to transform deep neural networks and online data analysis
Leading research organizations and computer manufacturers in the U.S. are collaborating on construction of some of the world’s fastest supercomputers. These exascale systems can perform more than a billion billion (a quintillion, or 10^18) operations per second, about the number of neurons in ten million human brains.
Exascale is about 1,000 times faster and more powerful than the fastest supercomputers today, which solve problems at the lower petascale (more than one quadrillion, or 10^15, operations per second). The new exascale machines will better enable scientists and engineers to answer difficult questions about the universe, advanced healthcare, national security and more, according to the U.S. Department of Energy’s (DOE) Exascale Computing Project (ECP).
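The scale comparisons above can be checked with a few lines of arithmetic. This is a rough sketch: the figure of about 86 billion neurons per human brain is a commonly cited outside estimate, not a number from the article.

```python
# Rough scale check. NEURONS_PER_BRAIN is an assumed outside estimate
# (about 86 billion); it does not come from the article.
EXA_OPS_PER_SEC = 10**18    # a quintillion operations per second
PETA_OPS_PER_SEC = 10**15   # a quadrillion operations per second
NEURONS_PER_BRAIN = 86 * 10**9

# Exascale relative to petascale.
speedup = EXA_OPS_PER_SEC // PETA_OPS_PER_SEC

# Neurons in ten million brains, compared with 10^18.
neurons_in_ten_million_brains = 10_000_000 * NEURONS_PER_BRAIN
fraction_of_quintillion = neurons_in_ten_million_brains / EXA_OPS_PER_SEC

print(speedup)                  # 1000
print(fraction_of_quintillion)  # 0.86
```

Under that assumption, ten million brains hold a little under 10^18 neurons, so the comparison holds to within an order of magnitude.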
Supercomputer uses in deep learning
Meanwhile, ECP developers are also building the applications and software that will run on these supercomputers. They recently published a paper (open-access) highlighting their progress in using supercomputers for deep learning.
DOE’s Argonne National Laboratory, future home to the Aurora exascale system, is a key partner in the ECP. Its researchers are involved in developing applications and co-designing the software needed to enable applications to run efficiently.
Simulating “virtual universes” with ExaSky
One exciting application is simulation of “virtual universes” on demand and at high fidelities to investigate how the universe evolved from its early beginnings. One example is an ECP project known as ExaSky, which uses cosmological simulation codes.
Researchers are also adding capabilities within their codes that didn’t exist before. “We’re able to include atomic physics, gas dynamics and astrophysical effects in our simulations, making them significantly more realistic,” said Salman Habib, director of Argonne’s Computational Science division.
Online data analysis and reduction
Researchers are also co-designing the software needed to efficiently manage the data they create. Today, HPC applications already output huge amounts of data, far too much to efficiently store and analyze in its raw form. So data needs to be reduced or compressed.
One efficient solution to this is to analyze data at the same time simulations are running, a process known as online data analysis or in situ analysis.
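As a minimal sketch of the idea, the toy loop below analyzes each simulation step while its data is still in memory and keeps only a small summary instead of the raw state. The random-walk "simulation" and all function names are hypothetical stand-ins, not code from any ECP project.

```python
import random

def simulate_step(state, rng):
    """Hypothetical stand-in for a simulation step: a random-walk update."""
    return [x + rng.uniform(-1.0, 1.0) for x in state]

def run_with_in_situ_analysis(n_cells=1000, n_steps=50, seed=0):
    """Run the toy simulation, summarizing each step instead of storing it."""
    rng = random.Random(seed)
    state = [0.0] * n_cells
    summaries = []
    for step in range(n_steps):
        state = simulate_step(state, rng)
        # Analysis happens here, at the same time as the simulation runs;
        # the raw state is never written out.
        summaries.append({
            "step": step,
            "min": min(state),
            "max": max(state),
            "mean": sum(state) / n_cells,
        })
    return summaries

summaries = run_with_in_situ_analysis()
raw_values = 1000 * 50              # values a full dump would have stored
kept_values = len(summaries) * 3    # min, max and mean per step
print(raw_values, kept_values)
```

The trade-off is that anything not captured in the summaries is gone once the run ends, so the analysis has to be chosen before the simulation starts.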
An ECP center known as the Co-Design Center for Online Data Analysis and Reduction (CODAR) is developing both online data analysis methods and data reduction and compression techniques for exascale applications. CODAR works closely with a variety of application teams to develop data compression methods, which store the same information but use less space, and reduction methods, which remove data that is not relevant.
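The difference between the two approaches can be illustrated with a short sketch. It uses Python's standard zlib module as a generic lossless compressor and an arbitrary threshold as the relevance criterion; both are illustrative assumptions, not CODAR's methods.

```python
import zlib
from array import array

# Toy dataset: 10,000 double-precision samples with a repeating pattern.
samples = array("d", [float(i % 100) for i in range(10_000)])
raw = samples.tobytes()

# Compression: the same information in fewer bytes.
# Decompressing recovers the original values exactly.
compressed = zlib.compress(raw)
restored = array("d")
restored.frombytes(zlib.decompress(compressed))

# Reduction: data judged not relevant is dropped and cannot be recovered.
# The threshold of 90.0 is a hypothetical relevance criterion.
reduced = array("d", [x for x in samples if x >= 90.0])

print(len(raw), len(compressed), len(reduced.tobytes()))
```

Compression here is reversible, while reduction is a one-way decision about which values matter.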
Among the solutions the CODAR team has developed is Cheetah, a system that enables researchers to compare their co-design approaches. Another is Z-checker, a system that lets users evaluate the quality of a compression method from multiple perspectives.
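To give a sense of what evaluating a compression method "from multiple perspectives" can mean, the sketch below scores a toy lossy compressor on three common measures: compression ratio, maximum absolute error and peak signal-to-noise ratio (PSNR). The compressor, the function names and the choice of metrics are assumptions for illustration; this is not Z-checker's interface.

```python
import math
import struct
import zlib

def lossy_compress(values, error_bound):
    """Toy error-bounded compressor: quantize to a grid, then apply zlib."""
    step = 2.0 * error_bound
    codes = [round(v / step) for v in values]
    packed = struct.pack(f"<{len(codes)}i", *codes)
    return zlib.compress(packed), step

def lossy_decompress(blob, step):
    """Invert lossy_compress up to the quantization error."""
    data = zlib.decompress(blob)
    codes = struct.unpack(f"<{len(data) // 4}i", data)
    return [c * step for c in codes]

def quality_report(original, reconstructed, compressed_size):
    """Score a reconstruction from several perspectives at once."""
    errors = [abs(a - b) for a, b in zip(original, reconstructed)]
    mse = sum(e * e for e in errors) / len(errors)
    value_range = max(original) - min(original)
    if mse == 0:
        psnr = float("inf")
    else:
        psnr = 20 * math.log10(value_range) - 10 * math.log10(mse)
    return {
        # Original size assumes 8 bytes per double-precision value.
        "compression_ratio": 8 * len(original) / compressed_size,
        "max_abs_error": max(errors),
        "psnr_db": psnr,
    }

original = [math.sin(i / 50.0) for i in range(5000)]
blob, step = lossy_compress(original, error_bound=1e-3)
report = quality_report(original, lossy_decompress(blob, step), len(blob))
print(report)
```

Looking at several measures together matters because a method can achieve a high compression ratio while still distorting the values a scientist cares about.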
Deep learning and precision medicine for cancer treatment
Exascale computing also has important applications in healthcare, and the DOE, National Cancer Institute (NCI) and the National Institutes of Health (NIH) are taking advantage of it to understand cancer and the key drivers impacting outcomes. The Exascale Deep Learning Enabled Precision Medicine for Cancer project is developing a framework called CANDLE (CANcer Distributed Learning Environment) to address key research challenges in cancer and other critical healthcare areas.
CANDLE uses neural networks to find patterns in large datasets. CANDLE is being developed for three pilot projects geared toward understanding key protein interactions, predicting drug response and automating the extraction of patient information to inform treatment strategies.
Scaling up deep neural networks
These problems sit at different scales (molecular, patient and population levels), but all are supported by the same scalable deep learning environment in CANDLE. The CANDLE software suite includes a collection of deep neural networks that capture and represent the three problems, a library of code adapted for exascale-level computing and a component that orchestrates how work will be distributed across the computing system.
“The environment will allow individual researchers to scale up their use of DOE supercomputers on deep learning in a way that’s never been done before,” said Rick Stevens, Argonne associate laboratory director for Computing, Environment and Life Sciences.
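The orchestration component described above can be pictured, in miniature, as a loop that hands independent training runs to a pool of workers and collects their scores. The sketch below uses Python's standard thread pool and a made-up scoring function in place of real network training; it is an illustration of the general pattern, not the CANDLE software or its interface.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def train_and_score(config):
    """Hypothetical stand-in for training a network; returns a toy loss."""
    lr = config["learning_rate"]
    layers = config["layers"]
    return (lr - 0.01) ** 2 * 1e4 + abs(layers - 4)

# Each configuration is an independent piece of work.
search_space = [
    {"learning_rate": lr, "layers": n}
    for lr, n in product([0.001, 0.01, 0.1], [2, 4, 8])
]

# Distribute the runs across four workers and gather results in order.
with ThreadPoolExecutor(max_workers=4) as pool:
    losses = list(pool.map(train_and_score, search_space))

best_loss, best_config = min(zip(losses, search_space), key=lambda pair: pair[0])
print(best_config, best_loss)
```

On a supercomputer the workers would be compute nodes rather than threads, but the division of labor is the same: many independent runs, one component deciding where each goes.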
Applications such as these are just the beginning. Once these systems come online, the potential for new capabilities will be endless.
Citations (open-access): “Exascale applications: skin in the game,” Philosophical Transactions of the Royal Society A; and Wozniak, Justin M., et al. 2018. “CANDLE/Supervisor: A Workflow Framework for Machine Learning Applied to Cancer Research.” BMC Bioinformatics 19 (18): 491. https://doi.org/10.1186/s12859-018-2508-4.
Organizations: The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. Laboratory partners involved in ExaSky include Argonne, Los Alamos and Lawrence Berkeley National Laboratories. Collaborators working on CANDLE include Argonne, Lawrence Livermore, Los Alamos and Oak Ridge National Laboratories, NCI and the NIH.