Nvidia packed about three years’ worth of news into its GPU Technology Conference today.
Flamboyant CEO Jensen Huang’s 1-hour, 39-minute keynote covered a lot of ground, but two themes unified most of the two dozen announcements: the centrality of the GPU and Nvidia’s platform approach to everything it builds.
Most people know Nvidia as the world’s largest manufacturer of graphics processing units, or GPUs. The GPU is a chip that was first used to accelerate graphics in gaming systems. Since then, the company has steadily found new use cases for the GPU, including autonomous vehicles, artificial intelligence (AI), 3D video rendering, genomics, digital twins and many others.
The company has advanced so far from mere chip design and manufacturing that Huang summarized his company’s Omniverse development platform as “the new engine for the world’s AI infrastructure.”
Unlike all other silicon manufacturers, Nvidia delivers its product as more than just a chip. It takes a platform approach and designs complete, optimized solutions that are packaged as reference architectures for its partners to then build in volume.
This 2022 GTC keynote had many examples of this approach.
NVIDIA Hopper H100 Systems ‘transform’ AI
As noted earlier, the core of all Nvidia solutions is the GPU, and at GTC22 the company announced its new Hopper H100 chip, which uses a new architecture designed to be the engine for massively scalable AI infrastructure. The silicon features a whopping 80 billion transistors and includes a new engine designed specifically for training and inference of transformer models. For those with only a cursory knowledge of AI, a transformer is a type of neural network built on a concept called “attention.”
Attention is where every element in a piece of data tries to figure out how much it understands or needs to know about other parts of the data. Traditional neural networks look at neighboring data, whereas transformers see the entire body of information. Transformers are used extensively in natural language processing (NLP), since completing a sentence and understanding what the next word in the sentence should be – or what a pronoun would mean – is all about understanding what other words are used and what sentence structure the model might need to learn.
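To make the idea concrete, here is a minimal sketch of attention in plain Python. It is an illustration only, not how Nvidia or any production framework implements it: the function names and the three sample vectors are made up, and the sketch assumes each element serves as its own query, key and value, whereas real transformers learn separate projections for each role.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Toy self-attention: each vector becomes a weighted mix of all vectors.

    Assumption: queries, keys and values are the input vectors themselves;
    real transformers learn separate projection matrices for each.
    """
    dim = len(vectors[0])
    weights, outputs = [], []
    for query in vectors:
        # Score this element against every element, not just its neighbors.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # Blend all vectors according to the attention weights.
        outputs.append([sum(w * v[i] for w, v in zip(row, vectors))
                        for i in range(dim)])
    return weights, outputs

# Three made-up 2-dimensional "word" vectors (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, outputs = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one element attends to every element in the sequence, including itself. Because every element is scored against every other, the work grows with the square of the sequence length, which helps explain the appetite for hardware built around this workload.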
The chip alone provides massive processing capability, but multiple GPUs can be linked together using Nvidia’s NVLink interconnect, effectively creating one big GPU with 4.9 Tbps of external bandwidth.
On a related note, Huang also announced an expansion of NVLink from an internal interconnect technology to a full external switch. Previously, NVLink was used to connect GPUs inside a computing system. The new NVLink switch enables up to 256 GPUs to act as a single chip. The capability to go outside the system results in compute performance of 192 teraflops. While this might seem like a crazy amount of performance, recommender systems, natural-language processing and other AI use cases ingest massive amounts of data, and these data sets are only getting larger.
Continuing with the platform theme, Nvidia also announced new DGX H100-based systems, SuperPODs (multi-node systems) and a 576-node supercomputer. This is a turnkey system with all the software and hardware required for near-plug-and-play AI tasks. Like all of Nvidia’s systems, this is built as a reference architecture with production systems available from a wide range of system providers, including Atos, Cisco, Dell, HPE, Lenovo and other partners.
AI Enterprise 2.0 is now full stack
There may be no better example of the platform approach than how Nvidia has enabled enterprise AI. The company approaches this segment with a multi-layer model. The bottom layer is the AI infrastructure, which includes different systems such as DGX, HGX, EGX and others built on NVIDIA’s wide range of GPUs and DPUs. Above that, Nvidia provides all the necessary software and operating systems to let developers work with the hardware. This includes CUDA, TAO, RAPIDS, Triton Inference Server, TensorFlow and other software.
The top layer is a set of pre-built AI systems to address specific use cases. For example, Maxine is the company’s video AI system, Clara is designed for healthcare, Drive is for the auto industry and Isaac is for robotics simulation.
This enables enterprises and software vendors to use these components to deliver innovative new capabilities. For example, unified communications vendor Avaya uses Maxine in its Spaces product for noise removal, virtual backgrounds and other features in video meetings. Many auto manufacturers, including Jaguar and Mercedes, are using Drive as the platform for autonomous vehicles.
Huang also announced the formalization of the AI platform. Other enterprise platforms, such as VMware vSphere and Windows Server, have a continuous innovation roadmap and an ecosystem of validated software that runs on them. Nvidia currently has a program for the underlying hardware with vendors that include Lenovo, Dell and Cisco. The company is complementing this with a software program called Nvidia Accelerate, which currently has more than 100 members, including Adobe and Keysight. This should give customers the confidence that the software has been tested, vetted and optimized for the Nvidia platform.
Omniverse expands to the clouds
Nvidia’s Omniverse is a collaboration and simulation engine that obeys the laws of physics. Companies can use it to build a virtual version of an object, which cuts down training time. For example, teaching a robot to walk can be expensive and time-consuming, because one would need to build a number of scenarios such as uphill, downhill, stairs and more. With Omniverse, the training can be done virtually and the resulting data uploaded, so the robot is able to walk immediately. Another use case is building a digital twin of something like a factory, so planners can design it to scale before construction begins.
At GTC22, Nvidia announced Omniverse Cloud, which, as the name suggests, makes the simulation engine available as a streaming cloud service. Historically, one would need a high-powered system to run Omniverse. Now, as a cloud service, it can run on any computing device, even a Chromebook or tablet. This democratizes Omniverse and makes it available to anyone with an Internet connection.
The second announcement is the OVX Computing System, which is a data center-scale system for industrial digital twins. The system starts with eight NVIDIA A40 GPUs and scales up from there. Again, like all of its systems, this is a reference architecture with systems coming from Lenovo, Inspur and Supermicro.
Platform approach has created sustainable differentiation
Many industry watchers have been predicting that Nvidia’s dominance in GPUs will come to an end as more silicon manufacturers enter the market, creating competition and pricing pressure. For example, Intel has been aggressively pursuing GPUs for years, but no one has managed to make a dent in Nvidia’s business. The platform approach Nvidia has taken is common in networking, cloud and software, but among silicon makers it is unique to Nvidia. The advantages were highlighted in Huang’s keynote and have created long-term differentiation for the company.