Nvidia Corp.'s virtual GTC 2023 conference running this week has almost too many announcements to digest, and they go well beyond its signature graphics processing units. But after sifting through many pre-briefings for GTC 2023, here are what I think are the top five announcements, plus one honorable mention.
1. Accelerating chip innovation with cuLitho
One of the most impressive announcements wasn’t directly about artificial intelligence, but about accelerating the delivery of new and more powerful chips to market, which is essential if the semiconductor industry is to support continued AI innovation.
Nvidia cuLitho is a library of new algorithms that accelerate the underlying calculations of computational lithography, the compute-intensive process of creating the photomasks used to print chip designs on silicon. The company says it has demonstrated a 40-times-or-greater speedup using cuLitho, which will have two key impacts.
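To see why GPUs fit this workload, consider that computational lithography is dominated by enormous two-dimensional convolutions over mask patterns. The toy sketch below illustrates that math using CuPy; it is emphatically not the cuLitho API, which Nvidia is supplying to partners such as TSMC, ASML and Synopsys, just a picture of the kind of massively parallel arithmetic being accelerated.

```python
# Toy aerial-image convolution: the FFT-heavy math at the heart of
# computational lithography. NOT the cuLitho API; this only illustrates
# why the workload maps so well to GPUs. Requires CuPy.
import cupy as cp

N = 4096                                                 # mask grid resolution
mask = (cp.random.rand(N, N) > 0.7).astype(cp.float32)   # toy mask pattern

# Gaussian optical kernel as a stand-in for a real lens model.
x = cp.arange(N) - N // 2
kx, ky = cp.meshgrid(x, x)
kernel = cp.exp(-(kx**2 + ky**2) / (2 * 25.0**2))
kernel /= kernel.sum()

# FFT-based convolution: one of thousands of such operations per mask layer.
aerial = cp.fft.ifft2(cp.fft.fft2(mask) * cp.fft.fft2(cp.fft.ifftshift(kernel))).real
print(float(aerial.max()))
```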
First, the increased speed will improve chipmaker productivity and efficiency by enabling 500 Nvidia DGX H100 systems to do the work of 40,000 central processing unit systems, while supporting data center designs that use one-eighth the space and one-ninth the power. Further, a chip design that used to take two weeks to process can now be processed overnight.
Just as important, Nvidia says, cuLitho will support the production of new chip technologies that require 10 times the computation of today's lithography workloads, such as the curvilinear masks produced by inverse lithography and ASML's High-NA extreme ultraviolet lithography systems.
2. Democratizing access to custom AI models with DGX Cloud
AI requires a three-step process: acquiring and preparing the data, training a model on the data, and using the model for inferencing via an application. Although Nvidia DGX systems have become popular purpose-built platforms for this AI workflow, companies increasingly want to do AI training in the cloud.
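For readers newer to the space, here is a minimal sketch of those three steps, using PyTorch with synthetic data and a tiny network as stand-ins for a real enterprise workload:

```python
# Minimal sketch of the three-step AI workflow: (1) prepare data,
# (2) train a model, (3) use the model for inference.
import torch
import torch.nn as nn

# 1. Acquire and prepare the data (synthetic here).
X = torch.randn(1024, 16)
y = (X.sum(dim=1, keepdim=True) > 0).float()

# 2. Train a model on the data.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

# 3. Inference: the trained model serves an application.
with torch.no_grad():
    prob = torch.sigmoid(model(torch.randn(1, 16))).item()
print(f"positive-class probability: {prob:.3f}")
```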
At the conference, Nvidia announced DGX Cloud, a hybrid cloud training-as-a-service offering based on the DGX platform. By pushing DGX capabilities into the cloud, the offering gives enterprises immediate access to the infrastructure and software needed to train advanced models for generative AI. Nvidia DGX Cloud lets companies access their own “AI supercomputer” from a web browser, with a single view across cloud and on-premises resources.
This lets a scientist point at a dataset, say where the results should reside, and press go. Because it's cloud-based, the underlying infrastructure is automatically and efficiently scaled out, enabling organizations to start small and then grow as needed. DGX Cloud can be especially helpful to companies that need to scale out their AI infrastructure but don't have the internal resources to optimize it. The offering also supports sharing AI processing across multiple data science teams.
DGX Cloud is currently available on Oracle Cloud and via Equinix and is coming to Microsoft Azure and Google Cloud.
As part of the DGX Cloud rollout, Nvidia announced a new set of services to facilitate running three different types of custom generative AI models (a minimal language-generation sketch follows the list):
- NeMo for generating and conversing with human language.
- Picasso for text-to-image, text-to-video and text-to-3D generation.
- BioNeMo for leveraging the language of proteins to accelerate drug discovery.
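As a concrete illustration of the first category, here is the language-generation workflow with an open model served through Hugging Face Transformers standing in for the hosted NeMo service, whose own models are accessed through Nvidia's cloud APIs:

```python
# Stand-in for NeMo-style language generation: an open model via Hugging
# Face Transformers illustrates the workflow; it is not the NeMo service.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator("Generative AI will change enterprise software by",
                max_new_tokens=40, num_return_sequences=1)
print(out[0]["generated_text"])
```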
3. Partnership with Medtronic on GI Genius AI
Nvidia has been collaborating for several years with Medtronic plc, a world leader in medical devices, producing a range of innovations in the areas of robotic-assisted surgery, surgical navigation and intraoperative imaging systems.
At the conference, Nvidia announced a new partnership to build Medtronic's GI Genius AI-assisted colonoscopy system on Nvidia Holoscan, a full-stack AI infrastructure for building devices and deploying AI applications directly into clinical settings. The GI Genius system will support physicians with real-time AI imaging enhancements via a new suite of algorithms.
A key advantage of Holoscan is that it's sensor- and modality-agnostic, enabling Medtronic to incorporate the dozens of devices it has built or acquired over many years into its AI initiatives. By eliminating the need for Medtronic to build and maintain its own infrastructure, Holoscan will enable the company to maintain its rapid pace of innovation.
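For developers, Holoscan applications are composed as graphs of operators. The rough sketch below follows the pattern of Nvidia's published Holoscan Python examples; the operator parameters, port names and file paths are illustrative assumptions and may differ across SDK versions.

```python
# Rough sketch of a Holoscan pipeline, following the pattern of Nvidia's
# published Python examples (e.g., the video replayer sample). Parameters
# and paths below are illustrative, not a definitive implementation.
from holoscan.core import Application
from holoscan.operators import HolovizOp, VideoStreamReplayerOp

class EndoscopyPipeline(Application):
    def compose(self):
        # Source: replays a recorded endoscopy stream (hypothetical path).
        source = VideoStreamReplayerOp(
            self, name="replayer",
            directory="/data/endoscopy", basename="video")
        # Sink: on-screen visualization; a real AI-assisted pipeline would
        # insert an inference operator between source and sink.
        viz = HolovizOp(self, name="holoviz",
                        tensors=[{"name": "", "type": "color"}])
        # Connect the replayer's output port to Holoviz's receivers port.
        self.add_flow(source, viz, {("output", "receivers")})

if __name__ == "__main__":
    EndoscopyPipeline().run()
```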
The societal impact of AI has been debated because it threatens many jobs, but its value in healthcare cannot be overstated. Doctors miss so many things today because they can't possibly see every anomaly in every data set, but an AI can. I've talked to doctors at several hospitals that use AI, such as Massachusetts General in Boston, and they have told me AI lets them spend less time diagnosing and more time treating, as the AI does much of the heavy lifting of looking through the data.
4. Inferencing platforms for democratizing generative AI
AI inferencing is an increasingly important area of technology advancement. We are seeing ChatGPT and other engines being incorporated across a range of solutions, including Microsoft 365. Inferencing is also supporting the use of AI-powered avatars in contact centers, video analysis in manufacturing quality control, and much more.
As part of the event, Nvidia announced the availability of its DGX H100 compute platform for AI inferencing. According to Nvidia, DGX H100 offers nine times the performance of the prior DGX generation, twice the networking speed, and high-speed scalability. Microsoft announced it will be the first hyperscaler to adopt it and will provide early access to the DGX H100.
The company also announced Nvidia L4, a single-slot, low-profile accelerator for AI, video and graphics that fits in any standard server. According to Nvidia, the L4 can decode, run models and encode video 120 times faster than the best CPU platforms, simultaneously run more than 1,040 HD streams, and process graphics four times faster – and generate images two-and-a-half times faster – than the previous-generation T4. At the conference, Google Cloud announced early access to the L4.
Finally, Nvidia announced the H100 NVL for real-time large language model inferencing in standard form factors. This PCIe GPU fits in any standard server that takes a PCIe card, but the double-wide case holds two GPUs connected via an NVLink bridge, creating in effect a single GPU with 188 gigabytes of memory (94 gigabytes per GPU). Nvidia says the H100 NVL provides 12 times the throughput of the Nvidia HGX A100 and can run a model in parallel to reduce latency and time to completion.
5. BlueField-3 data processing unit
Nvidia announced the availability of its latest data processing unit, or DPU, BlueField-3, which has twice as many Arm processor cores and more accelerators than the previous-generation DPU, so it can run workloads up to eight times faster. BlueField-3 can offload, accelerate and isolate workloads across the cloud, high-performance computing, enterprise and accelerated-AI use cases.
In a related announcement, Oracle Cloud Infrastructure has selected BlueField-3 for its networking stack. The combination of OCI and BlueField-3 supports large-scale GPU clustering for dynamic and always-on capacity, with a massive performance and efficiency boost from offloading data center infrastructure tasks from CPUs.
Any server doing anything remotely computationally intensive should be using a DPU. There are certain tasks, such as encryption and packet processing, that CPU-based servers are not good at, which forces organizations to overspend significantly on servers just to achieve even modest performance. DPUs are designed to let servers perform the tasks they should be doing while the offload engine takes on the intensive infrastructure work.
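To make the overspending point concrete, the toy benchmark below measures raw CPU throughput on bulk AES-GCM encryption, one of the infrastructure tasks a DPU like BlueField-3 is built to take off the host. The benchmark design and the cryptography package are my illustration, not anything from Nvidia.

```python
# Toy measurement of the CPU cost of bulk encryption -- the kind of
# infrastructure task a DPU offloads from host CPUs.
# Requires: pip install cryptography
import os
import time
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aes = AESGCM(key)
payload = os.urandom(1 << 20)      # 1 MiB buffer
nonce = os.urandom(12)             # nonce reuse is unsafe in real crypto;
                                   # acceptable only in this throughput toy

start = time.perf_counter()
for _ in range(256):               # encrypt 256 MiB total
    aes.encrypt(nonce, payload, None)
elapsed = time.perf_counter() - start
print(f"CPU AES-GCM throughput: {256 / elapsed:.1f} MiB/s")
```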
Honorable mention: Omniverse Cloud on Microsoft Azure
A final but no less interesting announcement from Nvidia is that Omniverse Cloud is coming to Microsoft Azure. It's a cloud-native, platform-as-a-service offering for connecting, building and operating metaverse applications. I gave this an honorable mention because Omniverse is already widely deployed, with more than 300,000 downloads, according to Nvidia. Its availability in Azure will open the door to many more companies that want to leverage Microsoft for metaverse applications and require the extra levels of enterprise support and training.
As part of this announcement, Microsoft and Nvidia will partner to integrate Omniverse directly into Microsoft 365, which includes Teams, OneDrive and SharePoint, enabling users to access metaverse capabilities directly from business applications.
What makes Nvidia unique isn’t just its GPUs but rather the “full stack” approach it takes to all things AI. It provides complete, turnkey solutions that enable organizations to take advantage of things like generative AI, video analytics and robotics without having to worry about how to put all the parts together. This becomes increasingly important today as OpenAI and ChatGPT have created a new wave of interest in AI.
Earlier this year, Nvidia Chief Executive Jensen Huang called ChatGPT the “iPhone moment” for AI, but I disagree with that. Apple Inc.'s iPhone took off because it democratized the smartphone to the point where nontechnical people could use it. I believe ChatGPT created the interest, but AI needs to be simplified to make it available to everyone, and that's what Nvidia does best.