Nvidia’s DGX Cloud on OCI is now broadly available for generative AI training

Nvidia today announced the broad availability of its cloud-based AI supercomputing service, DGX Cloud. The service will give users access to thousands of virtual Nvidia GPUs on Oracle Cloud Infrastructure (OCI), along with infrastructure in the US and UK.
DGX Cloud was announced at Nvidia’s GTC conference in March. It promises to provide businesses with the infrastructure and software needed to train cutting-edge models in generative AI and other fields that use AI.
Nvidia says the purpose-built infrastructure is designed to meet generative AI’s demand for massive AI supercomputing to train large, complex models such as language models.
“Similar to how many businesses have deployed DGX SuperPOD on-premises, DGX Cloud leverages the best of compute architecture, with large clusters of dedicated DGX Cloud instances interconnected over an ultra-high-bandwidth, low-latency Nvidia network fabric,” Tony Paikeday, senior director, DGX Platforms at Nvidia, told VentureBeat.
Paikeday says DGX Cloud simplifies complex infrastructure management, delivering a user-friendly “serverless AI” experience. This allows developers to focus on running tests, building prototypes, and achieving viable models faster without the burden of infrastructure.
“Organizations that needed to develop generative AI models before the advent of DGX Cloud would have had only on-premises data center infrastructure as a viable option to tackle these large-scale workloads,” Paikeday told VentureBeat. “With DGX Cloud, any organization can now remotely access their own AI supercomputer to train large complex LLMs and other generative AI models from the convenience of a browser, without the need to operate a supercomputer data center.”
Nvidia claims that the offering allows generative AI developers to distribute massive workloads in parallel across multiple compute nodes, resulting in a two- to three-fold increase in training speed compared to traditional cloud computing.
The company also claims that DGX Cloud allows businesses to set up their own “AI centers of excellence,” supporting large teams of developers while working on multiple AI projects. These projects can benefit from a pool of supercomputing capabilities that automatically respond to AI workloads as needed.
Easing enterprise generative AI workloads through DGX Cloud
According to McKinsey, generative AI could contribute more than $4 trillion annually to the global economy by transforming proprietary business knowledge into next-generation AI applications.
The exponential growth of generative AI has pushed leading companies across industries to adopt AI as a business imperative, driving demand for accelerated computing infrastructure. Nvidia says it has optimized the architecture of DGX Cloud to meet these growing compute needs.
Nvidia’s Paikeday says developers often face challenges in preparing data, building early prototypes, and making efficient use of GPU infrastructure. DGX Cloud, powered by Nvidia Base Command Platform and Nvidia AI Enterprise, aims to solve these problems.
“Through the Nvidia Base Command Platform and Nvidia AI Enterprise, DGX Cloud enables developers to access production-ready models sooner and with less effort, thanks to accelerated data science,” Paikeday told VentureBeat.
Biotechnology company Amgen is using DGX Cloud to accelerate drug discovery. Nvidia says the company uses DGX Cloud in conjunction with Nvidia BioNeMo large language model (LLM) software and Nvidia AI Enterprise software, which includes the Nvidia RAPIDS data science acceleration libraries.
“With Nvidia DGX Cloud and Nvidia BioNeMo, our researchers can focus on deeper biology instead of dealing with AI infrastructure and ML engineering setups,” said Peter Grandsard, executive director of research, biotherapeutic discovery, Center for Accelerating Research by Digital Innovation at Amgen, in a written statement.
A healthy case study
Amgen claims it can now rapidly analyze trillions of antibody sequences through DGX Cloud, enabling rapid development of fusion proteins. The company reported that the compute and multi-node capabilities of DGX Cloud helped it achieve up to three times faster protein LLM training with BioNeMo and up to 100 times faster post-training analysis with Nvidia RAPIDS than alternative platforms.
Nvidia will offer DGX Cloud instances on a monthly rental basis. Each instance will feature eight powerful Nvidia 80GB Tensor Core GPUs, delivering 640GB of GPU memory per node.
The system uses a high-performance, low-latency fabric that allows workloads to be scaled across interconnected clusters, effectively turning multiple instances into a single large GPU. In addition, DGX Cloud is equipped with high-performance storage, providing a comprehensive solution.
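The per-node memory figure follows directly from the instance specification. As a minimal sketch of that arithmetic, the snippet below uses only the two numbers given here (eight GPUs per instance, 80GB per GPU); the function names and the four-instance example are hypothetical and chosen purely for illustration. Summing memory across instances is a simplification that ignores any overhead from the network fabric.

```python
# Figures taken from the article: 8 GPUs per instance, 80GB per GPU.
GPUS_PER_INSTANCE = 8
MEMORY_PER_GPU_GB = 80


def instance_memory_gb(gpus: int = GPUS_PER_INSTANCE,
                       memory_per_gpu_gb: int = MEMORY_PER_GPU_GB) -> int:
    """GPU memory available on a single instance (node), in GB."""
    return gpus * memory_per_gpu_gb


def cluster_memory_gb(instances: int) -> int:
    """Aggregate GPU memory across a hypothetical number of instances.

    Assumes memory simply adds up across nodes, which is an
    illustrative simplification and not a statement about the service.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return instances * instance_memory_gb()


if __name__ == "__main__":
    print(instance_memory_gb())   # 640, matching the per-node figure
    print(cluster_memory_gb(4))   # 2560 for an illustrative 4 instances
```

The four-instance count is not from Nvidia; it only shows how the per-node figure scales when several instances are clustered.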
The offering will also include Nvidia AI Enterprise, a software layer featuring more than 100 end-to-end AI frameworks and pre-trained models. The software aims to streamline data science pipelines and speed up production AI development and deployment.
“DGX Cloud not only provides massive computing resources, but also enables data scientists to work more productively and use their resources more efficiently,” said Paikeday. “They can get started right away, launch several concurrent jobs with great visibility, and run multiple generative AI programs in parallel, with support from Nvidia’s AI experts who help optimize customer code and workloads.”