Artificial intelligence is entering a new era — agentic AI — where teams of specialized agents can help people solve complex problems and automate repetitive tasks.
With custom AI agents, enterprises across industries can manufacture intelligence and achieve unprecedented productivity. These advanced AI agents require a system of multiple generative AI models optimized for agentic AI functions and capabilities. This complexity means that the need for powerful, efficient, enterprise-grade models has never been greater.
To provide a foundation for enterprise agentic AI, NVIDIA today announced the Llama Nemotron family of open large language models (LLMs). Built with Llama, the models can help developers create and deploy AI agents across a range of applications, including customer support, fraud detection, and optimization of product supply chains and inventory management.
To be effective, many AI agents need both language skills and the ability to perceive the world and respond with the appropriate action.
With new NVIDIA Cosmos Nemotron vision language models (VLMs) and NVIDIA NIM microservices for video search and summarization, developers can build agents that analyze and respond to images and video from autonomous machines, hospitals, stores and warehouses, as well as sports events, movies and news. For developers seeking to generate physics-aware videos for robotics and autonomous vehicles, NVIDIA today separately announced NVIDIA Cosmos world foundation models.
Built with Llama foundation models, one of the most popular commercially viable open-source model collections with more than 650 million downloads, NVIDIA Llama Nemotron models provide optimized building blocks for AI agent development. The family builds on NVIDIA’s commitment to developing state-of-the-art models, such as Llama 3.1 Nemotron 70B, now available through the NVIDIA API catalog.
Llama Nemotron models are pruned and trained with NVIDIA’s latest techniques and high-quality datasets for enhanced agentic capabilities. They excel at instruction following, chat, function calling, coding and math, while being size-optimized to run on a broad range of NVIDIA accelerated computing resources.
“Agentic AI is the next frontier of AI development, and delivering on this opportunity requires full-stack optimization across a system of LLMs to deliver efficient, accurate AI agents,” said Ahmad Al-Dahle, vice president and head of GenAI at Meta. “Through our collaboration with NVIDIA and our shared commitment to open models, the NVIDIA Llama Nemotron family built on Llama can help enterprises quickly create their own custom AI agents.”
Leading AI agent platform providers including SAP and ServiceNow are expected to be among the first to use the new Llama Nemotron models.
“AI agents that collaborate to solve complex tasks across multiple lines of the business will unlock a whole new level of enterprise productivity beyond today’s generative AI scenarios,” said Philipp Herzig, chief AI officer at SAP. “Through SAP’s Joule, hundreds of millions of enterprise users will interact with these agents to accomplish their goals faster than ever before. NVIDIA’s new open Llama Nemotron model family will foster the development of multiple specialized AI agents to transform business processes.”
“AI agents make it possible for organizations to achieve more with less effort, setting new standards for business transformation,” said Jeremy Barnes, vice president of platform AI at ServiceNow. “The improved performance and accuracy of NVIDIA’s open Llama Nemotron models can help build advanced AI agent services that solve complex problems across functions, in any industry.”
The NVIDIA Llama Nemotron models use NVIDIA NeMo for distillation, pruning and alignment. These techniques make the models small enough to run on a variety of computing platforms while providing high accuracy and increased model throughput.
The Llama Nemotron model family will be available as downloadable models and as NVIDIA NIM microservices that can be easily deployed on clouds, data centers, PCs and workstations. They offer enterprises industry-leading performance with reliable, secure and seamless integration into their agentic AI application workflows.
The Llama Nemotron and Cosmos Nemotron model families will come in Nano, Super and Ultra sizes to provide options for deploying AI agents at every scale.
Enterprises can also customize the models for their specific use cases and domains with NVIDIA NeMo microservices to simplify data curation, accelerate model customization and evaluation, and apply guardrails to keep responses on track.
With NVIDIA NeMo Retriever, developers can also integrate retrieval-augmented generation capabilities to connect models to their enterprise data.
Using NVIDIA Blueprints for agentic AI, enterprises can quickly create their own applications with NVIDIA’s advanced AI tools and end-to-end development expertise. NVIDIA Cosmos Nemotron, NVIDIA Llama Nemotron and NeMo Retriever power the new NVIDIA Blueprint for video search and summarization, announced separately today.
NeMo, NeMo Retriever and NVIDIA Blueprints are all available with the NVIDIA AI Enterprise software platform.
Llama Nemotron and Cosmos Nemotron models will be available soon as hosted application programming interfaces and for download on build.nvidia.com and Hugging Face. Access for development, testing and research is free for members of the NVIDIA Developer Program.
Enterprises can run Llama Nemotron and Cosmos Nemotron NIM microservices in production with the NVIDIA AI Enterprise software platform on accelerated data center and cloud infrastructure.
Sign up to get notified about Llama Nemotron and Cosmos Nemotron models, and join NVIDIA at CES.