Artificial intelligence leader Nvidia Corp. on Monday announced the Nemotron-3 family of models, data and tools. The release is further evidence of the company’s commitment to the open ecosystem, with a focus on the highly efficient, accurate and transparent models essential for building sophisticated agentic AI applications.
Nvidia executives, including Chief Executive Jensen Huang, have talked about the important role open source plays in democratizing access to AI models, tools and software, creating a “rising tide” that brings AI to everyone. The announcement underscores Nvidia’s belief that open source is the foundation of AI innovation, driving global collaboration and lowering the barrier to entry for diverse developers.
Addressing the new challenges in enterprise AI
As large language models achieve reasoning accuracy suitable for enterprise applications, new obstacles emerge. During an analyst prebriefing, Nvidia highlighted three critical challenges facing businesses today:
- The need for a system of models: There has never been, and will never be, a single model to rule them all; organizations need a choice of models to build performant AI applications. What’s required is a system of models that work together: different sizes, modalities and orchestrators delivering a multi-model approach.
- Specialization for the “last mile”: AI applications often “hit a ceiling” and must be specialized for specific domains such as healthcare, financial services or cybersecurity. This requires training models with large volumes of proprietary and expert-encoded knowledge.
- The cost of “long thinking”: More intelligent answers require extended reasoning, self-reflection and deeper deliberation, a process Nvidia calls “long thinking” or test-time compute. This significantly increases token usage and compute cost, demanding more token-efficient architectures and inference strategies.
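The cost impact of “long thinking” is easy to illustrate: inference is typically billed per token, so a model that emits a long reasoning trace before its final answer multiplies the cost of that answer. A back-of-the-envelope sketch, where the price and token counts are hypothetical figures chosen for illustration, not Nvidia’s:

```python
# Back-of-the-envelope cost of "long thinking" (test-time compute).
# The price and token counts below are hypothetical, for illustration only.

PRICE_PER_1K_OUTPUT_TOKENS = 0.002  # hypothetical $ per 1,000 output tokens

def inference_cost(answer_tokens: int, reasoning_tokens: int = 0) -> float:
    """Cost of one response; reasoning tokens are billed like any other output."""
    total_tokens = answer_tokens + reasoning_tokens
    return total_tokens / 1000 * PRICE_PER_1K_OUTPUT_TOKENS

direct = inference_cost(answer_tokens=200)                             # quick answer
deliberate = inference_cost(answer_tokens=200, reasoning_tokens=4000)  # long thinking

print(f"direct: ${direct:.4f}, long thinking: ${deliberate:.4f}, "
      f"{deliberate / direct:.0f}x more expensive")
```

The final answer is the same length in both cases; the 21-fold cost difference in this toy example comes entirely from the hidden reasoning tokens, which is why token-efficient architectures matter as reasoning gets longer.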
Nemotron-3: The most efficient open model family
Nvidia’s answer to these challenges is the Nemotron-3 family, characterized by its focus on being open, accurate and efficient. The new models use a hybrid Mamba-Transformer mixture-of-experts, or MoE, architecture. This design dramatically improves efficiency: the models run several times faster with reduced memory requirements.
The Nemotron-3 family will be rolled out in three sizes, catering to different compute needs and performance requirements:
- Nemotron-3 Nano (available now): A highly efficient and accurate model. Though it’s a 30 billion-parameter model, only 3 billion parameters are active at any time, allowing it to fit onto smaller form-factor GPUs, such as the L40S.
- Nemotron-3 Super (Q1 2026): Optimized to fit within two H100 GPUs, it will incorporate Latent MoE for even greater accuracy with the same compute footprint.
- Nemotron-3 Ultra (1H 2026): Designed to offer maximum performance and scale.
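Nvidia did not detail Nemotron-3’s routing internals on the call, but the total-versus-active parameter split that lets a 30 billion-parameter Nano run on an L40S is the defining property of any sparse MoE layer: a router scores all experts per token, but only a few actually run. A minimal NumPy sketch of top-k expert routing, with toy dimensions and expert counts that are illustrative, not Nemotron-3’s real configuration:

```python
import numpy as np

# Minimal sparse mixture-of-experts layer: a router scores every expert
# for each token, but only the top-k experts execute. All shapes here are
# toy values for illustration, not Nemotron-3's actual configuration.

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

router_w = rng.standard_normal((d_model, n_experts)) * 0.02
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    """x: (d_model,) single token. Only top_k of n_experts run."""
    logits = x @ router_w              # score every expert (cheap)
    top = np.argsort(logits)[-top_k:]  # indices of the chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the selected experts only
    # Only the selected experts' parameters touch this token:
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.standard_normal(d_model))
active_frac = top_k / n_experts
print(f"output shape {y.shape}, {active_frac:.0%} of experts active per token")
```

With two of eight experts active, only a quarter of the expert parameters are exercised on any forward pass; the same principle is how Nemotron-3 Nano keeps roughly 3 billion of its 30 billion parameters active at a time.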
Improved performance and context length
Nemotron-3 offers leading accuracy within its class, as evidenced by independent benchmarks from testing firm Artificial Analysis. In one test, Nemotron-3 Nano was shown to be the most intelligent open model in its small reasoning-model class.
Furthermore, the model’s competitive advantage comes from its focus on token efficiency and speed. On the call, Nvidia highlighted Nemotron-3’s tokens-to-intelligence ratio, which is crucial as the demand for tokens from cooperating agents increases. A significant feature of this family is the 1 million-token context length. This massive context window allows the models to perform dense, long-range reasoning at lower cost, enabling them to process full code bases, long technical specifications and multiday conversations in a single pass.
Reinforcement learning gyms: The key to specialization
A core component of the Nemotron-3 release is the use of NeMo Gym environments and data sets for reinforcement learning, or RL. This provides the exact tools and infrastructure Nvidia used to train Nemotron-3. The company is the first to release open, state-of-the-art, full reinforcement learning environments, alongside the open models, libraries and data to help developers build more accurate and capable, specialized agents.
The RL framework allows developers to pick up the environment and start generating specialized training data in hours.
The process involves:
- Training a base model (starting from the NeMo framework).
- Practicing/simulating in “gym” environments to generate answers or follow instructions.
- Scoring/verifying the answers against a reward system (human or automated).
- Updating/retraining the model with the high-quality, verified data, systematically shifting it toward higher-graded answers.
This systematic loop enables models to get better at choosing actions that earn higher rewards, like a student improving their skills through repeated, guided practice. Nvidia released 12 Gym environments targeting high-impact tasks like competitive coding, math and practical calendar scheduling.
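The generate-score-update loop described above can be sketched in a few lines. This is a generic reward-weighted search loop on a toy task (learning to remove a systematic offset from answers), not NeMo Gym’s actual API; every function name and the task itself are hypothetical:

```python
import random

# Toy version of the gym loop above: generate candidate answers, score
# them with an automated verifier, then update toward the highest-graded
# answer. All names and the task are hypothetical, not the NeMo Gym API.

random.seed(0)

def generate(bias: float, prompt: float, n: int = 8) -> list[float]:
    """The model 'practices': propose answers by perturbing its current policy."""
    return [prompt + bias + random.gauss(0, 1) for _ in range(n)]

def reward(prompt: float, answer: float) -> float:
    """Automated verifier: higher reward for answers closer to the target."""
    return -abs(answer - prompt)  # here, the correct answer is the prompt itself

def rl_round(bias: float, prompt: float, lr: float = 0.5) -> float:
    answers = generate(bias, prompt)                       # practice in the "gym"
    best = max(answers, key=lambda a: reward(prompt, a))   # score/verify
    # "Retrain": shift the policy toward the highest-graded answer.
    return bias + lr * ((best - prompt) - bias)

bias = 5.0  # the model starts out systematically wrong
for _ in range(30):
    bias = rl_round(bias, prompt=10.0)

print(f"learned offset: {bias:.2f}")  # approaches 0 as rewards improve
```

Each round the policy moves toward whichever answer the verifier graded highest, which is the same guided-practice dynamic, at toy scale, that the gym environments provide for real models.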
Nvidia’s expanded commitment to open source
The Nemotron release is backed by a substantial commitment across three areas:
Open libraries and research
Nvidia is releasing the actual code used to train Nemotron-3, ensuring full transparency. This includes the Nemotron-3 research paper detailing techniques like synthetic data generation and RL.
Nvidia researchers continue to push the boundaries of AI, with notable research including:
- Nemotron Cascade: A student model that outperformed its teacher (DeepSeek, a 500 billion- to 600 billion-parameter model) in coding, demonstrating that the scaling laws of AI continue to extend.
- RLP (Reinforcement Learning in Pretraining): A technique to train reasoning models to think for themselves earlier in the process.
High-quality data sets
Nvidia is shifting the data narrative from big data to smart and improved data curation and quality. To accomplish this, the company is releasing several new data sets:
- Pre-training data: More than 3 trillion new tokens of premium pre-training data, synthetically generated and filtered for “all signal, no noise” quality, using more than 1 million H100 hours of compute.
- Post-training data (Safe Instruction): A 13 million-sample data set using only permissively licensed model outputs, making it safe for enterprise use.
- RL data sets: 12 new reinforcement learning environments and a corpus of data sets covering 900,000 sample tasks and prompts in math, coding, games, reasoning and tool use, making Nvidia one of the few open model providers releasing both the RL data and the environments.
- Nemotron-agent safety: This provides 10,800 labeled OpenTelemetry traces from realistic, multistep, tool-using agent workflows to help evaluate and mitigate safety and security risks in agentic systems.
Enterprise blueprints and ecosystem
Nvidia is providing reference blueprints to accelerate adoption, integrating Nemotron-3 models and acceleration libraries:
- IQ Deep Researcher: For building on-premises AI research assistants for multi-step investigations.
- Video search and summarization: Turning hours of footage into seconds of insight.
- Enterprise RAG: The most optimized, enterprise-ready retrieval-augmented generation blueprint, accelerating every step of the retrieval pipeline.
The Nemotron ecosystem is broad, with day-zero support for Nemotron-3 on platforms such as Amazon Bedrock. Key partners such as CrowdStrike Holdings Inc. and ServiceNow Inc. are actively using Nemotron data and tools, with ServiceNow noting that 15% of the pretraining data for their Apriel 1.6 Thinker model came from an Nvidia Nemotron data set.
The industry is winding down the hype phase of AI, and we should start to see more production use cases. The Nemotron-3 family is well-suited for this era: it provides a performant and efficient open-source foundation for the next generation of agentic AI, reinforcing Nvidia’s deep commitment to democratizing AI innovation.