Nvidia Unveils New Open AI Model That Rivals The World’s Smartest Systems

By 813 Staff

Nvidia Unveils New Open AI Model That Rivals The World’s Smartest Systems

The latest development in AI and tech shows Nvidia Unveils New Open AI Model That Rivals The World’s Smartest Systems, according to NVIDIA (@nvidia) (in the last 24 hours).

Source: https://x.com/nvidia/status/2062522316672667770

The urgency this morning comes from a single post by NVIDIA (@nvidia), issued at 9:14 AM Pacific yesterday, that has already prompted frantic refreshes in AI labs from Palo Alto to Shenzhen. The company quietly dropped a link to “Nemotron 3 Ultra,” calling it a “frontier smart open model.” This is not a research paper drop. Internal documents show that NVIDIA has been quietly benchmarking Nemotron 3 Ultra against both Llama 4 and GPT-5-class models since early May, and engineers close to the project say the architecture leverages a new sparse mixture-of-experts design that allows it to route inference through only 15% of its total parameters for most queries. The result, according to early internal benchmarks, is a model that matches or exceeds GPT-5 on coding and reasoning tasks while running on significantly less hardware.

The rollout, however, has been anything but smooth within NVIDIA’s own partner network. Multiple sources report that access to the model weights was initially limited to a handful of cloud providers, but a leak on Hugging Face last night suggests builds were already circulating in private Discord servers. NVIDIA has not confirmed any leak, but its silence is notable given the company’s usual speed in addressing security concerns. The company’s official post promises “frontier smart open” capability, but what remains uncertain is the licensing terms. Historically, “open” from NVIDIA has meant source-available with commercial restrictions, and developers are waiting for a full license file to determine if this is truly usable for production fine-tuning or just another taster.

Why this matters: Nemotron 3 Ultra could reset the cost equation for enterprise AI deployments. If the benchmark claims hold under independent testing, companies currently renting clusters of H100s or B200s could cut inference costs by roughly 40% without sacrificing quality. The timing is critical—OpenAI and Anthropic are both expected to ship new models within weeks, and Google DeepMind is rumored to be preparing a Gemini update. NVIDIA, by releasing this now, is staking a claim that the open-weight ecosystem can compete on frontier performance, not just on commodity tasks.

What happens next: expect an official developer blog and model card within 72 hours, along with a curated set of quantization recipes for edge deployment. The larger question—whether Nemotron 3 Ultra will be adopted as widely as Llama—depends entirely on that license. Full verification will take another week.

Source: https://x.com/nvidia/status/2062522316672667770

Related Stories

More Technology →