OpenAI has officially launched its first open-weight models since GPT‑2 with the release of gpt‑oss‑120b and gpt‑oss‑20b, delivering high-end reasoning and tool-use AI that developers can run locally under the Apache 2.0 license. These models challenge the proprietary status quo by offering performance on par with OpenAI's own proprietary models, such as o4‑mini and o3‑mini, while empowering users to customize them, run them offline, and deploy them on private infrastructure.

GPT-OSS: High Performance, Low Cost, Full Control


The gpt‑oss‑120b, a 117 billion‑parameter Transformer using a mixture‑of‑experts (MoE) architecture, delivers near‑parity with OpenAI’s o4‑mini on core reasoning benchmarks like Codeforces, MMLU, TauBench agentic tasks, and HealthBench—while running efficiently on a single 80 GB GPU.

Meanwhile, gpt‑oss‑20b, at 21 billion parameters, matches or outperforms o3‑mini and runs on consumer hardware with just 16 GB of memory, making it usable on laptops or desktops.

Both models support chain‑of‑thought (CoT) reasoning, few‑shot function calling, structured output, and agentic tool use—letting them browse the web, reason mathematically, call APIs, or execute Python code as part of intelligent workflows.
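Agentic tool use generally follows a simple loop: the model emits a structured tool call, the host executes it, and the result is fed back so the model can continue reasoning. A minimal sketch of that loop in plain Python, with a stubbed model standing in for gpt‑oss (the stub, the tool name, and the message format here are illustrative assumptions, not OpenAI's actual API):

```python
import json

# Illustrative tool the "model" may call; the name is hypothetical.
def get_weather(city: str) -> str:
    return f"22°C and sunny in {city}"

TOOLS = {"get_weather": get_weather}

def stub_model(messages):
    """Stand-in for gpt-oss: first turn requests a tool, second answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "get_weather",
                              "arguments": json.dumps({"city": "Berlin"})}}
    tool_result = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"content": f"The forecast says: {tool_result}."}

def run_agent(user_prompt, model=stub_model, max_steps=5):
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        reply = model(messages)
        call = reply.get("tool_call")
        if call is None:                      # final answer: stop looping
            return reply["content"]
        args = json.loads(call["arguments"])  # execute the requested tool
        result = TOOLS[call["name"]](**args)
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not produce a final answer")

print(run_agent("What's the weather in Berlin?"))
```

In a real deployment the stub would be replaced by an actual model call, but the control flow (call, execute, append, repeat) is the same.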

Get Smarter Every Morning with AI BriefNow!

Join 5,000+ readers getting the latest AI, tech, and innovation news in under 5 minutes — every morning.

GPT-OSS: Built for Real‑World Use & Safety


These models are not experimental toys—they’re built for efficient deployment on consumer hardware and enterprise systems alike. The 120B model is optimized for a single 80 GB accelerator, while the 20B model is tailor‑made for edge or on‑device inference.

OpenAI has released the weights under Apache 2.0, granting developers full flexibility to fine‑tune, redistribute, or commercialize the models without vendor lock‑in.

Safety was a top priority: OpenAI conducted external evaluations using its Preparedness Framework, including internal adversarial fine‑tuning tests to simulate misuse scenarios. The models performed comparably to proprietary models in safety benchmarks, and the methodology was peer‑reviewed by experts.

OpenAI also launched a Red Teaming Challenge with a $500,000 prize to crowdsource novel safety issues and publish findings along with an evaluation dataset for the developer community.


GPT-OSS: Architecture & Deployment

Both models leverage a mixture‑of‑experts (MoE) architecture, activating only a subset of parameters per token: gpt‑oss‑120b activates roughly 5.1B parameters per token and gpt‑oss‑20b about 3.6B, a design choice that balances capability and efficiency.
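The efficiency gain comes from a learned router that sends each token to only a few experts, so most of the network stays idle on any given token. A toy sketch of top‑k expert routing (the expert count, k, and logit values are illustrative, not the actual gpt‑oss configuration):

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_token(router_logits, k=2):
    """Pick the top-k experts for one token and renormalize their weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

# One token's router scores over 8 experts (illustrative values).
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]
chosen = route_token(logits, k=2)
# Only the chosen experts run their feed-forward blocks for this token;
# the token's output is their weighted combination.
```

This is why the "active parameters" figure is so much smaller than the total parameter count: per token, only the selected experts' weights do any work.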

They also support 128K token context windows, grouped multi‑query attention with group size 8, alternating dense and local sparse attention, rotary positional embeddings (RoPE), and efficient quantization (MXFP4).
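Back-of-envelope arithmetic shows why MXFP4 quantization matters for the single-80 GB-GPU claim. MXFP4 stores weights in 4 bits with shared per-block scales; assuming roughly 4.25 bits per parameter for weight storage (an approximation — block size and scale overhead vary, and activations and KV cache need memory on top):

```python
def weight_memory_gb(params, bits_per_param):
    """Approximate weight storage in GB (10**9 bytes)."""
    return params * bits_per_param / 8 / 1e9

PARAMS_120B = 117e9  # gpt-oss-120b total parameter count

fp16 = weight_memory_gb(PARAMS_120B, 16)     # ~234 GB: far beyond one GPU
mxfp4 = weight_memory_gb(PARAMS_120B, 4.25)  # ~62 GB: fits under 80 GB
```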

These innovations mean the gpt‑oss models can scale efficiently and remain usable on modest hardware while preserving high reasoning capacity.

Use Cases & Deployment Partners

OpenAI is already working with early partners like Microsoft (Azure AI Foundry and Windows AI Foundry), Databricks, AWS Bedrock, Hugging Face, vLLM, llama.cpp, Ollama, and others to make deployment seamless across cloud, local, and edge environments.

NVIDIA has also optimized the models for RTX AI PCs and GeForce RTX 5090 GPUs, delivering inference speeds up to 256 tokens/second locally.

Developers can run the models via Hugging Face Transformers, utilize the open-sourced tokenizer (o200k_harmony), and integrate with agentic workflows using the Responses API or open tools like openai‑harmony and LangChain.
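Function calling against these models uses the JSON-schema tool definitions familiar from OpenAI's APIs: the model is shown a schema and can emit a call with matching arguments. A minimal example of declaring one tool in the Responses-API style (the function name and parameters are hypothetical):

```python
import json

# Hypothetical tool definition: the model sees this schema and can emit
# a get_weather call whose arguments validate against "parameters".
weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
        },
        "required": ["city"],
    },
}

print(json.dumps(weather_tool, indent=2))
```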

What’s Next: Scaling Open AI Innovation

With the release of gpt‑oss‑120b and 20b, OpenAI is doubling down on the promise of democratized, transparent AI. By making high-end reasoning models open-weight and deployable, they reduce reliance on closed APIs and bring advanced capabilities to developers, enterprises, governments, and researchers who lack the budget or infrastructure for proprietary systems.

As CEO Sam Altman framed it: it’s a “triumph of technology” designed to get AI into the hands of as many people as possible—while pushing ethical and safety standards forward in the open-source AI ecosystem.
