Google Releases Gemma 4: Frontier AI That Runs on a Single GPU
On April 2, 2026, Google DeepMind launched Gemma 4, a new family of open-weight AI models aimed at making capable AI more accessible.
Gemma 4 comes in four sizes: E2B and E4B optimized for mobile and IoT devices, plus 26B MoE and 31B Dense for more demanding tasks. The largest model ranks among the top open models globally and can run on a single GPU.
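To put the single-GPU claim in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights of a 31-billion-parameter model would need at different precisions. The helper name `weight_memory_gb` and the chosen bit widths are illustrative assumptions, not figures from the announcement, and the estimate covers the weights only, ignoring the KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Illustrative precisions (assumed): 16-bit, 8-bit, and 4-bit weights.
for bits in (16, 8, 4):
    print(f"31B dense at {bits}-bit: ~{weight_memory_gb(31, bits):.1f} GB")
```

Under these assumptions the weights take roughly 62 GB at 16-bit, 31 GB at 8-bit, and 15.5 GB at 4-bit, which suggests why quantization matters when fitting a model of this size onto one accelerator.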
A key part of the announcement is the license: Gemma 4 launches under Apache 2.0, which gives far greater commercial freedom than previous Gemma versions. The models are built on the same research as Gemini 3 and natively support function calling, structured JSON output, and system instructions for agentic use.
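As a rough illustration of what function calling with structured JSON output means for an application, the sketch below parses a tool call emitted as JSON and routes it to a local Python function. The JSON shape (`name` and `arguments`), the `get_weather` tool, and the `dispatch` helper are all hypothetical and are not Gemma 4's documented format; the model's own documentation defines the actual schema.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Registry mapping tool names to local Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch(model_output: str) -> str:
    """Parse a JSON tool call and invoke the matching local function."""
    call = json.loads(model_output)
    func = TOOLS[call["name"]]  # raises KeyError for an unknown tool
    return func(**call["arguments"])


# Assumed example of what a model might emit as a structured tool call.
example_output = '{"name": "get_weather", "arguments": {"city": "Oslo"}}'
print(dispatch(example_output))
```

In an agentic loop, the value returned by the tool would typically be sent back to the model as context for its next response.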
The smallest models, E2B and E4B, are designed specifically for offline use: they run directly on edge devices without internet connectivity, so responses come with near-zero latency because no network round trip is involved.
For CIOs and developers, this makes it possible to run frontier-level AI in-house, without depending on cloud infrastructure. Because the license is Apache 2.0, the models can be used freely in commercial products.