Google Launches Gemini 3.1 Flash-Lite — Fastest and Most Affordable Model in the Gemini 3 Series
Google has introduced Gemini 3.1 Flash-Lite, its most cost-efficient AI model to date. Built for high-volume developer workloads at scale, the model is rolling out in preview via the Gemini API in Google AI Studio and for enterprises via Vertex AI.
Lightning Fast, Extremely Affordable
Gemini 3.1 Flash-Lite is priced at just $0.25 per million input tokens and $1.50 per million output tokens. According to Artificial Analysis benchmarks, it delivers a 2.5x faster time to first answer token and 45% higher output speed than Gemini 2.5 Flash.
Despite the aggressive pricing, the model maintains quality similar to or better than 2.5 Flash across most tasks.
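At these rates, per-request cost is straightforward to estimate: input tokens at $0.25 per million plus output tokens at $1.50 per million. A minimal sketch (the helper name and token counts below are illustrative, not from the announcement):

```python
# Illustrative cost estimator using the announced preview pricing:
# $0.25 per 1M input tokens, $1.50 per 1M output tokens.

INPUT_RATE_PER_M = 0.25
OUTPUT_RATE_PER_M = 1.50

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# Example: a moderation call with a 2,000-token prompt and a 100-token verdict.
print(f"${estimate_cost(2_000, 100):.6f}")              # → $0.000650 per request
print(f"${estimate_cost(2_000, 100) * 1_000_000:,.2f}")  # → $650.00 for 1M such requests/day
```

At this price point, even a workload of a million moderation calls per day stays in the hundreds of dollars, which is the economics the model is aimed at.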
Built for Scale
The model is optimized for use cases including:
- Large-scale content moderation
- Translation and localization
- User interface generation
- Simulation and synthesis
Availability
Gemini 3.1 Flash-Lite is already deployed with several enterprise customers. It's available in preview via Google AI Studio and Vertex AI, with broader rollout expected soon.
For CIOs evaluating AI infrastructure, Flash-Lite represents a compelling option where latency and cost are critical — particularly for high-frequency real-time processing workloads.
📬 Did you like this?
AI news for leaders. Curated by a CIO who builds it himself. Daily in your inbox.