The latest update to Google’s AI lineup is aimed at delivering stronger reasoning while cutting response time and cost for users and developers.
On Wednesday, December 17, Google introduced Gemini 3 Flash, a new model in its Gemini AI family. The model is built to offer advanced reasoning and multimodal capabilities while significantly reducing latency and cost. It is designed for real-time use cases such as coding, agent-based workflows and complex analysis.
Gemini 3 Flash combines the deep reasoning abilities of larger frontier models with the speed and efficiency of the Flash series. Google said the model delivers competitive benchmark performance while processing tasks faster than earlier versions and at a lower cost per token.
“With Gemini 3, we introduced frontier performance across complex reasoning, multimodal and vision understanding and agentic and vibe coding tasks. Gemini 3 Flash retains this foundation, combining Gemini 3’s Pro-grade reasoning with Flash-level latency, efficiency and cost. It not only enables everyday tasks with improved reasoning, but also is our most impressive model for agentic workflows,” the company said in its blog.
The model supports multimodal inputs, allowing it to reason across text, images, audio and video prompts. Google said this makes it suitable for interactive experiences such as real-time video analysis, visual question answering and large-scale automated data extraction.
On performance, Gemini 3 Flash shows that speed does not come at the cost of intelligence. Google said the model scored 90.4 percent on the GPQA Diamond benchmark and 33.7 percent on Humanity’s Last Exam without tools. These results place it alongside larger frontier models and ahead of Gemini 2.5 Pro on several benchmarks. In multimodal reasoning, it achieved 81.2 percent on MMMU Pro, matching Gemini 3 Pro.
Efficiency is another key focus. The model adjusts its level of reasoning based on task complexity, using more effort for harder problems while staying lightweight for everyday tasks. Google said it uses around 30 percent fewer tokens on average than Gemini 2.5 Pro for typical workloads.
Speed remains its standout feature. Based on third-party benchmarks, Gemini 3 Flash is up to three times faster than Gemini 2.5 Pro while delivering higher overall performance. Pricing reflects this, with input tokens priced at $0.50 per million and output tokens at $3 per million.
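As a back-of-the-envelope illustration of those rates (an illustrative calculation, not a figure quoted by Google, with assumed token counts), a quick sketch of the per-request cost looks like this:

```python
# Illustrative cost estimate at the quoted rates: $0.50 per million input tokens
# and $3.00 per million output tokens. The token counts below are assumptions.
INPUT_PRICE_PER_MILLION = 0.50
OUTPUT_PRICE_PER_MILLION = 3.00

input_tokens = 100_000   # e.g. a long document passed as context (assumed)
output_tokens = 10_000   # e.g. a detailed response (assumed)

cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_MILLION \
     + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_MILLION
print(f"Estimated cost: ${cost:.4f}")  # -> roughly $0.08 for this request
```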
Gemini 3 Flash is rolling out globally as the default model in the Gemini App and AI Mode in Google Search at no extra cost. Developers and enterprises can access it through the Gemini API via Google AI Studio, Vertex AI, Gemini CLI and Android Studio.
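For developers, a minimal sketch of what calling the model through the Gemini API could look like with Google's google-genai Python SDK is shown below; the model identifier "gemini-3-flash" is an assumption based on this announcement, not a confirmed string from Google's documentation.

```python
# Minimal sketch using the google-genai Python SDK (pip install google-genai).
# The model name "gemini-3-flash" is assumed from this announcement; check
# Google AI Studio for the exact identifier available to your account.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # key from Google AI Studio

response = client.models.generate_content(
    model="gemini-3-flash",  # assumed model identifier
    contents="Summarize the trade-offs between latency and reasoning depth in LLMs.",
)
print(response.text)
```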



