Nine months after the launch of Gemini 1.5, Google announced Gemini 2.0, the next major version of its large language model (LLM) family. The first model in the family, Gemini 2.0 Flash, is available as an experimental model in Google AI Studio and Vertex AI.
Gemini 2.0 Flash offers "enhanced performance at similarly fast response times" and outperforms 1.5 Flash at "twice the speed." In addition to multimodal input such as images, text, video, and audio, the new LLM supports multimodal output: natively generated images mixed with text and steerable text-to-speech multilingual audio.
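As a rough illustration of multimodal input, a mixed image-and-text prompt can be sent through the Google AI Python SDK. The sketch below assumes the google-generativeai package, the experimental model name gemini-2.0-flash-exp, an API key in an environment variable, and a local image file named diagram.png; any of these may differ in practice.

```python
# Minimal sketch: multimodal (image + text) input with the
# google-generativeai SDK. Model name and file path are assumptions.
import os

import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

model = genai.GenerativeModel("gemini-2.0-flash-exp")

# generate_content accepts a mixed list of parts: here, an image
# followed by a text prompt about that image.
image = Image.open("diagram.png")
response = model.generate_content([image, "Summarize what this diagram shows."])
print(response.text)
```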
2.0 Flash can also natively use tools, including Google Search, code execution, and third-party user-defined functions. Google is also releasing its Multimodal Live API, which supports real-time audio and video streaming input, to developers. A chat-optimized version of 2.0 Flash is available in Gemini on desktop and mobile browsers, and Google says a version will come to the Gemini mobile app soon.
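To make the user-defined function support concrete, the following sketch uses the google-generativeai SDK's automatic function-calling mode, in which the SDK builds a function declaration from a Python function's signature and docstring. Here get_order_status is a hypothetical helper invented purely for illustration, and the model name is again an assumption.

```python
# Minimal sketch: third-party (user-defined) function calling with the
# google-generativeai SDK. get_order_status is a hypothetical example.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

def get_order_status(order_id: str) -> str:
    """Look up the shipping status of an order by its ID."""
    # Stub: a real implementation would query a backend service.
    return f"Order {order_id} has shipped."

# The SDK derives the tool declaration from the function's
# type annotations and docstring.
model = genai.GenerativeModel("gemini-2.0-flash-exp", tools=[get_order_status])

# With automatic function calling enabled, the SDK invokes the tool and
# feeds its result back to the model before returning the final answer.
chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message("Where is order A-1234?")
print(response.text)
```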
Google's Project Astra research prototype has also been updated to Gemini 2.0 and now features improved dialogue and reasoning, native support for tools like Google Search, Lens, and Maps, and up to 10 minutes of in-session memory.
Project Mariner, another research prototype built on 2.0, can follow complex instructions and reason across information in the browser screen, "including pixels and web elements like text, code, images and forms," and then use that information via an experimental Chrome extension to complete tasks for the user.
The third prototype, Jules, is an experimental AI code assistant that can be integrated directly into GitHub workflows. It applies reasoning and logic to tackle coding challenges and develop a plan to solve them under developer supervision.
Google says it has also built AI agents "using Gemini 2.0 that can help you navigate the virtual world of video games. It can reason about the game based solely on the action on the screen, and offer up suggestions for what to do next in real time conversation."