New Delhi, Dec 16: Google has surprised the global community with the introduction of Gemini, which it describes as its largest and most capable AI model. With Gemini, Google enters a competitive field that includes OpenAI, Meta, and Anthropic, all of which have made significant advances in AI capabilities.
Google has unveiled three variants of Gemini: Ultra, Pro, and Nano.
The Pro version has already been integrated into Google Bard, while the Nano model is tailored for on-device use in the Google Pixel 8 Pro. This lineup puts Google in a position to challenge the dominance of OpenAI's ChatGPT. Gemini's most notable feature is its multimodal capability: it can process several forms of information, including text, video, images, and code. The unveiling made Google's intent to compete with its major rivals clear, most visibly in a benchmark comparison sheet in which Gemini outscored GPT-4, OpenAI's leading model, on most of the listed metrics.
Gemini's Three Sizes:
Gemini is Google's family of highly capable multimodal models. Its main point of comparison, OpenAI's Generative Pre-trained Transformer 4 (GPT-4), is also a multimodal large language model and was launched earlier in the year. Gemini 1.0 comes in three sizes: Ultra, designed for highly complex tasks; Pro, optimized for a broad range of tasks; and Nano, the most efficient model, built for on-device tasks.
Gemini Ultra, designed to run in data centers, has posted state-of-the-art benchmark results and is expected to be available in early 2024. Gemini Pro, comparable to GPT-3.5 and optimized for low latency and cost, has already been integrated into Google Bard. Nano, developed for on-device use, comes in two versions with different parameter counts and is featured in the Google Pixel 8 Pro.
Multimodal Proficiency and Speed:
Gemini stands out for its versatility beyond text, handling diverse data types within a single model, which allows for more natural and engaging interactions. The model is reportedly five times more powerful than GPT-4, a figure attributed to Google's TPUv5 chips, and this faster processing helps it handle complex tasks. Notably, Google says Gemini Ultra is the first AI model to surpass human experts on the Massive Multitask Language Understanding (MMLU) benchmark, scoring 90% across 57 subjects.
Gemini Pro Surpassing GPT-3.5:
Gemini Pro, which powers Google's chatbot Bard and is set to come to various Google apps, outperforms GPT-3.5 in six out of eight benchmarks. Because Bard is a free AI chatbot accessible to millions, Gemini Pro's multimodal capabilities position it as a potential successor to ChatGPT.
Gemini Ultra Outperforming GPT-4:
Gemini Ultra, the most capable model in the Gemini family, marks a milestone by surpassing GPT-4 in a range of benchmarks. According to Google, it achieved state-of-the-art results in 30 out of 32 popular academic benchmarks, outperforming GPT-4 in reasoning, mathematics, and Python code generation.