{"id":135637,"date":"2023-12-06T18:38:51","date_gmt":"2023-12-06T18:38:51","guid":{"rendered":"https:\/\/www.techopedia.com"},"modified":"2023-12-14T10:35:54","modified_gmt":"2023-12-14T10:35:54","slug":"google-gemini-goes-live-heres-what-to-expect-from-the-ai","status":"publish","type":"post","link":"https:\/\/www.techopedia.com\/google-gemini-goes-live-heres-what-to-expect","title":{"rendered":"Google Gemini Goes Live: Here\u2019s What to Expect from the AI"},"content":{"rendered":"
Today, Google announced the launch of Gemini, its new multimodal AI model designed to understand and reason across text, images, video, audio, and code.

"Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research," Demis Hassabis, CEO and co-founder of Google DeepMind, wrote in the official blog post.

"It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across, and combine different types of information including text, code, audio, image, and video," Hassabis wrote.

READ MORE: Google's Gemini AI is a Serious Threat to ChatGPT – Here's Why

There are three confirmed versions of the model. Gemini Ultra is the largest and most capable; Gemini Pro is designed to scale across a wide range of tasks; and Gemini Nano is the most efficient of the three, built for on-device tasks and therefore well suited to mobile devices.

As of today, Gemini has been added to Google's Bard chatbot, and Gemini Nano will come to the Pixel 8 Pro in December to power summarization and smart-reply capabilities. Gemini models will eventually reach other Google products, including Search, Ads, and Chrome.

So, Just How Good is Google Gemini?

The release comes just a month after OpenAI announced GPT-4 Turbo alongside GPT-4V, its own multimodal model that can understand image inputs.

While it is too early to conclude that Gemini has overtaken OpenAI and GPT-4, the early numbers point that way. In an interview with The Verge, Hassabis confirmed that Google had tested Gemini against GPT-4 across 32 benchmarks and found that Gemini was "substantially ahead" on 30 of them.

One of Gemini's standout achievements so far is that it is the first model to outperform human experts on Massive Multitask Language Understanding (MMLU), scoring 90.0%.

At the same time, Gemini Ultra scored just above GPT-4 on a range of text benchmarks, indicating a slight edge in multi-step reasoning, reading comprehension, basic arithmetic, and Python code generation. Google also claims Gemini Ultra edges out GPT-4 in multimodal performance: natural image understanding, natural image OCR, document understanding, infographic understanding, and mathematical reasoning in visual contexts.

Gemini has also achieved a state-of-the-art score on the MMMU benchmark, which measures performance on multimodal tasks.

To achieve this, Gemini was pre-trained on multiple modalities from the start and then fine-tuned, allowing it to understand and reason about different types of input better than any large language model to date.
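To make that multimodal ability concrete, here is a minimal sketch of a combined text-and-image prompt using Google's google-generativeai Python SDK, which Google opened to developers in the days after this announcement. The model name, image file, and prompt below are illustrative assumptions, not details taken from Google's announcement.

```python
# Minimal sketch of a multimodal (text + image) Gemini prompt.
# Assumes the google-generativeai SDK (pip install google-generativeai);
# the API key, image file, and prompt are illustrative placeholders.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential

# "gemini-pro-vision" was the image-capable variant exposed to
# developers at launch; naming and availability may change.
model = genai.GenerativeModel("gemini-pro-vision")

image = Image.open("infographic.png")  # hypothetical local image
response = model.generate_content(
    ["Summarize what this infographic shows in two sentences.", image]
)
print(response.text)
```

The notable design point is that the text and the image travel together in a single prompt list, which is the "combine different types of information" behavior Hassabis describes above.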