{"id":93851,"date":"2024-04-29T23:27:51","date_gmt":"2024-04-29T23:27:51","guid":{"rendered":"https:\/\/entertainment.runfyers.com\/index.php\/2024\/04\/29\/google-gemini-everything-you-need-to-know-about-the-new-generative-ai-platform-techcrunch-3\/"},"modified":"2024-04-29T23:27:51","modified_gmt":"2024-04-29T23:27:51","slug":"google-gemini-everything-you-need-to-know-about-the-new-generative-ai-platform-techcrunch-3","status":"publish","type":"post","link":"https:\/\/entertainment.runfyers.com\/index.php\/2024\/04\/29\/google-gemini-everything-you-need-to-know-about-the-new-generative-ai-platform-techcrunch-3\/","title":{"rendered":"Google Gemini: Everything you need to know about the new generative AI platform | TechCrunch"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<p id=\"speakable-summary\">Google\u2019s trying to make waves with Gemini, its flagship suite of generative AI models, apps and services.<\/p>\n<p>So what is Gemini? How can you use it? And how does it <a href=\"https:\/\/techcrunch.com\/2024\/02\/15\/we-tested-googles-gemini-chatbot-heres-how-it-performed\/\" target=\"_blank\" rel=\"noopener\">stack up to the competition<\/a>?<\/p>\n<p>To make it easier to keep up with the latest Gemini developments, we\u2019ve put together this handy guide, which we\u2019ll keep updated as new Gemini models, features and news about Google\u2019s plans for Gemini are released.<\/p>\n<h2>What is Gemini?<\/h2>\n<p id=\"speakable-summary\">Gemini is Google\u2019s <a href=\"https:\/\/www.wired.com\/story\/google-deepmind-demis-hassabis-chatgpt\/\" target=\"_blank\" rel=\"noopener\" data-mrf-link=\"https:\/\/www.wired.com\/story\/google-deepmind-demis-hassabis-chatgpt\/\">long-promised<\/a>, next-gen GenAI model family, developed by Google\u2019s AI research labs DeepMind and Google Research. 
It comes in three flavors:<\/p>\n<ul>\n<li><strong>Gemini Ultra<\/strong>, the most performant Gemini model.<\/li>\n<li><strong>Gemini Pro<\/strong>, a \u201clite\u201d Gemini model.<\/li>\n<li><strong>Gemini Nano<\/strong>, a smaller \u201cdistilled\u201d model that runs on mobile devices like the <a href=\"https:\/\/techcrunch.com\/2023\/10\/04\/google-pixel-8-pro-first-impressions\/\" data-mrf-link=\"https:\/\/techcrunch.com\/2023\/10\/04\/google-pixel-8-pro-first-impressions\/\" target=\"_blank\" rel=\"noopener\">Pixel 8 Pro<\/a>.<\/li>\n<\/ul>\n<p>All Gemini models were trained to be \u201cnatively multimodal\u201d \u2014 in other words, able to work with and use more than just words. They were pretrained and fine-tuned on a variety of audio, images and videos, a large set of codebases and text in different languages.<\/p>\n<p>This sets Gemini apart from models such as Google\u2019s own <a href=\"https:\/\/techcrunch.com\/2022\/08\/25\/googles-new-app-lets-you-experimental-ai-systems-like-lamda\/\" target=\"_blank\" rel=\"noopener\">LaMDA<\/a>, which was trained exclusively on text data. LaMDA can\u2019t understand or generate anything other than text (e.g., essays, email drafts), but that isn\u2019t the case with Gemini models.<\/p>\n<h2>What\u2019s the difference between the Gemini apps and Gemini models?<\/h2>\n<div id=\"attachment_2601757\" style=\"width: 1034px\" class=\"wp-caption aligncenter\"><\/p>\n<p id=\"caption-attachment-2601757\" class=\"wp-caption-text\"><strong>Image Credits:<\/strong> Google<\/p>\n<\/div>\n<p>Google, proving <a href=\"https:\/\/techcrunch.com\/2020\/10\/06\/googles-new-logos-are-bad\/\" target=\"_blank\" rel=\"noopener\">once again<\/a> that it lacks a knack for branding, didn\u2019t make it clear from the outset that Gemini is separate and distinct from the Gemini apps on the web and mobile (formerly Bard). 
The Gemini apps are simply an interface through which certain Gemini models can be accessed \u2014 think of it as a client for Google\u2019s GenAI.<\/p>\n<p>Incidentally, the Gemini apps and models are also totally independent from <a href=\"https:\/\/techcrunch.com\/2023\/12\/13\/google-debuts-imagen-2-with-text-and-logo-generation\/\" target=\"_blank\" rel=\"noopener\">Imagen 2<\/a>, Google\u2019s text-to-image model that\u2019s available in some of the company\u2019s dev tools and environments.<\/p>\n<h2>What can Gemini do?<\/h2>\n<p>Because the Gemini models are multimodal, they can in theory perform a range of multimodal tasks, from transcribing speech to captioning images and videos to generating artwork. Few of these capabilities have reached the product stage yet (more on that later), and Google\u2019s promising all of them \u2014 and more \u2014 at some point in the not-too-distant future.<\/p>\n<p>Of course, it\u2019s a bit hard to take the company at its word.<\/p>\n<p>Google <a href=\"https:\/\/techcrunch.com\/2023\/02\/10\/google-is-losing-control\/\" target=\"_blank\" rel=\"noopener\">seriously underdelivered<\/a> with the original Bard launch. 
And more recently it ruffled feathers <a href=\"https:\/\/techcrunch.com\/2023\/12\/07\/googles-best-gemini-demo-was-faked\/\" target=\"_blank\" rel=\"noopener\">with a video purporting to show Gemini\u2019s capabilities<\/a> that turned out to have been heavily doctored and was more or less aspirational.<\/p>\n<p>Still, assuming Google is being more or less truthful with its claims, here\u2019s what the different tiers of Gemini will be able to do once they reach their full potential:<\/p>\n<h3>Gemini Ultra<\/h3>\n<p>Google says that <a href=\"https:\/\/techcrunch.com\/2024\/02\/08\/google-goes-all-in-on-gemini-and-launches-20-paid-tier-for-gemini-ultra\/\" target=\"_blank\" rel=\"noopener\">Gemini Ultra<\/a> \u2014 thanks to its multimodality \u2014 can be used to help with things like physics homework, solving problems step-by-step on a worksheet and pointing out possible mistakes in already filled-in answers.<\/p>\n<p>Gemini Ultra can also be applied to tasks such as identifying scientific papers relevant to a particular problem, Google says \u2014 extracting information from those papers and \u201cupdating\u201d a chart from one by generating the formulas necessary to re-create the chart with more recent data.<\/p>\n<p>Gemini Ultra technically supports image generation, as alluded to earlier. But that capability hasn\u2019t made its way into the productized version of the model yet \u2014 perhaps because the mechanism is more complex than how apps such as <a href=\"https:\/\/techcrunch.com\/tag\/chatgpt\/\" target=\"_blank\" rel=\"noopener\">ChatGPT<\/a> generate images. 
Rather than feed prompts to an image generator (like <a href=\"https:\/\/techcrunch.com\/2023\/11\/06\/openai-launches-dall-e-3-api-new-text-to-speech-models\/\" data-mrf-link=\"https:\/\/techcrunch.com\/2023\/11\/06\/openai-launches-dall-e-3-api-new-text-to-speech-models\/\" target=\"_blank\" rel=\"noopener\">DALL-E 3<\/a>, in ChatGPT\u2019s case), Gemini outputs images \u201cnatively,\u201d without an intermediary step.<\/p>\n<p>Gemini Ultra is available as an API through Vertex AI, Google\u2019s fully managed AI developer platform, and AI Studio, Google\u2019s web-based tool for app and platform developers. It also powers the Gemini apps \u2014 but not for free. Access to Gemini Ultra through what Google calls Gemini Advanced requires subscribing to the Google One AI Premium Plan, priced at $20 per month.<\/p>\n<p>The AI Premium Plan also connects Gemini to your wider Google Workspace account \u2014 think emails in Gmail, documents in Docs, presentations in Slides and Google Meet recordings. That\u2019s useful for, say, summarizing emails or having Gemini capture notes during a video call.<\/p>\n<h3>Gemini Pro<\/h3>\n<p>Google says that Gemini Pro is an improvement over LaMDA in its reasoning, planning and understanding capabilities.<\/p>\n<p>An independent <a href=\"https:\/\/arxiv.org\/pdf\/2312.11444.pdf\" target=\"_blank\" rel=\"noopener\">study<\/a> by Carnegie Mellon and BerriAI researchers found that the initial version of Gemini Pro was indeed better than OpenAI\u2019s <a href=\"https:\/\/techcrunch.com\/2023\/08\/22\/openai-brings-fine-tuning-to-gpt-3-5-turbo\/\" target=\"_blank\" rel=\"noopener\">GPT-3.5<\/a> at handling longer and more complex reasoning chains. 
But the study also found that, like all large language models, this version of Gemini Pro particularly struggled with mathematics problems involving several digits, and <a href=\"https:\/\/techcrunch.com\/2023\/12\/07\/early-impressions-of-googles-gemini-arent-great\/\" target=\"_blank\" rel=\"noopener\">users found examples<\/a> of <a href=\"https:\/\/techcrunch.com\/2024\/02\/11\/googles-and-microsofts-chatbots-are-making-up-super-bowl-stats\/\" target=\"_blank\" rel=\"noopener\">bad reasoning<\/a> and obvious mistakes.<\/p>\n<p>Google promised remedies, though \u2014 and the first arrived in the form of <a href=\"https:\/\/techcrunch.com\/2024\/02\/15\/googles-new-gemini-model-can-analyze-an-hour-long-video-but-few-people-can-use-it\/\" target=\"_blank\" rel=\"noopener\">Gemini 1.5 Pro<\/a>.<\/p>\n<p>Designed to be a drop-in replacement, Gemini 1.5 Pro is improved in a number of areas compared with its predecessor, perhaps most significantly in the amount of data that it can process. Gemini 1.5 Pro can take in ~700,000 words, or ~30,000 lines of code \u2014 35x the amount Gemini 1.0 Pro can handle. And \u2014 the model being multimodal \u2014 it\u2019s not limited to text. 
Gemini 1.5 Pro can analyze up to 11 hours of audio or an hour of video in a variety of different languages, albeit slowly (e.g., searching for a scene in a one-hour video takes 30 seconds to a minute of processing).<\/p>\n<p>Gemini 1.5 Pro <a href=\"https:\/\/techcrunch.com\/2024\/04\/09\/googles-gemini-pro-1-5-enters-public-preview-on-vertex-ai\/\" target=\"_blank\" rel=\"noopener\">entered public preview on Vertex AI in April<\/a>.<\/p>\n<p>An additional endpoint, Gemini Pro Vision, can process text <em>and<\/em> imagery \u2014 including photos and video \u2014 and output text along the lines of OpenAI\u2019s <a href=\"https:\/\/techcrunch.com\/2023\/09\/26\/openais-gpt-4-with-vision-still-has-flaws-paper-reveals\/\" target=\"_blank\" rel=\"noopener\">GPT-4 with Vision<\/a> model.<\/p>\n<div id=\"attachment_2648159\" style=\"width: 929px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-2648159\" class=\"size-full wp-image-2648159\" src=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/01\/structured_prompt.png\" alt=\"Gemini\" width=\"919\" height=\"600\" srcset=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/01\/structured_prompt.png 919w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/01\/structured_prompt.png?resize=150,98 150w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/01\/structured_prompt.png?resize=300,196 300w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/01\/structured_prompt.png?resize=768,501 768w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/01\/structured_prompt.png?resize=680,444 680w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/01\/structured_prompt.png?resize=50,33 50w\" sizes=\"auto, (max-width: 919px) 100vw, 919px\"\/><\/p>\n<p id=\"caption-attachment-2648159\" class=\"wp-caption-text\">Using Gemini Pro in Vertex AI. 
<strong>Image Credits:<\/strong> Gemini<\/p>\n<\/div>\n<p>Within Vertex AI, developers can customize Gemini Pro to specific contexts and use cases using a fine-tuning or \u201cgrounding\u201d process. Gemini Pro can also be connected to external, third-party APIs to perform particular actions.<\/p>\n<p>In AI Studio, there are workflows for creating structured chat prompts using Gemini Pro. Developers have access to both Gemini Pro and the Gemini Pro Vision endpoints, and they can adjust the model temperature to control the output\u2019s creative range and provide examples to give tone and style instructions \u2014 and also tune the safety settings.<\/p>\n<div class=\"container__access-control\">\n<h3>Gemini Nano<\/h3>\n<p>Gemini Nano is a much smaller version of the Gemini Pro and Ultra models, and it\u2019s efficient enough to run directly on (some) phones instead of sending the task to a server somewhere. So far, it powers a couple of features on the Pixel 8 Pro, Pixel 8 and Samsung Galaxy S24, including Summarize in Recorder and Smart Reply in Gboard.<\/p>\n<p>The Recorder app, which lets users push a button to record and transcribe audio, includes a Gemini-powered summary of your recorded conversations, interviews, presentations and other snippets. Users get these summaries even if they don\u2019t have a signal or Wi-Fi connection available \u2014 and in a nod to privacy, no data leaves their phone in the process.<\/p>\n<p>Gemini Nano is also in Gboard, Google\u2019s keyboard app. There, it powers a feature called Smart Reply, which helps to suggest the next thing you\u2019ll want to say when having a conversation in a messaging app. 
The feature initially only works with WhatsApp but will come to more apps over time, Google says.<\/p>\n<p>And in the Google Messages app on supported devices, Nano enables Magic Compose, which can craft messages in styles like \u201cexcited,\u201d \u201cformal\u201d and \u201clyrical.\u201d<\/p>\n<h2>Is Gemini better than OpenAI\u2019s GPT-4?<\/h2>\n<p>Google has several times <a href=\"https:\/\/blog.google\/technology\/ai\/google-gemini-ai\/\" target=\"_blank\" rel=\"noopener\">touted<\/a> Gemini\u2019s superiority on benchmarks, claiming that Gemini Ultra exceeds current state-of-the-art results on \u201c30 of the 32 widely used academic benchmarks used in large language model research and development.\u201d The company says that Gemini 1.5 Pro, meanwhile, is more capable at tasks like summarizing content, brainstorming and writing than Gemini Ultra in some scenarios; presumably this will change with the release of the next Ultra model.<\/p>\n<p>But leaving aside the question of whether benchmarks really indicate a better model, the scores Google points to appear to be only marginally better than OpenAI\u2019s corresponding models. And \u2014 as mentioned earlier \u2014 some early impressions haven\u2019t been great, with <a href=\"https:\/\/techcrunch.com\/2023\/12\/07\/early-impressions-of-googles-gemini-arent-great\/\" target=\"_blank\" rel=\"noopener\">users<\/a> and <a href=\"https:\/\/arxiv.org\/pdf\/2312.11444.pdf\" target=\"_blank\" rel=\"noopener\">academics<\/a> pointing out that the older version of Gemini Pro tends to get basic facts wrong, struggles with translations and gives poor coding suggestions.<\/p>\n<h2>How much does Gemini cost?<\/h2>\n<p>Gemini 1.5 Pro is free to use in the Gemini apps and, for now, AI Studio and Vertex AI.<\/p>\n<\/div>\n<p>Once Gemini 1.5 Pro exits preview in Vertex, however, input will cost $0.0025 per character while output will cost $0.00005 per character. 
Vertex customers pay per 1,000 characters (about 140 to 250 words) and, in the case of models like Gemini Pro Vision, per image ($0.0025).<\/p>\n<p>Let\u2019s assume a 500-word article contains 2,000 characters. Summarizing that article with Gemini 1.5 Pro would cost $5. Meanwhile, generating an article of a similar length would cost $0.10.<\/p>\n<p>Ultra pricing has yet to be announced.<\/p>\n<div class=\"container__access-control\">\n<h2>Where can you try Gemini?<\/h2>\n<h3>Gemini Pro<\/h3>\n<p>The easiest place to experience Gemini Pro is in <a href=\"https:\/\/techcrunch.com\/tag\/bard\/\" data-mrf-link=\"https:\/\/techcrunch.com\/tag\/bard\/\" target=\"_blank\" rel=\"noopener\">the Gemini apps<\/a>. Pro and Ultra are answering queries in a range of languages.<\/p>\n<p>Gemini Pro and Ultra are also <a href=\"https:\/\/techcrunch.com\/2023\/12\/13\/google-brings-gemini-pro-to-vertex-ai\/\" target=\"_blank\" rel=\"noopener\">accessible<\/a> in preview in Vertex AI via an API. The API is free to use \u201cwithin limits\u201d for the time being and supports certain regions, including Europe, as well as features like chat functionality and filtering.<\/p>\n<p>Elsewhere, Gemini Pro and Ultra can be <a href=\"https:\/\/techcrunch.com\/2023\/12\/13\/with-ai-studio-google-launches-an-easy-to-use-tool-for-developing-apps-and-chatbots-based-on-its-gemini-model\/\" target=\"_blank\" rel=\"noopener\">found<\/a> in AI Studio. Using the service, developers can iterate prompts and Gemini-based chatbots and then get API keys to use them in their apps \u2014 or export the code to a more fully featured IDE.<\/p>\n<p id=\"speakable-summary\">Code Assist (formerly <a href=\"https:\/\/cloud.google.com\/duet-ai?hl=en\" target=\"_blank\" rel=\"noopener\" data-mrf-link=\"https:\/\/cloud.google.com\/duet-ai?hl=en\">Duet AI for Developers<\/a>), Google\u2019s suite of AI-powered assistance tools for code completion and generation, is using Gemini models. 
Developers can perform \u201clarge-scale\u201d changes across codebases, for example updating cross-file dependencies and reviewing large chunks of code.<\/p>\n<p>Google\u2019s brought Gemini models to its <a href=\"https:\/\/techcrunch.com\/2024\/02\/15\/google-makes-more-gemini-models-available-to-developers-but-needs-to-work-on-its-branding\/\" target=\"_blank\" rel=\"noopener\">dev tools<\/a> for Chrome, its Firebase mobile dev platform and its <a href=\"https:\/\/techcrunch.com\/2024\/04\/09\/googles-gemini-comes-to-databases\/\" target=\"_blank\" rel=\"noopener\">database creation and management tools<\/a>. And it\u2019s <a href=\"https:\/\/techcrunch.com\/2024\/04\/09\/google-injects-generative-ai-into-its-cloud-security-tools\/\" target=\"_blank\" rel=\"noopener\">launched new security products underpinned by Gemini<\/a>, like <span style=\"font-size: 1rem; letter-spacing: -0.1px;\">Gemini in Threat Intelligence, a component of Google\u2019s Mandiant cybersecurity platform that can analyze large portions of potentially malicious code and let users perform natural language searches for ongoing threats or indicators of compromise.<\/span><\/p>\n<\/div><\/div>\n<p><a href=\"https:\/\/techcrunch.com\/2024\/04\/29\/what-is-google-gemini-ai\/\" target=\"_blank\" rel=\"noopener\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Google\u2019s trying to make waves with Gemini, its flagship suite of generative AI models, apps and services. So what is Gemini? How can you use it? And how does it stack up to the competition? 
To make it easier to keep up with the latest Gemini developments, we\u2019ve put together this handy guide, which we\u2019ll [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":93852,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[],"class_list":{"0":"post-93851","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-tech"},"_links":{"self":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/93851","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/comments?post=93851"}],"version-history":[{"count":0,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/93851\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media\/93852"}],"wp:attachment":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media?parent=93851"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/categories?post=93851"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/tags?post=93851"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}