Google's New Gemini AI Will Understand Your Photos and Videos, not Just Text

Google has begun bringing a native understanding of video, audio and photos to its Bard AI chatbot with a new model called Gemini.

The first incarnations of the new technology arrived Wednesday in dozens of countries, but only in English, providing text-based chat abilities that Google says improve the AI’s performance in complex tasks like summarizing documents, reasoning and writing programming code. The bigger change with multimedia abilities, for example understanding the data underlying a graph or figuring out the result of a child’s dot-to-dot drawing puzzle, will arrive “soon,” Google said.

The new version represents a dramatic departure for AI. Text-based chat is important, but humans must process much richer information as we inhabit our three-dimensional, ever-changing world. And we respond with complex communication abilities, like speech and imagery, not just written words. Gemini is an attempt to come closer to our own fuller understanding of the world.

Gemini comes in three versions tailored for different levels of computing power, Google said:

Gemini Nano runs on mobile phones and comes in two varieties built for different levels of available memory. It’ll power new features on Google’s Pixel 8 phones, like summarizing conversations in its Recorder app or suggesting message replies in WhatsApp typed with Google’s Gboard.
Gemini Pro, tuned for fast responses, runs in Google’s data centers and will power a new version of Bard, starting Wednesday.
Gemini Ultra, limited to a test group for now, will be available in a new Bard Advanced chatbot due in early 2024. Google declined to reveal pricing details, but expect to pay a premium for this top capability.

The new version spotlights the breakneck pace of advancement in the new generative AI field, where chatbots create their own responses to prompts that we write in plain language rather than arcane programming instructions. Google’s top competitor, OpenAI, stole a march with the launch of ChatGPT a year…

CNET
