“Hey, Meta. Take a look at this and tell me which of these teas is caffeine-free.”
I spoke those words while wearing a pair of Meta Ray-Bans at the tech giant's New York headquarters, staring at a table holding four tea packets whose caffeine labels had been blacked out with a Magic Marker. A little click sounded in my ears, followed by Meta's AI voice telling me that the chamomile tea was likely caffeine-free. It was reading the labels and making judgments using generative AI.
I was demoing a feature that's rolling out to Meta's second-generation Ray-Ban glasses starting today, one that Meta CEO Mark Zuckerberg had already promised in September when the new glasses were announced. The AI features, which can access the glasses' cameras to look at images and interpret them with generative AI, were supposed to launch in 2024. Meta has introduced them a lot faster than I expected, although the early-access mode is still very much a beta. Along with a new update that adds Bing-powered search to the Ray-Bans, which ups the power of the glasses' already available voice-enabled capabilities, Meta's glasses are gaining a number of new abilities fast.
I was pretty wowed by the demo because I had never seen anything quite like it. Well, I had in parts: Google Lens and other on-phone tools already use cameras and AI together, and Google Glass, a decade ago, had some translation tools. That said, the easy-access way Meta's glasses invoke AI to identify things in the world around me feels pretty advanced. I'm excited to try it a lot more.
Multimodal AI: How it works right now
The feature has limits right now. It can only recognize what you see by taking a photo, which the AI then analyzes. You can hear the shutter snap after making a voice request, and there’s a pause of a few…
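Meta hasn't published how the pipeline works internally, but the behavior described above, where a voice request triggers a single photo that a generative model then analyzes alongside the question, can be sketched roughly. The Python below is a minimal illustrative mock; every function in it (capture_photo, transcribe_request, query_vision_model) is a hypothetical stand-in, not Meta's actual API.

    # Illustrative sketch only: Meta has not published the glasses' internals.
    # capture_photo, transcribe_request and query_vision_model are hypothetical
    # stand-ins that mock the behavior described in this article.
    from dataclasses import dataclass

    @dataclass
    class Answer:
        text: str

    def capture_photo() -> bytes:
        """Hypothetical: grab one frame from the glasses' camera (the shutter snap)."""
        return b"<jpeg bytes of the tea packets>"

    def transcribe_request() -> str:
        """Hypothetical: speech-to-text for the user's spoken question."""
        return "Which of these teas is caffeine-free?"

    def query_vision_model(image: bytes, prompt: str) -> Answer:
        """Hypothetical: send the photo and question to a multimodal generative model."""
        return Answer(text="The chamomile tea is likely caffeine-free.")

    def handle_request() -> str:
        image = capture_photo()        # a single still photo, not live video
        prompt = transcribe_request()  # the user's voice request
        return query_vision_model(image, prompt).text  # answer arrives after a short pause

    print(handle_request())

The key limitation the sketch reflects is that the model sees only one captured frame per request, which is why each question is preceded by a shutter sound and followed by a pause while the image is analyzed.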