Google Lens now supports video search and multimodal AI - The Verge

Google Lens Now Supports Video Search, And It’s Kind Of Awesome!

With the introduction of Circle to Search, the once-outdated Google Lens became a lot more viable. Now, Google has finally given Lens the ability to Search with Video, along with voice support so you can add a spoken prompt to accompany the video.

We saw a demo of this feature at Google I/O 2024, where Google showcased the tool identifying the different species of fish in an aquarium. Looked cool, for sure, but I had to see it for myself to make sure it was worth using. So, from trying to identify an action figure lying around my room to getting book suggestions out of it and more, I tested it out in quite a few scenarios!

Google Lens Video Search Is a Breeze to Use

To use the feature, you will need an Android or iOS device. I have received the feature on both my OnePlus 11R and Pixel 9 Pro Fold. The feature has not yet arrived on the web, and probably won’t. Anyway, once you head to the Google Lens app, simply hold down the search button and that will trigger the new Search with Video feature.

Using Search with Video in Google Lens (Image Credit: Sagnik Das Gupta/ Beebom)

You will be prompted to “Speak now to ask about this video.” Once you do, Lens will draw up an AI Overview alongside search results based on the video input and your voice prompt. It’s as simple as that. But how well does the tool work? Is it reliable?

Good Enough with a Few Hiccups

The first thing I did with this new feature in Google Lens, as you can see above, was ask it to identify a figurine I had of Gojo Satoru from Jujutsu Kaisen. And it did so perfectly and rather quickly. Then, I took three different products (a jar of instant coffee, a bottle of shampoo, and mouthwash) and showed them to Google Lens one by one to see if it would identify them.

I was surprised to see it identified most of those products accurately; not all of them, of course. And here’s where I started seeing the point of having Search with Video in Google Lens. With photos, you are quite restricted as you have to show it all in one go. With videos, you’re not restricted like that and can cover the product or situation more thoroughly.

For example, if your child got scraped while playing, you could record the wound and ask Google Lens for the right remedies.

  • Looking for a Philips trimmer charger with Google Lens Search with Video
  • Identifying a book using Google Lens
  • Identifying different gadgets using Google Lens

Taking my small test further, I asked the tool to identify a book and suggest similar titles, which it did perfectly as well. I showed it the elusive charging port of my Philips trimmer and it identified that as well.

However, when it comes to translations, there are some problems. Yesterday, at the Google for India event, I tested out Gemini’s new Indic language capabilities to generate a short story about “A planet where it rains glass,” in Hindi and even got a printout of it. And when I used Google Lens to translate it into English, AI Overviews started hallucinating big time.

But when I carried out the same search by simply using the photo tool in Google Lens and vocalizing my prompt using voice search, it translated the story nicely. I gave it multiple tries, and the results were the same every single time. So, I guess the new Search with Video feature in Google Lens still needs to get better at handling voice prompts that ask for translations.

Also, in another instance, it identified the HMD Skyline as the Nokia XR20 and the Galaxy Watch Ultra as simply a “Samsung Galaxy Watch.” However, it identified the other two products correctly.

Imperfect but Awesome

So, while it’s not the most reliable in certain scenarios, the fact that it exists is proof of how far we’ve come in terms of multimodality in AI models. Moreover, Google is also working on the tool’s ability to identify sounds, such as animal calls.

Besides, it’s good to have a companion that you can point toward things and ask questions at any time. And 8 out of 10 times, it will give you the answers that you’re looking for. Also, with shopping ads creeping their way into AI Overviews, I can see how it can become someone’s one-stop product finder.

AI models that can process on-screen data are a big thing now. Take Microsoft’s announcement of the Click to Do feature. Google is definitely ahead of the crowd in this regard. Moreover, as Google has stated, the captured videos get deleted as soon as Gemini is done analyzing them. So, users won’t have to be alarmed about their videos being used to train the model.

That being said, I certainly had a great time testing out the new Search with Video feature in Google Lens and would like to know your thoughts. So, take to the comments below to share your opinions!
