Surekha Ragavan
Apr 22, 2019

How it really works... visual recognition technology

AIQ CEO: "We feel that vision tech is going to come to a point that it will replace our day-to-day behaviours."


Visual recognition technology (VRT) is an advancement of computer vision technology (CVT) whereby one can point their phone at a real-life image or object and have relevant content pulled up.

This enables attendees to consume or engage with content simply by pointing their phones at a physical image. The tech can be built into an existing event app or work as a standalone function. According to Marcus Tan, CEO of tech company AIQ, VRT can work as a “bridge to connect offline images to online content”. 

How does VRT work?

The tech combines visual recognition with machine learning. First, we ingest or upload an image to the backend CMS. Machine learning picks up points of the image and translates it into an interactive bot. We then link this unique image to a URL, a deck, a Facebook page, or other content.

We work with the organisers in advance to prepare all the backdrops being used. Our CMS allows you to upload an image or video, ingest the image, and link it to a particular type of content. Once that’s linked, whenever you scan the image, the relevant content pops up. And the backend can be anything – a payment, a Facebook page, a contest. It can change on the fly, so it works well for time-sensitive events like travel fairs.
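The linking flow Tan describes (ingest an image, attach content, swap the link on the fly) can be sketched as a simple registry. The class and method names below are hypothetical, and a real backend would fingerprint images with a vision model rather than take a string key:

```python
from typing import Optional


class VrtCms:
    """Toy sketch of a VRT backend: maps ingested image fingerprints to content."""

    def __init__(self) -> None:
        self._links: dict[str, str] = {}

    def ingest(self, fingerprint: str, content_url: str) -> None:
        # Upload/ingest step: register the image and link it to content.
        self._links[fingerprint] = content_url

    def relink(self, fingerprint: str, content_url: str) -> None:
        # The linked content can change on the fly without touching the image.
        self._links[fingerprint] = content_url

    def scan(self, fingerprint: str) -> Optional[str]:
        # An attendee points their phone: look up whatever is currently linked.
        return self._links.get(fingerprint)


cms = VrtCms()
cms.ingest("travel-fair-backdrop", "https://example.com/day1-deals")
print(cms.scan("travel-fair-backdrop"))  # day-1 content
cms.relink("travel-fair-backdrop", "https://example.com/day2-deals")
print(cms.scan("travel-fair-backdrop"))  # same image now opens day-2 content
```

The point of the sketch is that the image itself never changes; only the backend mapping does, which is why a time-sensitive fair can retarget the same backdrop day by day.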

What’s the difference between VRT and augmented reality (AR)?

AR is basically scanning your device over an item so that a 3D version pops up with imagery that communicates with you. With AR, the amount of money you spend is [probably] the amount you’d spend on a whole marketing campaign. It’s costly. The second limitation is novelty: it tends to be used only once, because the content needs to be built and designed into every interaction. With VRT, you can change your content at any time with no additional cost.

With VRT, it’s a simpler way of doing things. The proliferation of mobile devices has made them an extension of our bodies, and everything is very image-driven. We feel that vision tech is going to come to a point where it will replace our day-to-day behaviours.

Does VRT work the same way QR codes work?

QR codes are made for machines, not for human eyes. You literally don’t know what it represents. If you can interact with an image directly, why would you want to look for a QR code to interact with? If you’re interacting with a brand image or logo, you know you’re interacting with brand content and the brand equity is better.

Also, there’s no way to defraud this because the backend is secured and controlled by us, whereas QR codes can be replicated. In an event setup, security is important.

How can VRT be used across the attendee journey?

At registration, you could simplify the process by taking out your phone, opening up your ‘scanner’, and pointing at the registration counter. You can even point at an image to pay without using a QR code.

When you go into the conference hall and you’re listening to a talk, you could point your phone at the stage and ask questions in real time. If you want a copy of the deck, all you have to do is download an app or use the event app, point at the stage, and the deck will be made available on your phone.

If you go to an exhibition booth, you just have to point at a booth or logo, and you’ll be able to collect the information on your phone and go back to review it. With VRT, [clients] can even create multiple interactions at their booth, and I can even create an opportunity for them to interact using a chatbot.

During the event, you can even apply gamification. [For example], you could turn it into a Pokémon-style hunt: you go to an event, look for a particular booth to point your phone at, answer a question, and get a clue to the next one. Like a digital treasure hunt.

Or at museums or art galleries, visitors can point at an art piece, and a link pops up in which the artist explains the work to them. And images are language-agnostic, which means visitors can point and download a deck in their own language.

We want to cut down on printed materials. 70% of the materials you print are time-sensitive and outdated right after the event. If I go to a booth, I just have to point at a brochure to collect the information on my phone and review it later. Or I can point at a particular logo and be asked, ‘Would you like to leave your name?’. I can even create an opportunity for you to interact using a chatbot.

Can organisers tweak content in real-time?

Content can be changed in real time. There’s no such thing as building a campaign that takes three weeks before launching it. I can turn on an image and link it within one minute. And all the data goes back to the client.

On a stage, backdrops and titles change, but the tech is already mature enough that when you snap an image, capturing just 30% of it is enough to interact with it. The backdrop translates into real-time interaction and brings you to a live page.
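The partial-capture claim above can be illustrated with a toy sketch in which each ingested image is reduced to a set of local feature identifiers, and a scan counts as a match once at least 30% of them are recovered. The representation is hypothetical; production systems match learned or engineered visual descriptors, not string labels.

```python
# Fraction of an image's stored features that a snap must recover to match.
MATCH_THRESHOLD = 0.3


def matches(stored_features: set, captured_features: set) -> bool:
    """Return True if enough of the stored image's features were captured."""
    if not stored_features:
        return False
    overlap = len(stored_features & captured_features)
    return overlap / len(stored_features) >= MATCH_THRESHOLD


# A backdrop represented by ten hypothetical local features.
backdrop = {"f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9", "f10"}

# A partially obscured snap recovering 3 of 10 features (30%) still matches.
print(matches(backdrop, {"f1", "f5", "f9"}))  # True
# A snap recovering only 2 of 10 features (20%) does not.
print(matches(backdrop, {"f1", "f2"}))  # False
```

The design choice here is a ratio test against the stored image rather than the captured frame, which is what lets a crowd or a speaker occlude most of a backdrop without breaking recognition.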

Source:
CEI
