Babar Khan Javed
Apr 19, 2018

Why Google's claims about audio transcription matter for marketing

Google says it has nearly perfected the ability to transcribe audio into text, which has potential to impact voice search, retail environments and even creative teams.

A deep-learning audiovisual model from Google could impact voice search, retail and creative production.

Announced on the Research Blog, the method can, according to Google, identify audio found in a video by isolating spoken words and distinguishing between language in the foreground and background.

Applied to YouTube, the model could potentially eliminate the need for creators to manually transcribe and caption their content, a common practice for maximising both user enjoyment and search-engine optimisation.

Researchers behind the model believe it will have a range of applications, from speech enhancement and recognition in videos and voice search to videoconferencing and improved hearing aids.

"In the near term, this will streamline video production, especially valuable in mobile-first video where lesser speaker quality makes clean mixing critical for comprehension," said Patrick Givens, VP of VaynerSmart at VaynerMedia. "Looking into the future, as we see more consumer attention migrating to audio-first channels this will also ease the burden of audio production."

Advertisers and agencies scrambling to optimize for voice-based search also see promise.

"The tip of the iceberg in big data is the analysis, while data collection is below the surface," said Danish Ayub, CEO of MWM Studioz. "Similarly, with voice-search optimization, the part of the work you don't see is the hours of manpower that go into transcribing the video content to ensure searchability."

Ayub added that the technology could eliminate the need for both transcribers and paid software that converts audio into text.

Nate Shurilla, regional head of innovation at iProspect APAC, believes that the model has far-reaching implications for retail.

"Imagine walking into any fast food joint and just announcing what you would like into the air, sitting down, and having your order brought to you, all while dozens of other customers are doing the same and getting their respective orders," said Shurilla. "That’s a big boost in efficiency." He added that, at the same time, the technology would affect surveillance. "I’ll just leave that one to your imagination," he said.

Shaad Hamid, head of SEO for Southeast Asia at APD, believes that in the short term there will be more use cases in live-streaming of events, videoconferencing, hearing aids, virtual assistants and any other application where multiple simultaneous speakers can compromise audio quality.

"From an advertiser’s perspective, using this technology, we can create videos that target multiple audiences with a single asset, saving time and reducing production costs while speeding up the campaign setup," he said.

For example, Hamid envisioned a property portal being able to tone down or dial up different audio within the same video depending on what the user is observed to be in the market for.

On the other hand, Hamid offered a word of caution. "Since no one’s really seen or heard how this type of ad will look or sound, its actual effectiveness as a technique for advertisers is anybody’s guess," he concluded.
