A brand is like a fictional person, and like a person it has many distinguishing characteristics, including a voice. A brand's voice lets users recognize its personality by ear. Today, Amazon Polly, Amazon's cloud text-to-speech service, launched Brand Voice, a fully managed service that converts text into realistic speech and gives customers a custom-built voice of their own. As Amazon's head of AI voice Rafal Kuklinski and senior product manager Ankit Dhawan explained in a blog post, Brand Voice lets companies differentiate themselves from other brands by building a unique sonic signature into their products and services. "Every company can have their own unique sonic brand," they wrote.
Amazon partnered with KFC to give the chain's mascot, Colonel Sanders, a voice with a Southern US English accent and launched it on Amazon Alexa. It also designed an Australian English voice for National Australia Bank, which migrated its contact center to Amazon Connect, Amazon's omnichannel cloud contact center product.
Late last year, Amazon detailed its work on AI-generated speech in a research paper ("Effect of Data Reduction on Sequence-to-Sequence Neural TTS"), in which researchers described a system that can learn a new speaking style from just a few hours of training data, compared with the tens of hours of recordings a voice actor would otherwise have to provide.
Amazon's AI model consists of two parts. The first is a neural network that converts a sequence of phonemes into a sequence of spectrograms, which are visual representations of the frequency spectrum of a sound over time. The second is a vocoder, which converts the spectrograms into a continuous audio signal. The training method combines a large amount of neutral-style speech data with a small amount of data in the desired style, along with an AI system that can distinguish elements of speech that are independent of a speaking style from those unique to it. Amazon already uses the approach internally to generate new voices for Alexa.
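The two-stage design described above can be illustrated with a toy sketch. This is not Amazon's model: in the real system both stages are learned neural networks, whereas here each stage is a hand-written stand-in. The phoneme table, bin frequencies, frame length, and function names are all assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 16000   # audio samples per second (assumed)
FRAME_LEN = 160       # samples per spectrogram frame, i.e. 10 ms (assumed)
# Centre frequency in Hz of each of the 8 spectrogram bins (assumed).
BIN_FREQS = [200.0 * (i + 1) for i in range(8)]

# Hypothetical lookup table: which bins each phoneme excites.
# A real acoustic model learns this mapping from recorded speech.
PHONEME_BINS = {
    "HH": [6, 7],
    "AH": [1, 3],
    "L": [0, 2],
    "OW": [0, 1],
}

def acoustic_model(phonemes, frames_per_phoneme=5):
    """Stage 1 stand-in: phoneme sequence -> spectrogram (frames x bins)."""
    spectrogram = []
    for phoneme in phonemes:
        frame = [0.0] * len(BIN_FREQS)
        for bin_index in PHONEME_BINS[phoneme]:
            frame[bin_index] = 1.0
        # Hold each phoneme for a fixed number of frames.
        spectrogram.extend(list(frame) for _ in range(frames_per_phoneme))
    return spectrogram

def vocoder(spectrogram):
    """Stage 2 stand-in: spectrogram -> waveform samples in [-1, 1]."""
    samples = []
    for frame_index, frame in enumerate(spectrogram):
        for n in range(FRAME_LEN):
            t = (frame_index * FRAME_LEN + n) / SAMPLE_RATE
            # Sum one sinusoid per bin, weighted by that bin's magnitude.
            value = sum(
                magnitude * math.sin(2 * math.pi * freq * t)
                for magnitude, freq in zip(frame, BIN_FREQS)
            )
            samples.append(value / len(BIN_FREQS))
    return samples

def synthesize(phonemes):
    """Full pipeline: phonemes -> spectrogram -> audio samples."""
    return vocoder(acoustic_model(phonemes))

audio = synthesize(["HH", "AH", "L", "OW"])
```

Keeping the stages separate means the spectrogram acts as an interface between them, so a vocoder can in principle be reused while the first stage is adapted to a new speaking style.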
The technology has clear commercial value. The person behind a brand voice (for example, actress Stephanie Courtney, who plays the character Flo) is often tasked with recording a phone tree for an interactive voice response system or an e-learning script for a corporate training video. Speech synthesis can make actors more efficient by cutting down on ancillary recording and pick-up sessions, freeing them up for creative work.
With Brand Voice and its other text-to-speech services, Amazon is competing head to head with Google, which recently launched 31 AI-synthesized WaveNet voices and 24 new standard voices for its Cloud Text-to-Speech service. Amazon has another notable competitor in Microsoft, which offers three AI-generated preview voices and 75 standard voices through the Azure Speech Service API. Brand Voice also competes with offerings from a number of startups, such as Voicery, which offers customized digital voices that sound impressively human. Text-to-speech startup iSpeech has similar voice tools, as do Modulate, Respeecher, Resemble AI, Descript and Bengaluru-based DeepSync. (Source: Cross-border Sellers Teahouse)