Alexa, or the progress of automated conversation technology
Alexa, the voice service behind the Amazon Echo personal assistant, has taken the world by storm, with Amazon selling some 10 million Echo-Alexa systems. At first glance, the figure may seem slight compared to the installed base of Siri or the Google Assistant, which come standard on hundreds of millions of smartphones. But Amazon has big plans for Alexa, predicting its service will become the most widely used voice assistant, and it has the means and resources to back that up. How will it ensure Alexa’s success? Through its development platform, Amazon Lex.
… and big brother Lex
Amazon Lex, which had been in preview phase since late 2016, has just been made freely available to all developers so that they can build automated conversation technology, or chatbots, into their own applications.
“There's massive acceleration happening here”, said Amazon’s CTO, Werner Vogels. “The cool thing about having this running as a service in the cloud instead of in your own data center or on your own desktop is that we can make Lex better continuously by the millions of customers that are using it.”
Lex has turned into a new revenue stream for Amazon, which charges developers based on the number of voice or text requests their applications send to the platform for processing. But a greater source of income could be chatbot e-commerce applications, a new but growing trend.
Towards more “natural” language
Another of Alexa’s advantages is its ability to evolve and adapt to conversation context. Amazon’s developers are making full use of a standardized markup language called Speech Synthesis Markup Language, or SSML, which provides for new communication capabilities that go beyond just using words.
In the very near future, Alexa will be able to whisper, substitute words in its “script”, swear or, conversely, “bleep” out swearwords, adjust the speed and volume of a passage to stress a particular phrase, pause, and change intonation. Curious geeks can test these features in a new quiz game.
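In practice, developers drive these effects with SSML tags in the text Alexa speaks. A minimal sketch, using tags from the SSML standard and Amazon’s Alexa-specific extensions (the exact wording and attribute values here are illustrative):

```xml
<speak>
    This is Alexa's normal voice.
    <amazon:effect name="whispered">But this part is whispered.</amazon:effect>
    <break time="700ms"/>
    <prosody rate="slow" volume="loud">This phrase is slower and louder for emphasis.</prosody>
    A swearword can be bleeped out: <say-as interpret-as="expletive">darn</say-as>.
    <emphasis level="strong">And intonation can stress a single phrase.</emphasis>
</speak>
```

The `<amazon:effect name="whispered">` tag is Amazon’s extension for whispering; `<break>`, `<prosody>`, `<say-as>`, and `<emphasis>` come from the standard SSML vocabulary.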
And one final innovation that further “humanizes” Alexa, which already cracks jokes and sings, is the introduction of new “speechcons” for markets other than the U.S.
Speechcons are special words and phrases that voice assistants know to express in a more colourful way to make their interactions with humans more engaging and personal. Alexa already knows several in English, for example “Yay!”, spoken enthusiastically with the stress on the final vowel, or “Abracadabra”, whose stress pattern is not intuitive.
Alexa will soon add regional expressions to its vocabulary, for example “Blimey!” for British users. And for the first time, new speechcons have been added in a foreign language, but not an Asian one! German is a language that is particularly rich in idiomatic expressions, such as “Na und?” (“So what?”), which must be pronounced half defiantly, half enviously.
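Speechcons ride on the same SSML mechanism: Amazon documents them as interjections, triggered with the `interjection` value of the `say-as` tag (the German example below assumes a skill configured for the de-DE locale):

```xml
<speak>
    <say-as interpret-as="interjection">abracadabra!</say-as>
    <break time="500ms"/>
    <!-- In a German (de-DE) skill, speechcons are invoked the same way: -->
    <say-as interpret-as="interjection">na und</say-as>
</speak>
```

Which words are recognised as speechcons depends on the locale, which is why Amazon has to add them market by market.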
Be it Alexa or another product, voice technology will remain a major trend in the IT ecosystem in the years to come: “Voice is a big part of the computer interface of the future,” said Gene Munster, an analyst at Loup Ventures. “Whoever owns voice will be the gateway of commerce.”