How to get going with Voice as the entry point for user interaction

Amazon is seeing strong uptake of, and high levels of engagement with, its Alexa-enabled Amazon Echo and Echo Dot devices for the home, Max Amordeluso, EU Head of Amazon Skills Kit, said at the recent Digital Innovators’ Summit in Berlin.

“There is a proliferation of voice experiences everywhere,” Max told the audience. “We are seeing mainstream adoption right now … and it’s just going to keep growing. There are big opportunities and these are new channels you should look into.”

Apart from adoption, Amazon is also seeing “a very high level of engagement with users interacting with these devices, on average 16 times a day. This is very high when you think about it: for stationary devices … not in your pocket like your mobile for example.”


It is important for publishers to be thinking about voice user interfaces (VUIs), if they are not already experimenting with them. Joining Max on stage was Tobias Hellwig, editorial developer at Spiegel Tech Labs in Germany, who is working with the Alexa Skills Kit.

“Spiegel Online is always exploring new ways of storytelling and with Alexa there is a lot to discover. In my opinion voice assistants will be part of our lives and the interaction will become increasingly natural.”

His team is learning a lot through experimentation, Tobias said, including the fact that “content read out does not automatically sound great. [Furthermore,] you have no images, no videos, no bold or italic for emphasising…”

The next big thing

According to Max, Amazon believes voice will be the next big thing for interacting with machines – a major new disruption, in his words.

“Think about when you’re driving: you want to have recommendations on what to eat tonight, make reservations to a restaurant … or if you’re cooking, you want a recipe … you don’t necessarily want to interact with your hands… You need an immediacy that your hands, in that specific moment, cannot provide.”

When it comes to homes, voice user interfaces provide people with a novel way to interact with and control their smart homes or devices, he said.

The rise of voice

While voice is a very natural means of communicating for people, it is not that easy for computers. “It is hard to replicate… Whereas people can infer from the context what exactly the meaning of something is. If you think of a machine it is very easy to take literal meaning from a conversation. But if you do it right, it is very close to magic, it is very natural.”

The history of interfaces shows that each revolutionary step takes about a decade, according to Max. “We are now on the cusp of VUI (voice user interface) … leveraging voice for interactions between humans and machines.”

The change presents enormous opportunities. “There are new technologies, new skills, new ways of interacting, new ways of gaining user interactions for your products and services.”

A major reason for the rise of voice today is “dramatic improvements in speech science”, including:

  • Automatic speech recognition: the machine understanding what the user is saying
  • Natural language processing: the machine working out the intent behind communicating something in a certain way
  • Access: Artificial Intelligence (AI) and machine learning need data to evolve – the more users there are, the more the model evolves and gets smarter
  • Cloud: the underlying infrastructure, without which the virtually unlimited storage and computing required to make this work would not be available
  • Advancements in AI and machine learning

The central entry point for interacting with smart devices at home

“We have a process at Amazon called ‘work backwards’. We started thinking about Amazon Alexa several years ago. The original vision was the Star Trek computer … a computer that very naturally interacts with the user.”

The ‘work backwards’ process “starts with an idea, in this case the Star Trek computer. Then we write a press release. This is an exercise that simulates the launch of a product on Amazon’s retail website. [The press release] gets shared internally to socialise the idea internally. The person or team coming up with the idea, in writing the press release and FAQ gets into the idea in depth and elaborates on it to get buy in, [in order to] quickly start thinking about implementing the idea or vision.”

This same approach gave birth to Amazon Alexa, which is available on Alexa-enabled devices such as the Amazon Echo and Echo Dot. Today, “in our vision voice is everywhere in the house” as the central entry point for interacting with smart devices.

Further advancements

Further advancements such as Echo Spatial Perception (ESP) will allow users to walk through their house while speaking, with the nearest device picking up and responding to the eventual command.

“This is just the beginning. I want to stress that voice is a stupendously complex problem. We are just getting started here … We’re inventing constantly [because] we are very invested in this technology.” 

Getting started, and how to think about it

There are currently two ways for developers and content providers to work with Amazon Alexa:

  • The Alexa Voice Service, a system that allows third parties to place Alexa into their physical and/or virtual products.
  • The Alexa Skills Kit, which is available to developers to make Alexa smarter, to “teach new skills, add new features, new functionalities … think about it as voice-first apps for Alexa”

According to Max, when you interact with machines by voice only “you have to be very deliberate in your design. VUI design is a fairly new field: there are unique opportunities, but also unique challenges. As a designer you have to design in a manner that creates engagement, you don’t want to upset or bore your customers, or to be too lengthy or too repetitive.

“So where do I start? You start simple. Start from a point of view where you have your core functionalities. You have a simple product or interaction that does exactly what it is supposed to do. Natural conversation, if you start designing … use other human beings to try the conversation out to see if it feels natural or robotic or too lengthy … Then utilise APIs and all the features the Alexa ecosystem allows you.

“Concentrate on your core business first, then you can expand the functionality later on and evolve your products to address and satisfy your customers,” said Max. 
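For readers who want a feel for what “starting simple” might look like in practice, here is a minimal sketch of a single-purpose skill handler in Python. It is not taken from the presentation. It assumes the JSON request and response envelope used by Alexa custom skills; the intent name LatestHeadlineIntent, the function names and all the spoken text are hypothetical, invented purely for illustration.

```python
from typing import Any, Dict

# Hypothetical intent name, invented for this illustration.
HEADLINE_INTENT = "LatestHeadlineIntent"


def speak(text: str, end_session: bool = True) -> Dict[str, Any]:
    """Wrap plain text in the JSON response envelope a custom skill returns."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handle_request(event: Dict[str, Any]) -> Dict[str, Any]:
    """Route an incoming request to one core function and nothing more."""
    request = event.get("request", {})
    request_type = request.get("type")

    # The user opened the skill without asking for anything specific.
    if request_type == "LaunchRequest":
        return speak("Welcome. Ask me for the latest headline.", end_session=False)

    # The user asked for the one thing this skill does.
    if request_type == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        if intent_name == HEADLINE_INTENT:
            # Placeholder text; a real skill would fetch this from the publisher.
            return speak("Here is the latest headline.")

    # Anything else gets a short, polite fallback.
    return speak("Sorry, I can't help with that yet.")
```

The handler does one thing and keeps every reply short, which is the spirit of the advice above. Further intents can be added to the routing later, once the core interaction has been tried out on real listeners.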

Think of it in process terms.

“You crawl, you walk [and then] you run.”
