3 Things to Consider When Designing Your First Voice App
“Ok Google, give me an intro about designing for voice.”
Darn. But in spite of Google Assistant's inability to help me write this blog post, modern voice assistants are actually pretty neat. And with huge investment from Amazon and Google, they are here to stay. Over the past year, Amazon has aggressively expanded its Alexa services into new products like the Echo Show, and Google has pushed even further into bigger-screen assistants with its Nest Hub Max while continuing to develop the Assistant in Android.
Voice is the closest thing we have so far to truly embracing the “Star Trek” future. Right now on the “Star Trek Future Scale” (STFS, obviously) we are still in our infancy, but here are some things to consider as you design and develop the voice apps that will help us grow the medium into something powerful and useful for decades to come.
1. Alexa, Google, and Siri… Oh my!
The different flavors of assistants all bring unique things to the table. Before attempting to design new experiences, we recommend taking some time to understand the opportunities and limitations of each platform. Focusing on the Alexa and Google experiences is a smart start: Siri was a non-starter on several of our projects, and both Amazon and Google publish excellent design documentation to start from.
Using these guidelines as a baseline, much as you might use Apple's Human Interface Guidelines or Material Design for screen UI, helps ensure your teams design experiences that feel natural alongside the larger suite of services these assistants offer.
With a firm understanding of the tech behind the assistant, we can move into some actual design.
2. Do your (user) homework
Voice is a new and powerful user interface. It can deliver an immense amount of information without anyone having to worry about navigational hierarchy or where the user may or may not be looking. We just have to predict their intent.
However, predicting WHERE they will need it is about as important as predicting WHAT they want. Most people don't want to shout into their phones in line at the grocery store.
Time spent understanding your users' habits and journeys pays off: when you offer a voice experience at those high-value moments, it will feel precise and appropriate. The Alexa on the dresser that you can ask about the weather while picking out your clothes for the day fits perfectly into a user's routine and makes the hands-free nature truly valuable. The home hub in the kitchen that can set a timer while a person's hands are busy chopping, stirring and cooking makes for a seamless, Star Trek-esque experience.
[Figure: Example of an overcomplicated, bad flow]
3. No Fat Diet
Voice is one of the fastest and most semantic means to information that technology offers. The goal is to create a framework that provides the experience of: What you ask for, you receive. Lots of voice experiences with great concepts fail because they don't edit nearly enough. This ideal input/output experience is only possible if your voice service can get out of its own way.
With voice, do your best to avoid having steps in any process. Conversational interfaces can be interesting, but they can also feel a lot like a robophone support system, and those don't feel like Star Trek at all. They are bad.
Lengthy responses are also a big no-no. If the assistant rambles on and on when a person asks a simple question, you can expect it to get the "Alexa, shut up" real quick. Navigational commands should be short, sweet and direct. Informative responses (the weather, information about a product, confirmation of an order) are best when they are short and offer a way to expand if the user wants it ("Find out more in your ___ app," or "ask for more information," things like that).
Don't answer a question with a question. While the question-with-a-question model has been a very powerful way to build critical thinking in students, your voice assistant should avoid it at all costs. In their current state, voice assistants still very much feel like computers, and users still treat them as such. Input -> output is a much more important experience than input -> input -> input -> output. Even if the end result is somehow "better," that experience is not.
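To make the "short answer, optional way to hear more, no question back" idea a little more concrete, here is a rough Python sketch. It is not tied to the Alexa or Google SDKs: VoiceResponse, build_response, expects_reply and the 25-word MAX_WORDS budget are all made-up names and numbers for illustration, and the right budget is something you would tune by listening on a real device.

```python
import re
from dataclasses import dataclass

MAX_WORDS = 25  # assumed word budget for illustration, not a platform limit


@dataclass
class VoiceResponse:
    speech: str
    expects_reply: bool = False  # one-shot: the assistant asks nothing back


def build_response(answer: str, more_hint: str = "") -> VoiceResponse:
    """Keep the leading sentences that fit the budget, then add an optional hint."""
    sentences = re.split(r"(?<=[.!?])\s+", answer.strip())
    kept = [sentences[0]]  # always keep the first sentence, even if it is long
    count = len(sentences[0].split())
    for sentence in sentences[1:]:
        words = len(sentence.split())
        if count + words > MAX_WORDS:
            break  # drop the rest; the hint tells the user how to get it
        kept.append(sentence)
        count += words
    speech = " ".join(kept + ([more_hint] if more_hint else []))
    if speech.rstrip().endswith("?"):
        raise ValueError("Don't answer a question with a question")
    return VoiceResponse(speech=speech)
```

Calling build_response with a long weather report and the hint "Ask for the forecast to hear more." would speak only the opening sentence plus the hint, and the call fails loudly if the final wording ends in a question.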
[Figure: That good flow]
Prototype with the tech. We learned this one the hard way: the sooner you can get the experience on a device, the better. Map out your critical path and start testing it with a real Home Hub or Alexa device. You'll be able to iterate on your flow much sooner, learn which words the device struggles with, and get a real feel for your voice experience as your users will use it, rather than refining it in a doc until it seems perfect.
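One low-tech way to track which words the device struggles with is to jot down what you said and what the device heard during each test run, then tally the misses. The sketch below assumes you collect those pairs by hand; misheard_words and the sample phrases are hypothetical, not part of any assistant toolkit.

```python
from collections import Counter


def misheard_words(trials):
    """Count words that were spoken but missing from what the device heard.

    trials: iterable of (said, heard) pairs noted by hand during device testing.
    """
    misses = Counter()
    for said, heard in trials:
        heard_words = set(heard.lower().split())
        for word in said.lower().split():
            if word not in heard_words:
                misses[word] += 1
    return misses.most_common()  # most frequently missed words first


# Hypothetical notes from a test session
trials = [
    ("order my usual latte", "order my usual lot"),
    ("reorder latte", "reorder latte"),
    ("order a latte", "order a lot day"),
]
print(misheard_words(trials))
```

Words that keep showing up at the top of that list are good candidates to swap out of your invocation phrases and prompts.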
Test early and often with the voice devices. Trust us, you’ll be much happier with your final product.