The Road to Here

In our fast-paced society we consistently push the limits of technology and human-computer interaction. The pace only continues to quicken in the mad rush of innovation. Today it is likely safe to assume that your company’s employees and customers expect the same.

First came the days when you needed a website to be current. It wasn’t long before static websites moved to dynamic content and web apps started to mature. Then we gradually transitioned to the Golden Age of the App. If you had an app your IT staff could check that block.

With the app ecosystem starting to become saturated, you need more innovation and personalization to differentiate yourself and to deliver the ease of use that is demanded of top-notch apps. Enter Speech Recognition.

Speech recognition’s future goes back quite a way too.

Hollywood has long used Speech Recognition to thrill and excite us in memorable scenes.

Speech Recognition has actually been around for quite some time, but it was quite limited in scope. The proliferation of mobile phones and the maturation of Speech Recognition software and neural networks have made this a completely different ball game. There is speculation that 2017 is the year of Voice Recognition. The error rate has dropped from 43% in 1995 to only 6.3% this year and is now on par with humans.

[Figure: speech recognition error rates on the Switchboard benchmark and the DNN breakthrough]

Source: Benchmarks: Comparison of different architectures on TIMIT and large vocabulary tasks

Voice Search: Usage Increasing Quickly

  

Ways to Interact With Voice

There are a handful of different ways that you can use voice interactions to build your user experience. Which methods you choose depends largely on your existing assets and infrastructure, and on what you want to accomplish.

Voice Assistants: Siri, Google Now, Cortana

Alexa / Google Home

Web-Based Voice Recognition

Voice Assistants: Siri, Google Now, Cortana

The Voice Assistants of yesteryear have grown up, and a late addition has joined the party. They provide some cool and genuinely useful tools and integrations, but their use doesn’t stop there. Siri and Google’s assistants have opened up their platforms a bit, and Cortana is getting ready to. There are a lot of good options for integrating with these assistants.

Siri

SiriKit enables your iOS 10 apps to work with Siri, so users can get things done with your content and services using just their voice. Currently it offers interactions only with the following “intents” or capabilities:

You can find more information in Apple’s SiriKit Programming Guide.

A pretty safe bet is that Apple is in the process of opening up custom actions, largely in response to market demands.

OK Google / Google Now / Google Assistant

Google Voice Actions come in two flavors: System Actions and Custom Actions.

System Actions include the following intents that you can integrate with:

There are a lot of things that Google Voice Actions already recognize. This website is a great way to discover what’s possible.

You can define Custom Actions to support additional use cases.

Currently, custom actions are only available on Google Home and Pixel. Other devices will follow soon.

Cortana

From basic mobile deep links to full integration of your bots and services, the skills kit provides all the tools and docs you need to promote your services and engage users through the Cortana experience.
Once created, your skill works wherever your code runs. By registering your bots, services, mobile apps, and websites as Cortana skills, over 145 million active monthly users will be connected to these capabilities.
People can interact with your skills in various ways. Cortana can offer a skill based on a natural language request during a conversation, or proactively present a skill based on a user’s preferences and context.

Look for the Cortana Skills Kit preview in early 2017.

The Cortana Skills Kit will allow developers to:

Cortana has apps on both iOS and Android.

Alexa / Google Home

The New Kids on the Block

Google Home and Amazon Echo (Alexa) are one more outlet for interacting digitally with your customers. Furthermore, they extend your digital brand outside of the app, still enhancing and simplifying your customers’ lives while connecting with them through digital means.

The Echo and Home are more than just speakers – they are built to help users at home, the location where the shopping experience begins. Both Alexa and Home can integrate with backend services allowing you to extend your brand. Although the market is still young, integrating with these devices can prove to be very beneficial.
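To make the backend integration mentioned above a little more concrete, here is a minimal sketch of a handler for an Alexa custom skill. It assumes the JSON request and response envelope used by custom skills and models only the few fields it needs. The intent name `StoreHoursIntent`, the `City` slot, and the `lookupStoreHours` helper are hypothetical, invented purely for illustration.

```typescript
// Minimal slice of the JSON an Alexa custom skill receives and returns.
// Only the fields used below are modelled; real requests carry more.
interface SkillRequest {
  request: {
    type: string; // e.g. "LaunchRequest" or "IntentRequest"
    intent?: { name: string; slots?: Record<string, { value?: string }> };
  };
}

interface SkillResponse {
  version: string;
  response: {
    outputSpeech: { type: "PlainText"; text: string };
    shouldEndSession: boolean;
  };
}

// Wrap plain text in the response envelope.
function say(text: string, shouldEndSession = true): SkillResponse {
  return {
    version: "1.0",
    response: { outputSpeech: { type: "PlainText", text }, shouldEndSession },
  };
}

// Hypothetical backend lookup; a real skill would call your own service here.
function lookupStoreHours(city: string): string {
  return `The ${city} store is open from 9 to 5.`;
}

// Route the incoming request to a spoken reply.
function handleSkillRequest(event: SkillRequest): SkillResponse {
  const { type, intent } = event.request;
  if (type === "LaunchRequest") {
    // Keep the session open so the user can answer the question.
    return say("Welcome. Which store would you like hours for?", false);
  }
  if (type === "IntentRequest" && intent?.name === "StoreHoursIntent") {
    const city = intent.slots?.City?.value;
    return city
      ? say(lookupStoreHours(city))
      : say("Which city is the store in?", false);
  }
  return say("Sorry, I can't help with that yet.");
}
```

The handler is a plain function, so the same logic could sit behind whatever hosting you already use; only the envelope is specific to the voice platform.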

Pros

Cons

Alexa Voice Services (Amazon Echo)

Alexa Voice Services: Under the hood

Google Home

Google Home is a Wi-Fi speaker that also works as a smart home control center and an assistant for the whole family. You can use it to play back entertainment throughout your entire house, effortlessly manage everyday tasks, and ask Google what you want to know.

In-App Speech Recognition

Bring Your Own Voice (BYOV)
There are a variety of voice interaction points between the user and the app. Triggering voice interactions from within the app offers a unique method to engage your users.

Pros

Cons

iOS
Here is Apple’s library to enable Speech Recognition

Android
Here is Android’s library to enable Speech Recognition

Web-Based Voice Recognition

Circling back around to where we began, we can’t leave web-based voice recognition out of the equation. If you are using Chrome or Firefox you may have noticed that this page supports Speech Recognition. This capability comes from the Web Speech API. Of particular note, it also handles Speech Synthesis.
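As a rough illustration of the Web Speech API, the sketch below listens for a single utterance and hands back the top transcript. The helper names `findRecognitionCtor` and `listenOnce` and the local interface names are assumptions made for this example, not part of the API. The interfaces model only the members used here, and the constructor is passed in so the same code can be exercised with a stand-in outside a browser.

```typescript
// Minimal structural types for the parts of the Web Speech API used here.
// They are declared locally so the sketch compiles without browser typings.
interface RecognitionAlternative {
  transcript: string;
  confidence: number;
}
interface RecognitionResultEvent {
  results: ArrayLike<ArrayLike<RecognitionAlternative>>;
}
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
  onresult: ((event: RecognitionResultEvent) => void) | null;
  onerror: ((event: { error: string }) => void) | null;
  start(): void;
}
type RecognitionCtor = new () => RecognitionLike;

// Look up the browser's constructor; Chrome exposes it with a webkit prefix.
// Returns undefined when the API is not available.
function findRecognitionCtor(scope: any = globalThis): RecognitionCtor | undefined {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
}

// Listen for one utterance and report the top transcript of the first result.
function listenOnce(
  Ctor: RecognitionCtor,
  onText: (transcript: string) => void,
  onError: (reason: string) => void,
  lang = "en-US",
): void {
  const recognition = new Ctor();
  recognition.lang = lang;
  recognition.interimResults = false; // only report final results
  recognition.onresult = (event) => onText(event.results[0][0].transcript);
  recognition.onerror = (event) => onError(event.error);
  recognition.start();
}
```

In a browser you might call `findRecognitionCtor()` and, if it returns a constructor, pass it to `listenOnce` from a button click handler. Expect the browser to ask the user for microphone permission first.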

This has been possible for several years now but it hasn’t been put to much good use. Web-based voice recognition shares a lot of similarity with in-app voice recognition in that you have to handle everything yourself.
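“Handling everything yourself” mostly means turning a raw transcript into something your app can act on. The sketch below shows one deliberately naive way to do that with keyword matching. The intent names and phrases are hypothetical, and a production app would likely want something more robust than substring checks.

```typescript
// An intent pairs an action name with phrases that should trigger it.
type Intent = { name: string; keywords: string[] };

// Hypothetical intents for a shopping app; names and phrases are illustrative only.
const intents: Intent[] = [
  { name: "search_products", keywords: ["find", "search", "look for"] },
  { name: "show_cart", keywords: ["cart", "basket"] },
];

// Return the name of the first intent with a keyword contained in the transcript,
// or undefined when nothing matches.
function matchIntent(transcript: string, candidates: Intent[]): string | undefined {
  const text = transcript.toLowerCase();
  const hit = candidates.find((intent) =>
    intent.keywords.some((keyword) => text.includes(keyword)),
  );
  return hit?.name;
}
```

When `matchIntent` returns `undefined`, the app still has to decide what to say next, which is where the voice design advice in the next section comes in.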

Voice User Interface (VUI)

A corpus of research has shown that people infer personality traits from even the briefest voice interactions. Voice is a form of Human-Computer Interaction (HCI) that does exactly what the name implies: it humanizes the interaction. Because of this, it is important to take special care in how you communicate with the user.

Although much good advice for Graphical User Interfaces (GUIs) may apply, don’t try to simply convert your GUI into a VUI. There’s a lot more to think about.

Here are some tips for designing conversations, from Google’s guidance on Google Assistant (video):

Create a persona: The “face” of the company.

Think outside the box

Context matters

In Conversation there are no Errors

Think bigger

Communication is Key

If you can communicate well, you will engage and even entertain. But it is not all clear sailing from here, because a lot is going on when you deal with voice interactions.

Additional References for Voice Design

Voice Design

Case Studies
