We often hear this question: What is a voice user interface? A voice user interface (VUI) is what users interact with when issuing voice commands to a voice assistant such as Siri, Alexa, Cortana or Google Assistant. Not all online content is available to voice search queries, because not all of it has been optimized for VUI; so far, only the top players have been digging deep into VUI. Meanwhile, the number of mainstream voice assistants in use across cultures keeps growing.
Content producers are optimizing for VUI because more and more content is being consumed the same way other searches are performed: via a go-to voice assistant in the home, over Bluetooth connected to a voice assistant device or mobile phone, or through a laptop voice assistant.
A VUI is built from layers that together create a successful voice interaction, each layer drawing on the one beneath it. The top layer is the voice assistant app in the cloud; the middle layer is the platform the device runs on (the operating system holding the user's copy and profile of the app); and the bottom layer is the device itself, which physically captures the user's voice. A working voice interaction has to pass through all three layers successfully.
For any of this to work, Alexa’s voice assistant app must first be compatible with the operating system in the middle layer, and permissions on the device (microphone, location, and so on) must be managed effectively through the app for optimal performance. The user must also have a profile in the cloud before a tailored experience is possible at all: search and playback history, preferences, playlists, enabled “skills”, etc.
How does a typical voice query connect users to content?
Alexa’s “skills” model is just one of the ways a voice assistant can connect the user to content on the Web; others include the “preferences” and “categories” setup on Google Assistant. The most open-ended model, though, is where VUIs begin to become more inclusive. For example, whenever I ask Alexa to “Play the BBC Minute”, I’m asking for the daily briefing by name. Context is key in search queries, however. So if I ask Alexa for BBC News, I might get a BBC World Service podcast I’d previously consumed or saved from the “Global News Podcasts” category within my connected Apple Podcasts skill.
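Under the hood, a custom Alexa skill answers a request with a plain JSON document in the Alexa Skills Kit response format. A minimal sketch in Python (the briefing text and function name here are invented for illustration, not real BBC content):

```python
import json

def build_briefing_response(briefing_text: str) -> dict:
    """Build a minimal Alexa skill response that speaks one piece of content.

    Follows the Alexa Skills Kit JSON response format; the text passed in
    is a placeholder standing in for the skill's actual content.
    """
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "PlainText",
                "text": briefing_text,
            },
            # End the session after playback; a real skill might keep it
            # open to handle follow-up requests.
            "shouldEndSession": True,
        },
    }

response = build_briefing_response("Here is your one-minute news briefing.")
print(json.dumps(response, indent=2))
```

The skill's backend returns this structure, and the assistant's text-to-speech layer renders the `outputSpeech` text aloud on the device.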
Ultimately, the ways voice assistants connect users to content are evolving quietly in the background, but the method used depends on the choices already accessible through the total user history. A single shared household account across multiple devices may produce similar results, but voice profiles within that account can still differentiate among the history data to produce unique search results per user. And since much of voice assistant usage follows a trial-and-error model, the assistant is expected to get some things wrong and course-correct over time.
Each voice assistant will handle some types of requests very differently. Over time, we can expect a common ground to emerge for general search queries and for skills (and all variations of this method) to slowly phase out.
By utilizing the VUI that results from these layered interactions, we now have the possibility of creating “speakable” content that is not only searchable on more than one medium, but can also be combined with structured data to produce markup that is consumable across those mediums. Other format layers will likely emerge, but for now voice is the layer most content-makers and content-marketers are working on. Voice optimization is the opportunity to make that content more available to more consumers within a given target audience.
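"Speakable" here maps to a real schema.org property: `SpeakableSpecification` flags the sections of a page best suited for text-to-speech playback by assistants. A minimal sketch of the JSON-LD you would embed in a page (the title, URL and CSS selectors below are placeholders; they would need to match your actual markup):

```python
import json

# Illustrative schema.org "speakable" structured data for an article page.
speakable_markup = {
    "@context": "https://schema.org",
    "@type": "Article",
    "name": "Example article title",
    "url": "https://example.com/example-article",
    "speakable": {
        "@type": "SpeakableSpecification",
        # Selectors pointing at the voice-friendly parts of the page:
        "cssSelector": ["article h1", "article .summary"],
    },
}

# This would be embedded in the page inside
# <script type="application/ld+json"> ... </script>
print(json.dumps(speakable_markup, indent=2))
```

The idea is to hand the assistant a short, self-contained span of text (headline plus summary) rather than forcing it to guess what to read aloud.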
Why voice is adding to the total search input mix
As early as 2017, nearly half of Americans were already using voice assistants, according to Pew Research. Inexpensive voice assistants are increasingly sold through Amazon, the Apple Store, Walmart and other major wide-focus retailers. Chances are you are already using one yourself, if only through indirect peer pressure. Smart home technologies also drive spoken commands via a voice assistant, and thus voice assistant use.
Voice assistants free up the user to multitask. As the Pew Research study cited above also suggests, if users can get something done with less work, they will tend to opt for that method, whether at work, play, or rest. Employers are likely to encourage time-saving measures like voice assistants for work-related research and communications (many issue work phones to employees and allow them to use Cortana or Google Assistant, for example).
VUI use is not mere laziness, but a matter of personal time and resource management. Users with disabilities also tend to benefit, such as those with dyslexia or vision impairment. And as the technology becomes better known and cheaper to buy, users will buy it to try it out and then justify the purchase by actually using it whenever possible.
This is all part of the 2020s urge to “hack” every aspect of our lives: to optimize, to improve, to squeeze all the value out of every moment of every day. Smart assistants were simply a good technological idea that came to fruition, and they were surprisingly slow to roll out into mass use. Now, however, is the time when content producers must finally deal with the age of voice search and rise to the occasion with voiceable content.
The business argument is simple: consumers are already using voice assistants. Google analytics already show how much traffic is mobile (that is, anything other than a desktop or laptop), and since at least half of voice assistant use happens on mobile devices, existing metrics such as activated CTAs and pageviews either already reflect voice assistant use or soon will.
While viable KPI metrics for VUI are still experimental and pre-customized, the popularity statistics already show the users are out there. Those best placed to capitalize on KPI models for location-based voice assistants will be current players with accessible databases to draw on, whereas website-based virtual assistants already have an emerging KPI model based on the number of user-triggered interactions.
Key voice assistant KPIs:
We can safely assume that location-based voice assistant KPIs will focus on the following:
- Triggered interactions of a particular keyword phrase or known synonym
- Time spent in the interaction or activity triggered
- The trail of other content consumed from the initial content-interaction
- Times the “skill” was enabled on Alexa/similar systems
- Times the content was accessed via a particular identifiable app (Alexa, Google Assistant, Siri, Cortana, etc.)
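Until platform-native analytics mature, the first two KPIs above can be tallied from a simple interaction log. A minimal sketch (the log format, phrases and durations are hypothetical, stand-ins for whatever your analytics pipeline records):

```python
from collections import Counter

# Hypothetical interaction log: (normalized query phrase, seconds engaged)
interaction_log = [
    ("play the daily briefing", 65),
    ("play the daily briefing", 70),
    ("latest tech news", 180),
]

# KPI 1: triggered interactions per keyword phrase (or known synonym,
# assuming phrases were normalized upstream)
triggers = Counter(phrase for phrase, _ in interaction_log)

# KPI 2: total time spent in the activity each phrase triggered
time_spent: dict[str, int] = {}
for phrase, seconds in interaction_log:
    time_spent[phrase] = time_spent.get(phrase, 0) + seconds

print(triggers.most_common(1))  # the most-triggered phrase
print(time_spent)               # engagement time per phrase
```

The remaining KPIs (content trails, skill enablements, per-app access counts) are the same pattern applied to richer log fields.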
How to assess your VUI needs (as part of content strategy)
Ask yourself why you might need a voice layer (VUI) for your content. The following are all good reasons.
- You have *lots* of blog posts, videos, podcasts, PDFs or other textual or media content
- You don’t have lots of content, but plan to
- Competitive intelligence reports show the niche is already adopting voiced content
- Your niche competitors are already doing it (or are signalling that they plan to)
- It just makes sense for what you provide (a niche podcast or thought-leadership blog, say)
- It could extend your reach by giving you accessibility your competitors don’t yet offer
One main way voice interaction with content differs is that voice assistants are likely to be tied into a user profile, which is far more helpful for marketers and companies trying to act as concierge to their visitors. The profile will likely remember a choice for future, more general commands, either from memory or via a go-to voice skill the user has activated, such as the NPR daily briefing skill. So when a user asks for the news, the voice assistant will draw on that skill or another based on previous use, or in many cases on the last non-skill method used.
Speed is another crucial way that voice interaction with content is different. Speaking a query to a voice assistant is near-instantaneous compared to the minutes it can take to type (or text) on a physical or virtual keyboard. Alexa, for example, will suggest something by simply playing it if your search criteria are a bit general or there isn’t yet much to choose from within its database in that genre.
Partnering services play a large role in making your content available for search. Search for music on Alexa and you’ll end up hearing what’s available on Amazon Music or among your purchased Amazon songs; the same dynamic plays out for Apple songs on Siri, and so on down the line. So voice delivers a much more platform-tailored experience, but that’s often not a bad thing: instant gratification can justify less selection if it works for the user. And voice assistants can branch out beyond the branded box as soon as a mediator (such as Alexa “skills”) is used to conduct a search or perform an action.
Challenges to designing a VUI
One of the key challenges in designing a world-class VUI is the content-creation culture at your organization. If your organization has taken pains to develop a content factory able to pump out relevant content, it likely carries an aging, SEO-first outlook on content strategy. That’s not a bad thing; it’s a good starting point for a fresh new voice layer.
Stakeholders must be brought into a formal discussion about the importance of developing a voice-friendly content plan that includes VUI as a major source of navigation and accessibility. As accessibility standards continue to evolve, a VUI increases the likelihood of compliance with accessibility and SEO best practices in one stroke. Ask them what they care about. Make notes. And compare concerns among your stakeholders.
Design guidelines for voice user interface
- Ask stakeholders what channels you want or need your VUI to appear on
- Look at the content potential your brand is willing and able to commit to
- Make a list of what you intend to cover in terms of media formats and individual content streams
- Begin to plan out your content layers and interactions
- Text (it’s good to have your text for all content accessible in some base text format for voicing and search by voice assistants)
- Text spin-offs
- Audio content
- Individual audio files
- Video content
- On-site media playback
- 3-D content
- Moving GIFs
- 3D video content
- Augmented Reality (AR) content
- Virtual Reality (VR) content
The future of VUI for content-makers
Today’s SEO must take account of voice optimization for content going forward. It’s not enough to be found on search engines for text searches; voice searches will make up a large share of all search as we move through the 2020s. From accessibility to SEO and social media implications down the road, VUI is expected to further separate the prepared from the unprepared and to impact marketing and business objectives as user behavior continues to evolve.
Find out how to turn your website into a VUI-optimized one by downloading our comprehensive Ebook on 360UX™ or by checking out our 360UX™ service page. If you already understand the VUI aspect of the complete 360UX™ model (or have read our Ebook), contact us about 360UX™ services to start the conversation for your organization.