Voice interfaces may end up being tech’s next big thing – or at least fill some useful niches, writes David Walker.
Everyone in tech wants to be Captain Kirk. Having made real Star Trek phones, electronic pads and wireless earpieces, the technology world is now busy trying to recreate those scenes where Kirk asks “Computer, get me information on...” Voice is the current Next Big Thing in computer interfaces.
You’ve already used primitive voice interfaces for years if you’ve dealt with automated phone systems. The latest systems, dubbed ‘voice first’ devices, are a huge step up; they use microphone arrays and technologies like natural language processing not just to recognise short phrases but to hear and interpret whole sentences, and then provide responses.
They pop up in smartphones – Siri for iPhones and Google Now for Android. They’re also in standalone household devices like Amazon Echo and Google Home, both available in the US and both expected to be available in Australia within the next year or two. These devices can already set your morning alarm, tell you the weather without fuss, play music from Spotify, set a kitchen timer and dim a wi-fi-controlled lighting system.
The technology is creeping into apps from firms like innovation-savvy Spanish bank Santander, too.
The standard prediction – perpetuated by technology thought leaders like Mary Meeker – is that voice interfaces will soon become so capable, cheap and easy to use that sales and usage will explode. Consultancy VoiceLabs predicts that almost 25 million voice-driven standalone household devices, such as Amazon Echo and Google Home, will ship in 2017. Technology consultancy Gartner predicts that by 2019 a quarter or more of developed-world households will use voice interfaces as the primary connection to home services like lighting and heating.
Apple, Google, Amazon, Microsoft, China’s Baidu, speech-recognition firm Nuance and a host of smaller players are all now racing to make voice interfaces do as much as possible. They’re trying to do it quickly because a lot of people suspect voice interfaces are going to end up dominated by just a couple of players.
How important voice interfaces will become is still unknown, though. You want your assistants to be reliable almost above all else – and right now, talking to your slightly flaky computer assistant can be pretty frustrating. Even after attempts to train the software, I still can’t reliably ring my partner hands-free from my car, let alone use a voice-recognition program to dictate this article. It’s 95% there, but the gap between 95% and 99.5% is vast.
One big problem noted recently by venture capitalist Benedict Evans is that when you give instructions, it helps to know what your options are. A screen does this much better than a talking device.
WHAT DO WE WANT?
Natural language processing!
WHEN DO WE WANT IT?
Sorry, when do we want what?
— Benedict Evans (@BenedictEvans) January 22, 2017
Maybe voice interfaces will fill some interesting niches, like smartwatches did, rather than becoming central to the technology landscape. Or maybe we’ll soon notice that voice interfaces are opening up options we never imagined, like the iPhone did.
The bigger lesson of voice assistants and smartwatches and all sorts of new devices might be this: the world is now making a huge variety of processors, sensors and input and output devices, and we’re in the process of finding the niches where they do useful work. There are always lots of claims and plenty of experiments, and it will be fascinating to see which ones change the world.