Voice commands and voice recognition for smart phones
Dom | On 08, May 2012
Talking to your mobile is a bit weird, right? No no… not “into”, that’s what phones are for – remember – talking to each other?! I’m talking about voice recognition and voice activated commands for smart phones.
In the past few years we’ve seen loads of these products coming to market, much to the delight of profiting companies like Nuance who dominate the voice recognition world. Though actually, they’re all free services so I have no idea how they’re making any money! Anyway… a brief history: Google came along with their voice recognition powered search, then Samsung teamed up with Vlingo, and Apple fought back with “Siri”, which was quite popular because it talked back and made people think they had a robotic friend. Now Samsung have updated themselves to “S-Voice” to go hand-in-hand with their latest device: the much-anticipated Galaxy S 3. But how useful is this technology? Is it just a gimmick, and if not, then why don’t we see people using it more often?
Until recently, voice recognition on smartphones has been… well… abysmal. It hasn’t really got things right. OK, sure. We’ll let it have some mistakes. Heavens, I mishear people all the time, but until it’s up to par no-one’s going to be interested. At the end of last year I tried to make a video demonstrating the voice recognition of the Galaxy Nexus with my friend Kin-Hing. It went something like this:
Kin-Hing: “Hi Dom, How are you enjoying your Galaxy Nexus?”
Google Voice recognition: “Hi Dom, How are you enjoying your penis”
Me: *fits of laughter*
This is ridiculous. I thought that these systems were supposed to be “context aware” so I can only assume that Google used the shadier side of the Internet to gather data for its setup. Embarrassing much?
How can we take something seriously and use it on a day to day basis when it doesn’t even work properly!? Sure, it’s a great idea to be able to dictate to your phone and have it write and send a message for you, but what’s the point if it gets the words wrong? Well, *these days* it is getting better and better. With the likes of Google harvesting data from its millions of users and getting us to self-correct false voice-to-text interpretations, voice recognition is coming on in leaps and bounds. Google now even offers continuous voice-to-text as a “keyboard input” option. This means that a word is no sooner out of your mouth than written on the device in your hands… this is impressive, scarily impressive. Within the next couple of years voice recognition will be so good I wouldn’t be surprised if it could finish my sentences for me!
What’s still getting in the way, however, is the whole notion of talking to your phone rather than through it. It seems to be somewhat taboo in society, even though we’ve had a couple of years’ exposure to it. Voice commands on smart phones are something you show to your friends at the pub but never really use in daily life. It feels strange, talking to a rectangular pocket sized robot – especially in public. That being said, 10 years ago most people would have felt strange talking on their mobiles in a public space where they could be overheard.
Today things are the complete opposite. People walk around the streets talking on their mobile phones for everyone to see, and hear. It’s like they’re putting themselves on display! There’s something called the “Lombard effect” which results in pretty much everyone talking louder on their mobile phone. Even louder than if they were talking to someone sat next to them. Because of this I think there is still a remnant of negativity toward “those people who talk loudly on their phones in town and on the train”, as my mother calls them. But my mum is just as bad – she just doesn’t realise! So if we’re all doing it, it’s socially acceptable, and that yuppie-spurred ‘taboo’ is now the norm. People will talk about all sorts of personal things in public… so how long until it’s OK to talk to your phone?
“Computer, are we there yet?”
If science fiction predicts anything, talking to phones and computers will be commonplace by the time we are flying round the Galaxy in enormous spacecraft. Given that the space programme for this world isn’t going to reach that far for some time, I’d say “somewhere in the next 5 years”. Why? Because we’re at a stage where it is efficient, accurate and good enough to make things easier for us, rather than more complicated.
Apple’s Siri, Samsung’s S-Voice and Google’s whatever-their-voice-service-is-called all function extremely well, though in my opinion S-Voice has set the mark for what we expect in the future of voice controlled devices.
Rather than being limited to a rigid structure of commands – which we see in other voice-command systems – S-Voice takes much of its interpretation power from the Wolfram Alpha search engine. Wolfram Alpha has an extremely clever way of working out what exactly you are asking it.
I can type into my browser things like “what was the weather like last Tuesday” and it will immediately present me with intelligently sourced information about the weather conditions in London (my current location) on the 1st of May (last Tuesday). What’s really cool is that if I ask my phone this question, it will pull down results directly from sites like Wolfram Alpha and display them on my Galaxy S 3 without me even having to open a browser. I can take this one step further by asking it things like “what is the integral of x cubed” and it will automatically plot me a graph and write out the indefinite integral as a formula, both of which I can then copy to elsewhere. Come on … that’s useful stuff. Why is it useful? Well if I wanted to find that information out on my computer, I would have to:
Step 1: Turn on the computer
Step 2: Open a browser
Step 3: Navigate to Wolfram Alpha
Step 4-34: Type in each character of “what is the integral of x cubed”
Step 35: Press Enter and see results
On my Galaxy S 3:
Step 1: Turn on phone
Step 2: Double press the home key
Step 3: Talk at it naturally
Step 4: See results
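If you fancy checking my maths, the two lists above can be tallied with a rough sketch in Python. It assumes one step per keystroke (spaces included) when typing the query; the function names `computer_steps` and `phone_steps` are just made up for illustration.

```python
# Tally the steps from the two lists above.
# Assumption: typing costs one step per character, spaces included.
QUERY = "what is the integral of x cubed"


def computer_steps(query: str) -> int:
    """Steps on a computer: fixed actions plus one keystroke per character."""
    # Fixed actions: turn on, open browser, navigate to the site, press Enter.
    fixed_steps = 4
    return fixed_steps + len(query)


def phone_steps() -> int:
    """Steps on the phone: turn on, double press home, talk, see results."""
    return 4


if __name__ == "__main__":
    print(f"Characters to type: {len(QUERY)}")
    print(f"Computer: {computer_steps(QUERY)} steps")
    print(f"Phone:    {phone_steps()} steps")
```

The point is that the computer count grows with every extra character in the question, while the phone count stays the same no matter how long-winded I am.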
Well, I know which solution I prefer, but I still feel a bit strange about asking my phone in public what the weather’s going to be like tomorrow.
How about you? How long do you think it will be before talking to our phones becomes a natural thing, or do you do it already?