A lot of folks are skeptical about the future of speech as a computer interface, and they even try to think of what will replace the good ol’ mouse. And although I agree with them in certain cases (excessive chatter in the workplace being the biggest bummer), I believe they are missing the point. First of all, a natural-language interface is a broader concept and doesn’t have to involve speaking the commands aloud. Second, your computer system (it could be something as small as your mobile phone) doesn’t have to listen for your commands all the time: you push a button and then speak your command, or you receive a call and then say “busy” or “hello”. Your Smart House doesn’t have to listen to you all the time in every single corner of your mansion: you push the button on the remote and speak into the tiny microphone embedded in it. Your car doesn’t have to listen to you all the time: just push the button on the steering wheel and say “directions”. All of that has been done already, and it’s working.

On the other hand, how do you know when to listen? You detect some motion towards you, someone calls your name or turns their head in your direction, you hear the pitch and/or amplitude of their voice rise, etc. These are all the “buttons” that other people push to attract our attention before they start speaking to us. Sometimes they fail. I am sure all of you have had the experience of someone talking to you without your realizing it right away, and by the time you do it’s too late and you have to ask them: “Are you talking to me? Could you repeat that, please?” So, my point is that pretty soon computers will get just a little bit better at reading those cues, and speech recognition will become ubiquitous. These things don’t have to be complicated to be useful.
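
To make the "not complicated" part concrete, here is a rough sketch of just one of those cues, the jump in amplitude. It is only an illustration under my own assumptions: audio arrives as non-empty frames of float samples, and the names `rms` and `find_attention_cue`, the `ratio` threshold and the `warmup` count are all made up for this example, not taken from any real speech product.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one non-empty frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def find_attention_cue(frames, ratio=2.0, warmup=3):
    """Return the index of the first frame that is much louder than before.

    A frame counts as a cue when its RMS level exceeds `ratio` times the
    running average of all earlier frames. The first `warmup` frames are
    only used to learn the background level. Returns None if no frame
    qualifies. Both parameters are hypothetical tuning knobs.
    """
    baseline = 0.0
    for i, frame in enumerate(frames):
        level = rms(frame)
        if i >= warmup and baseline > 0 and level > ratio * baseline:
            return i
        # Fold this frame into the running mean of levels seen so far.
        baseline += (level - baseline) / (i + 1)
    return None


if __name__ == "__main__":
    quiet = [[0.1, -0.1]] * 5
    loud = [[0.5, -0.5]]
    print(find_attention_cue(quiet + loud))
```

A real system would combine several cues (name spotting, head direction, pitch) rather than trust loudness alone, but even this much is enough to stand in for one "button".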

November 29, 2005

