Practical Artificial Intelligence
There’s an old saying in computer science circles that when we have no idea how to make a piece of software do something smart, we call it “Artificial Intelligence”, but once it’s solved, we look back with 20-20 hindsight and say it was “Software Engineering”. A computer becoming the world chess champion is the quintessential example of this. Chess was once considered a holy grail of AI, but by the time Deep Blue actually dethroned Kasparov, the computing world yawned: “Oh, it was just brute-force computing power; nothing truly intelligent is really happening”.
Beating the world champions at Jeopardy was slightly more interesting because we acknowledge the vast range of knowledge and language understanding involved. But ultimately, since Jeopardy is just a game, we are left with the feeling, “So what? How does this affect my life one way or another?” Enter Siri, the voice recognition system integrated into the new iPhone 4S.
When I heard about the feature and saw what it claimed to do, I was intrigued, but figured that the claims of how well it worked were exaggerated. After all, it’s easy to program for requests that you can anticipate, but much harder to cope with the ones you can’t, let alone individual speakers’ accents and colloquialisms. Practically speaking (I figured) it would be no more useful than the standalone Siri app you could download, or the Google voice recognition system. But after experimenting with it for a few minutes, I was blown away. Not only is the syntactic recognition far more accurate than anything I’ve ever seen, but Siri also exhibits an understanding of semantics and pragmatics like no other system I’ve seen, except perhaps Google’s search engine itself.*
Sure, it was fun asking “Who’s your daddy?” and getting a different reply based on the gender of the speaker, or asking Siri what (s)he’s wearing and getting an appropriately evasive answer. But my real aha moment came when I got an email from a friend suggesting we chat sometime next week. Normally this kind of thing would clutter my inbox until either I created an appointment in my calendar, or next Friday rolled around and I realized I hadn’t responded in a socially appropriate timeframe. I was curious how Siri would handle the request, “Remind me to call Mark on Wednesday”. Not only did it repeat back my request verbatim, but it created a reminder for 9am on Wednesday using the new Reminders app bundled with the iPhone, which I hadn’t known about. Most importantly, it gave me a very real peace of mind and reduction of anxiety that I haven’t been able to get from any of the dozens of productivity apps or systems I’ve tried, like Evernote and OmniFocus.
This Quora article outlines the significance of Siri’s great leap forward for AI, pointing out that voice represents the “4th interface” for human-computer interaction (the first three being keyboard, mouse and gestures). The real key, though, is that Siri actually does things of practical value for a mass audience. In other words, Siri helps you get things done that you want done. This is new. And huge. And just the tip of the iceberg.
The article ends with a suggestion that we should start thinking about new uses for the technology, because eventually there will be an API that allows app developers to make use of the power of voice interaction. This is inevitable, even if Apple and Siri are not the ones to do it. The AI genie is coming out of the bottle.
So my question to you is: what’s the next killer app that utilizes voice interaction? What would you personally like to see?
* Syntax refers to the form of a sentence, i.e. what words are being used and whether they follow the rules of grammar. Semantics refers to the literal meaning of the sentence (e.g. “Does anyone know what time it is?”), while pragmatics refers to the underlying intent of the speaker (e.g. that the speaker wants you to tell them what time it is, not whether there exists someone who knows the time).