With Apple’s launch of the iPhone 4S last month, human-computer interaction saw what will become its largest shift since the original iPhone brought capacitive touch screens to the masses. In a single decade we have gone from typing on a set of static keys, to interacting with an adaptive touch screen, to now being able to have a conversation with our phone. This transition reflects our desire to interact with technology the way we interact with each other. Touch brought our gadgets out of the realm of plastic devices and turned them into objects of desire with emotional value. We bring our smartphones everywhere, sleep with them in our beds at night, and caress their screens until they give us what we want.
The touch revolution made interactions with technology more intuitive, more responsive, and much more powerful. A single pane of glass with an adaptive user interface lets developers and designers create a custom environment for each situation the user is in, so that content becomes the focus of an application instead of menu placement and UI design. Users no longer have to worry about where the ‘copy and paste’ command lives in the new MS Office ribbon, and can instead focus on making their creations look, read, and feel the way they want. Touch also lets designers craft incredible experiences that replicate real-world instruments, let users navigate to an address simply by tapping a map, or rotate an image by manipulating it as they would an actual photograph. So if touch is so great, why did I say that the introduction of Siri will change the way we interact with technology?
For as long as we’ve had computers, we’ve dreamed of interacting with them as we do with each other: through conversation. One need only look at science fiction classics like Star Wars and, of course, the infamous HAL 9000 from 2001: A Space Odyssey. Conversation, whether spoken, written, or expressed through body language, is how we communicate most effectively. While voice-to-text technology has existed in rudimentary forms since the 1950s, Siri is the first “digital assistant” to hold an actual conversation with you. A conversational interaction is far more powerful than existing speech-command interfaces like those found in Google’s Android or Vlingo’s smartphone software. Instead of issuing a keyword, “navigate to”, followed by a search term, “Starbucks”, the user can now converse with the computer to narrow the results, select the best option, and then act on the information.
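To make that contrast concrete, here is a minimal, purely illustrative sketch in Python. It is not Siri’s (or Android’s) actual implementation; the helpers find_places, speak, and listen are hypothetical stand-ins for a real speech and maps stack. The one-shot command grabs the first match and stops, while the conversational flow keeps context and asks follow-up questions until a single result remains.

```python
# Illustrative sketch only, not Siri's or Android's actual implementation.
# find_places, speak, and listen are hypothetical stand-ins for a real
# speech-recognition and maps stack.

PLACES = ["Starbucks on Main St", "Starbucks on 5th Ave", "Starbucks in the Mall"]
SCRIPTED_REPLIES = iter(["Starbucks", "Main St"])  # pretend user utterances

def find_places(query):
    """Pretend maps lookup: return every place whose name matches the query."""
    return [p for p in PLACES if query.lower() in p.lower()]

def speak(text):
    """Stand-in for text-to-speech."""
    print(f"ASSISTANT: {text}")

def listen(prompt=None):
    """Stand-in for speech recognition; replays a scripted user for the demo."""
    if prompt:
        speak(prompt)
    reply = next(SCRIPTED_REPLIES)
    print(f"USER: {reply}")
    return reply

def one_shot_command(utterance):
    """Keyword interface: 'navigate to Starbucks' takes the first hit and stops."""
    _, _, query = utterance.partition("navigate to ")
    results = find_places(query)
    return results[0] if results else None  # no chance to clarify or refine

def conversational_flow():
    """Dialog interface: keep context, ask follow-ups until one result remains."""
    results = find_places(listen("Where would you like to go?"))
    while len(results) > 1:
        choice = listen(f"I found {len(results)} places. Which one?")
        results = [r for r in results if choice.lower() in r.lower()]
    if results:
        speak(f"Starting navigation to {results[0]}.")
        return results[0]
    speak("Sorry, I couldn't find a match.")
    return None

if __name__ == "__main__":
    print("One-shot result:", one_shot_command("navigate to Starbucks"))
    conversational_flow()
```

The point of the sketch is the loop: because the assistant remembers what it just asked, it can narrow three ambiguous matches down to one instead of guessing, which is exactly what a keyword-plus-search-term interface cannot do.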
With computers, smartphones, and tablets now shipping with far more processing power than the average user will ever need, designers and developers can focus on changing the interaction paradigm and desktop metaphor we’ve grown accustomed to over the last few decades. In a fantastic interview with The Verge, Android designer Matias Duarte talks about designing interfaces that reflect the uses and realities of today’s technology instead of building on archaic metaphors that no longer apply. Touch was the first milestone in that UI revolution, but speech recognition will take us even further. With speech, our technology begins to have a “soul”, a spirit, even an attitude. Siri’s designers understood this, and instead of simply building call-and-response into the engine, they wrote custom responses that give Siri a personality. As synthesized voices become more natural and speech recognition algorithms get better at extracting the meaning behind natural language, our interactions with technology will continue to blur the line between human-human and human-computer interaction.
So where does this leave us? Children born today will never know a world where speaking to a computer labeled you as a geek who didn’t get out enough. Interacting with computers will only become more natural as speech recognition matures and machines get better at reading facial expressions and body language. While Minority Report set what has become the gold standard for futuristic user interfaces, the reality will be far simpler: complex hand-waving and gestural cues will become obsolete as we talk to and interact with technology as we would with our friends. This may seem like a far-off vision, but as Apple demonstrated with its “Knowledge Navigator” tablet concept in 1987, technology that seems as distant as flying cars can arrive in mere decades. Touch helped us connect with our devices; speech recognition will help us build relationships with them, bringing our science fiction dreams ever closer to reality.