22 04 2013
Mobile personal assistants – Rapid expansion of options and features revealed at April conference
I was a co-organizer of the Mobile Voice Conference, held April 15-16 in San Francisco. As the conference title suggests, the talks focused on commercial developments in which voice technology, such as speech recognition, is used with mobile devices. The combination allows more intuitive and constant connection to computers, a major trend analyzed in The Software Society. In particular, personal assistants, with Apple’s Siri being the best-known example, are a key part of this trend. Talks at the conference showed the speed with which this trend is evolving.
In a panel at the conference, Adam Cheyer, co-founder of the company Siri (and a director of engineering in the iPhone group after Apple bought Siri), pointed out that Siri did much more than most users were aware of. He showed how Siri remembered context, so, for example, one could ask for French restaurants in Palo Alto, and then say “Make a reservation,” without repeating search criteria. He also demonstrated that Siri could engage the user in conversation to clarify an ambiguous request or to offer options (“When is the reservation for?”). Finally, he noted that there is a small information icon near the Siri button that, when touched, gives examples of what Siri can understand and act upon, e.g., “What is Apple’s stock price?” (Almost no one in the audience of technologists was even aware of this info button.)
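For readers who want a concrete picture of what “remembering context” can mean, here is a minimal Python sketch. It is not how Siri works internally (nothing at the conference described that); it is only an illustration under simple assumptions. The names DialogContext, handle_turn, the intent labels, and the slot names are all hypothetical, and the sketch assumes speech has already been turned into an intent plus a few slots.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DialogContext:
    """State carried between turns (hypothetical structure, for illustration)."""
    slots: dict = field(default_factory=dict)
    pending_intent: Optional[str] = None  # intent waiting on a clarification


# Slots the hypothetical "reserve" intent needs before it can act.
REQUIRED_FOR_RESERVATION = ("cuisine", "city", "time")

# Question to ask when a given slot is missing.
CLARIFYING_QUESTIONS = {
    "cuisine": "What kind of food?",
    "city": "Which city?",
    "time": "When is the reservation for?",
}


def handle_turn(context, intent, new_slots=None):
    """Merge this turn's slots into the remembered context, then act or ask."""
    context.slots.update(new_slots or {})

    # A bare answer such as "7 pm" carries no intent of its own,
    # so resume whatever intent was waiting for it.
    intent = intent or context.pending_intent
    context.pending_intent = None

    if intent == "search":
        cuisine = context.slots.get("cuisine", "any")
        city = context.slots.get("city", "your area")
        return f"Searching for {cuisine} restaurants in {city}."

    if intent == "reserve":
        for name in REQUIRED_FOR_RESERVATION:
            if name not in context.slots:
                # Remember what we were doing, then ask for the missing piece.
                context.pending_intent = "reserve"
                return CLARIFYING_QUESTIONS[name]
        return "Reserving a {cuisine} restaurant in {city} at {time}.".format(
            **context.slots
        )

    return "Sorry, I did not understand that."


if __name__ == "__main__":
    ctx = DialogContext()
    print(handle_turn(ctx, "search", {"cuisine": "French", "city": "Palo Alto"}))
    print(handle_turn(ctx, "reserve"))            # asks for the missing time
    print(handle_turn(ctx, None, {"time": "7 pm"}))  # reuses the earlier search criteria
```

The point of the sketch is that the second and third turns never repeat “French” or “Palo Alto”; those values survive in the context object, and the assistant asks only for what is still missing.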
Other available General Personal Assistants (GPAs) were discussed at the conference. Tareq Ismail, User Experience Lead, Maluuba, discussed the company’s voice-enabled personal assistant for Android. Andy Peart, Chief Marketing Officer, Artificial Solutions, discussed the company’s GPA as a “ubiquitous personal assistant” available across different devices, a term and trend discussed in The Software Society. Gregory Pal, vice president of marketing, strategy and business development, Enterprise Division, Nuance Communications, discussed a similar topic: “The Next Big Thing: Virtual Assistants In the App and Across Devices.”
Glen Shires, Software Engineer and Speech Specialist, Google, discussed how the Google Web Speech Application Programming Interface for the Chrome browser allows interactive speech features to be added to a web site. During audience questions, this led to speculation that Google would add a GPA feature to Chrome. Adding speech to a web site has the advantage that the voice interaction is available on any device with a browser (avoiding the need to download an app).
Outside the conference, it was revealed that Amazon had bought a company, Evi, with a GPA. And it was announced that Sherpa, a GPA popular in Spain, was coming to the US, with interaction in both English and Spanish.
Specialized Personal Assistants (SPAs) were a major topic at the conference, with much of the focus on company-specific personal assistants that provided services associated with calls to customer service lines. Stuart R. Patterson, CEO, Xtone, said, “App-specific UIs, enhanced with the power of voice, will work even better than Siri.” Xtone adapts standard mobile apps of any sort into a voice-interactive SPA. Vishal Dhawan, Chief Technology Officer, Xtone, spoke on “Enterprise Mobile Voice Assistants.”
Mark Hanson, Senior Strategist, Strategy Design & Innovation, Nuance Communications, spoke on “Driving the Customer Experience with Next-Generation, Speech-Enabled Mobile Apps.” Nuance provides Nina, which might be thought of as a template for a customer service personal assistant that can be adapted to a specific company. USAA is one company that has done so with a banking personal assistant, and Nuance said that it would not be surprised to have permission to announce twenty more large customers by the next Mobile Voice Conference.
Dmitry Sityaev, Senior Speech Scientist, Angel, spoke on “Deploying Specialized Personal Assistants.” He demonstrated how easily mobile applications can be created and updated using Lexee, and underlined some challenges encountered when deploying personal assistants. Dave Rich, CEO, LumenVox, a provider of customer service solutions, spoke on “The Personal Assistant of Tomorrow.” He said that we are “finally seeing the results of new applications and actionable meaning that is being derived from the escalating volume of mobile data.” Farzad Ehsani, CEO, Fluential, described the “Virtual Travel Assistant” the company had developed.
Bachir Halimi, President, Speech Mobility, discussed how his “Speech Mobility Assistants” could convert a standard phone system into a personal assistant for callers and employees. Halimi noted, “Specialized (business-specific) and Personalized Virtual Assistants will sell better than general ones that do ‘Everything for Everybody’ simply because assistants do a better job when the need, context, and user profile are known ahead of time.”
Samrat Baul, Director, VUI Design and Analytics, 7, discussed another SPA solution that doesn’t require downloading an app. The company’s solution can convert a standard customer service call into one that uses a mobile phone’s browser in combination with the voice call to create a visual interface along with the voice interface. Valentine C. Matula, Senior Director, Multimedia Technologies, Avaya Labs, noted that the standards HTML5 and WebRTC would enable automated and agent-based customer service directly from within web and mobile thin clients into contact centers. Nagesh Kharidi, Technical Director, Openstream, discussed building “context-aware multimodal mobile apps” using W3C open standards.
The Software Society claims (and I said in a talk at the conference) that every company will eventually need a SPA as much as it needs a web site today. Shai Berger, CEO, Fonolo, spoke on a similar theme: “How Mobile Apps Are Saving the Contact Center.”
Peter Voss, CEO, Smart Action, referred to SPAs as “siblings” of Siri, and discussed the company’s “Smart Call Agents,” a natural language call automation solution. Voss also noted the potential for “local” SPAs that could, for example, help one locate merchandise in a store and provide information on specific products as the customer viewed them.
Bradford Starkie, Chief Scientist, Gazunti, discussed the company’s “knowledge navigator framework,” an “open framework for Mobile Voice” using speech, text, smart phones, or browsers. Starkie said Gazunti can answer questions like “Do you have a store in Santa Cruz?” or “What are my rights as a creditor when a company is in external administration?”
Amy Neustein, CEO and Founder, Linguistic Technology Systems, described a methodology for making a personal assistant aware of the emotional content of material it uses to provide answers, speaking on “The ‘In Touch’ Personal Assistant: Next Generation Emotionally Intelligent Mobile Devices.”
Jeff Rogers, vice president and co-founder, Sensory, spoke on the company’s technology that allows “waking up” a personal assistant (or other application) with a spoken phrase without having to push or touch a button. He highlighted the capability by wearing Google Glass (which has this feature) on a conference panel, an example of wearable computing making assistant functionality even more constantly available. Other wearable computing alternatives are watch-like devices rumored to be under development by a number of companies. Rogers said, “A voice trigger combined with a small set of commands that all run on the local device are key to success of speech in mobile products as seen in many of the Samsung phones/tablets, Google Glass, Bluetooth products, etc.”
The need for an option that minimizes distractions while driving is further motivating the concept of voice assistant functionality. Michael Metcalf, CEO, Voice Assist, spoke on “Increasing productivity with a personal assistant while complying with hands-free driving laws.” Greg Aist, Staff Software Engineer, Telenav, spoke on “Creating a hands-free, multi-modal, mixed initiative speech experience for automobile navigation.” Thomas Schalk, Vice President, Voice Technology, Agero, cautioned on a panel that the car environment is very difficult for speech, and that a solution that included at least a push-to-speak button and probably some visual elements on the dashboard might be required for effective driver interaction.
Personal assistants need not just be reactive to requests. Like Google Now and Microsoft’s live tiles in Windows 8, they can anticipate what information you might need before you ask for it. Marsal Gavalda, Director of Research, Expect Labs, spoke on “real-time, context-aware anticipatory search.” Sunil Vemuri, Co-Founder & Chief Product Officer, reQall, described the company’s “reQall Rover,” a platform for “context-smart proactive personal assistance.”
Julia Webb, VP of Sales and Marketing, VoiceVault, highlighted the potential for a personal assistant (and any other phone-based transaction) to be secured by what are popularly called “voiceprints.” During the Mobile Voice Conference, she announced an important milestone for voice biometrics in mobile applications: a top-three global US financial institution launched an app that employs VoiceVault’s voice biometric technology as an authentication factor for wire transfers and Automated Clearing House payments. She indicated that this is the first time that a mobile banking app with voice biometrics has been made available in 40 countries simultaneously after passing a stringent legal and privacy review.
This discussion has focused on personal assistant functionality covered at the conference. Many other talks at the conference were relevant to this subject. For example, there were discussions of improvements in core technologies of speech recognition and text-to-speech synthesis.
All of this activity suggests that the trend of closer connection to computer intelligence and the evolution of personal assistants discussed in The Software Society are progressing at a remarkably rapid pace. There is little doubt that the trends are broad-based and driving a new phase of innovation in technology.