The Ubiquitous Personal Assistant: The battle has begun

Apple popularized the concept of a digital personal assistant with Siri. In The Software Society, I predicted the importance of an extension: “ubiquitous” personal assistants (UPAs), which use advanced language understanding and carry their knowledge of you and your interests across devices (e.g., mobile phones, tablet computers, and PCs). I also predicted the growth of Direct-to-Content (D2C) results for search, rather than a list of options that requires further research to get to an answer. (Siri provides such quick answers in some cases; Google Now gives answers before you even ask the question.) In this note, I predict that Apple will extend Siri across all Apple devices (and allow a text entry option), making it a UPA, with the advantage that Apple can give its personal assistant access to the apps on the device more easily than a competitor can. On the other hand, Google has moved aggressively into the ubiquitous and D2C space with a combination of voice search and Google Now that runs on Apple’s iOS as well as Android.

First, let’s look at Apple, which has faced skepticism that it can continue its run of breakthroughs. New Android phones such as the Samsung Galaxy S4 and HTC One are getting a lot of attention, while a hardware upgrade to Apple’s smartphone seems well off in the future, according to statements by the Apple CEO. The new Android phones have upgraded hardware and will probably sell well, but some new software features, such as gesture control, are in my opinion largely gimmicks that don’t contribute much to the phone’s usability. In fact, they may actually frustrate a user trying to master the new modalities, which aren’t particularly intuitive.

Has the smartphone in general reached the point where improvements can only be incremental or minimally useful frills? Can Apple surprise again with a solution that innovates beyond the Android competition?

I believe it can. The solution is in improvements to Siri, the personal assistant software. Apple recently posted many job openings with “Siri” in their description, so there is objective evidence of this intention. The Apple Worldwide Developer Conference on June 10-14 is said to include an update of iOS and could be an occasion for announcing some improvements in Siri’s performance and features.

Siri can go beyond the obvious improvements in core technology, such as better speech recognition performance. It can introduce a new level of utility for digital devices in at least four ways: (1) helping us cope with the growing burden of managing tasks and social interactions; (2) helping us use the increasing complexity of features on a device; (3) making Siri available on all Apple devices, with a single identity that remembers actions and information on all devices (an Apple-specific UPA); and (4) creating, in effect, a “voice app” community for Siri, the ability to call up specialized voice-interactive personal assistants, e.g., company customer service assistants, experts in particular areas (food, travel…), or interactive ads/entertainment. The Software Society argued at length that these trends are inevitable in general and detailed their evolution.

Siri came out of a government research initiative called Cognitive Assistant that Learns and Organizes (CALO). The current version of Siri makes the features of the phone easier to use and provides search functions that attempt to give an answer rather than a list of web-site options when possible, but it has only a fraction of the features that CALO researched in the five-year project involving more than 300 researchers from 25 universities. One of the objectives of CALO that Siri could enable in an advanced version is organizing and prioritizing information that tends to overwhelm us each day, such as email, texts, phone calls, social tools, and tasks. The core technology here is “machine learning,” where the assistant learns from our actions what and who are most important to us. (Microsoft has made machine learning a company priority and has a history of innovative research in speech recognition, so expect competition from them in this area.) Personal assistants should eventually be able to handle requests such as “List important messages,” “What time slots are open for an hour meeting on Friday?” or “What movies might I like within ten miles?”

Specialized assistants called by a general personal assistant (GPA) can take advantage of a narrower scope. (Software subroutines are an analogy.) The language and knowledge representation technology can be more focused in the special case, rather than attempting to deal with any request. Customer service apps for specific companies are not only cost-effective in reducing calls to call center agents, but could also provide advertising revenues for the GPA provider that called them up. Apple could develop its own specialized assistants to capture ad dollars, e.g., providing discount coupons for local businesses, or extending Siri to help with iTunes and thus promote music sales.

A rumored watch-like extension of the smartphone from Apple and Microsoft (or other hands-free solutions such as Google Glass) will depend largely on the effectiveness of speech interaction and D2C, further making personal assistant growth synergistic with other initiatives. Typing on such a small device would be painful, and the whole point is to avoid taking the phone out of a pocket or purse for most tasks. The UPA would, of course, extend to those wearable devices.

Google has explicitly joined the UPA fray, with their recently announced combination of Google Now and voice search for iOS (already available on Android). The general press has been referring to the combination as a personal assistant named “Google Now,” but there is no direct input option (voice or text) to Google Now used alone; it proactively displays what it believes is relevant to the specific user (referencing a Google calendar app, for example). Google in a company blog implied that personal assistants were simply an extension of search, perhaps a position they are taking for future patent battles with Apple. Apparently, they want you to address the personal assistant as simply “Google.” They are moving to the D2C model as well, being able to answer questions such as “Do I need an umbrella this weekend?” with the forecast. Google search with Google Now can directly answer questions such as “Who’s in the cast of ‘Oblivion’?” or respond to requests such as “Show me nearby pizza places.”

Other companies can, of course, update their personal assistants similarly (and will add at least some of these features). It doesn’t require a new phone to do so, since most personal assistant functionality is implemented in the cloud. Apple’s main advantage is that it has created a community of devices and software tools delivered with those devices, and can more easily implement a ubiquitous personal assistant for that community. (The Software Society discusses the power and importance of “technology communities” in one chapter.) For Apple, the new Siri can promote its entire hardware and software line by being a unifying influence motivating loyalty to Apple. A further advantage is the large amount of data on what users say that Siri has provided; language technologies largely work by extracting consistencies from large volumes of data, that is, by machine learning. Google has less data on voice interactions, but more data on what people search for and how to find it. Microsoft’s advantage is that many people use their Office applications and they have a dominant share in PC operating systems; they have the core technology, but are running behind in the mobile space. In addition to the companies mentioned here, Samsung and Nuance Communications have teamed to produce S-Voice, a Samsung personal assistant for Android phones using Nuance’s language technology. Nuance, with its large portfolio of language technologies and patents in the area, is an option for mobile phone makers such as Samsung. There are also smaller companies trying to enter the race (at least for the specialized versions), a number of which spoke at the recent Mobile Voice Conference I co-organized.

Do you want what the UPA can become? If the answer is yes, and you suspect most people will respond similarly in the long run, then there is a need that companies can profitably exploit and are competitively motivated to meet. The technology to support that goal is beyond the “tipping point” and improving. It will happen.

