Thursday 20 April 2017

Virtual assistants: The IoT trojan horse


Cable Satellite International

While consumers are latching onto Alexa as their voice-based search engine, operators are eyeing wider voice control of IoT devices in the smart home.

With greatly improved voice recognition and the evolving ability to understand accents and natural language, operators are putting voice control on the fast track, initially for surfing and serving video content, with a longer-term strategy to make voice the de facto UI for hooking up multiple devices and services in the smart home.

“2017 is the year of the voice assistant (VA),” declares Sylvain Thevenot, managing director at Netgem. “After the remote control, the interactive menu and the smartphone we think it’s time to move to the next level with voice.”

“This year will see voice come to the fore,” asserts TiVo’s senior director of international marketing, Charles Dawes. The world’s largest operators are at it too. Telefónica took to MWC 2017 to unveil a new digital assistant called Aura, the culmination of a two-year research project.

Aura works much like Apple’s Siri or Amazon’s Alexa, allowing customers to check on details of their Telefónica service, and ask for problems to be resolved or new features to be provided, using a voice interface on a mobile device.

The main reason for the latest interest in speech recognition is that the technology has advanced to the point where it is usable. A glance back at the history of the technology finds IBM claiming error rates of 43% in 1995, falling to 6-7% today. The goal, well within reach, is to achieve 4%, which is the same error rate humans exhibit when understanding speech.
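The article does not say how these error rates are measured. A common metric in speech recognition is word error rate (WER): the word-level edit distance between a reference transcript and the recogniser's output, divided by the number of reference words. The Python sketch below assumes that metric; the function name and the sample sentences are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of five gives a 20% error rate.
print(word_error_rate("order a pizza for dinner", "order a pisa for dinner"))
```

On this measure, a 4% error rate corresponds to roughly one misrecognised word in every 25.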

The challenge is creating software that can tell the difference between ‘pizza’ and ‘Pisa’, something that requires contextual knowledge of the differences. “In our tests, voice assistants work. They work in multiple languages and with different accents,” says Itai Tomer, head of cloud DVR, Ericsson. “The only problem we might encounter today is while watching TV with a friend or friends Alexa doesn’t necessarily know when you are referring to it.”

For the success of voice-supported UIs a number of aspects need consideration: the location of the household to predetermine the language, for example, and the identity of each user in the household. “There are some pioneering examples in Switzerland [Swisscom], where voice search is supported in German, all Swiss German dialects and French in order to allow a more engaging user experience,” reports Ferdinand Maier, CEO, ruwido. “In Switzerland where someone from the Italian-speaking part is married to a Swiss-German speaking person and living in the French-speaking region, the system has to be able to recognise three different languages, even if the household in terms of geo-localisation is French speaking.

Dialects and accent ID

“Accents can differ from one valley to the other, or one city to the other,” Maier adds. “This complexity illustrates the need to continuously improve recognition, and the more data that is gathered the better the results are.”

Futuresource makes the point that the performance of a voice UI is about more than the performance of the engine. Form factor (usage of a VA-speaker device is considerably higher than on mobile phones), situation (usage in the car and home is much higher than on mobile devices) and other technical features associated with the device (eight long-range microphones and the ability to use trigger words) are key for the hands-free smart-home application of voice UI.

Pay-TV technology developers routinely refer to Amazon Alexa, despite the existence of other major VAs that use cloud-based AI to provide a voice-based interface: Google Assistant, Siri for iOS and Microsoft Cortana. “Amazon has made the Alexa API very easy to integrate and work with,” explains Tomer. “The barrier to entry is quite low.” Provided a device has a mic, an internet connection and a speaker, a vendor can incorporate Alexa. Consumers don’t have to buy an Echo Dot.

“Integration is so easy that we are seeing the good, bad and ugly of applications,” says Thevenot. “It reminds me of the early days of mobile app stores where most apps were useless, 10% generate value and only really 1% are used by the mass market.” Of the 1% with genuine value, Thevenot says these are applications which are not trying to replace “what someone could do with a touchscreen or remote control.”

Applications that merely replicate the touchscreen or remote are “useful but not good enough,” he says. “Is it useful to ask for a weather forecast so you don’t have to open your phone? Yes, but it won’t change people’s lives. Moving a level beyond this is where the benefits become very interesting.”

Vendors are starting from the premise that consumers will adopt voice control because acting hands-free simply makes life easier.

“It replaces the need to type,” says Tomer. “People are used to managing their TV experience with a very limited set of icons or buttons. Voice commands for ‘volume up’, ‘volume down’, for example, mean they don’t even have to search for ‘search’ by clicking through the interface or endure the painful experience of typing letter by letter.”

Nagra believes we’re en route to “flawless interaction” with our smart devices. “Simple voice commands like ‘Go to CNN’ work, though there is certainly more work to be done to get UIs and artificial intelligence fully integrated,” says Anthony Smith-Chaigneau, senior director, product marketing. “To do this will require steps to enable conversations with and between TVs and STBs that go beyond simply commanding a machine to delivering a full AI-based user experience.”

Command control

While voice UIs aren’t at the levels that we see in sci-fi movies, where full conversations are possible, the industry is moving beyond basic command functions, toward more sophisticated capabilities for navigation and discovery. “Simplifying the experience is the first step to changing the discovery experience and giving people access to more channels, more content,” says Dawes.

Netgem has partnered with Amazon to create an environment where client services and content libraries connect to Alexa. Thevenot describes three layers of interaction possible in its new on-premises hardware SoundBox. “The first layer replaces what you do with a remote by voice commands. In fairness, this is not much smarter than having a remote on your mobile phone.” The second layer adds more value by performing improved search and recommendation in the cloud via Netgem Home Platform. It is the third layer which whets his appetite. “This is where we can deploy cool features – things you couldn’t do otherwise.”

An example: when a user sees an actor in a film but can’t quite place their name, the voice UI is able to call on face recognition linked to the cloud to deliver the answer. “You could do it today with a manual internet search but it takes time. This way is much easier and one of a range of possibilities for voice UI.”

While not yet at the stage “where it can understand sarcasm”, in Dawes’ view the AI is on its way to being a TV pal you can consult with.

“It’s about facilitating a Q&A process where you are able to start off very broadly and be able to narrow down to find the content you want since the system comes back with intelligent responses,” he says. A search could begin with the command ‘Show me all the James Bond films’ and from that to ‘Show me the older ones without Roger Moore’. “The system would start to understand what ‘old’ means compared to ‘new’. It would also file away knowledge about your personal choices so each time you use the system its responses are more fine-tuned.” Holding a string of responses to questions in one’s head could be an issue, though. This limitation, together with the human ability to process dense information from multiple (visual/audio/haptic) sources simultaneously, means voice is logically a conduit to more complex UIs.
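The narrowing process Dawes describes can be pictured as a session that keeps the previous result set and applies each new request as a further filter. A minimal Python sketch follows. The tiny catalogue, the 1980 cut-off for ‘older’ and the class and method names are all assumptions made for illustration, not a description of any vendor's system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Film:
    title: str
    year: int
    lead: str


# Tiny illustrative catalogue; a real service would query its metadata back end.
CATALOGUE = [
    Film("Dr. No", 1962, "Sean Connery"),
    Film("Goldfinger", 1964, "Sean Connery"),
    Film("Live and Let Die", 1973, "Roger Moore"),
    Film("Octopussy", 1983, "Roger Moore"),
    Film("GoldenEye", 1995, "Pierce Brosnan"),
    Film("Skyfall", 2012, "Daniel Craig"),
]


@dataclass
class SearchSession:
    # "Show me all the James Bond films": start from the full list.
    results: list = field(default_factory=lambda: list(CATALOGUE))

    def refine(self, predicate):
        """Keep only the current results that match; context carries over."""
        self.results = [film for film in self.results if predicate(film)]
        return [film.title for film in self.results]


session = SearchSession()
# "Show me the older ones": 'older' is assumed to mean released before 1980.
session.refine(lambda film: film.year < 1980)
# "...without Roger Moore"
titles = session.refine(lambda film: film.lead != "Roger Moore")
print(titles)  # ['Dr. No', 'Goldfinger']
```

Because each request filters the previous results rather than the whole catalogue, the viewer never has to restate the earlier parts of the query.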

Voice and visuals

It’s why vendors are trialling ways of supporting voice information with visuals and why VA speakers, including Echo, are expected to launch with touchscreens this year. “Voice may become ubiquitous but voice alone will not be the complete UI. Keeping interaction with the voice assistant is preferable so long as the responses are short and snappy and typically list not more than five options,” says Thevenot.

“Another option is to talk to the voice service through your phone and have the results displayed on the screen.”

“Voice has limitations when it comes to browsing – because we can only absorb voice in a linear fashion, unlike with our eyes,” says Futuresource senior analyst Simon Bryant. “A combination of voice and visual aids would make sense.”

The migration of certain functionality to voice will certainly be a key feature for many operators in future. However, it is unlikely to completely replace a physical remote (or perhaps gesture control?). “For example, browsing through content/channels to choose what to watch will be more effective with physical buttons than continuously using voice commands - particularly in linear channel environments which still take the majority of viewing time,” observes Bryant.

According to ruwido, voice can be a fantastic supplementary interaction mechanism, but works best in combination with high-quality button-based input or more natural interaction mechanisms, such as organic haptics.

“Voice is great for searching for known content, or for example for user identification or personalisation,” says Maier. “But for surfing channels or for turning the volume up or down, continuous input is more comfortable to use than to shout at your TV ‘volume up, volume down’. We see multimodal input devices, which fuse different interaction mechanisms, as the future for a more natural TV interaction.”

The pay-TV opportunity

In a piece of serendipity for which revenue-starved pay-TV operators may thank their lucky stars, it turns out that they are in prime position to take advantage of the possibilities voice may open in the home.

“The TV has been part of the living room for decades and is the gateway for the Internet of Things,” says Tomer. “Why have one remote for TV and another for air conditioning? If a consumer can use the same UI to manage all their devices that is surely preferable. Introducing new IoT devices to the network then becomes frictionless.

“We believe if telcos start to push the smart home they will take subscribers with them. One reason is the brand trust that telco and cable operators have with subscribers. They are already delivering a service and can provide smart home controls as part of the package.”

Consumer buy-in to smart home devices for lighting, heating, security and other functions has been fragmentary to date. Voice-controlled virtual assistants could be the catalyst for change.

“The increasing number of connected devices in the home don’t tend to talk to each other,” says Tomer. “Voice can provide the glue between all these elements.” An example is speaker maker Sonos’s integration of Alexa control.

Netgem’s SoundBox aims to converge voice control on a user’s video and music content but has other IoT applications in its sights. Tomer suggests there are at least 16 systems that operators have in place today that can be adapted for smart home services. These include control of security or baby monitoring cameras – or anything to do with video, content encryption and streaming – which an operator could bolt onto existing channel packages. Explains Tomer, “If I can schedule TV content to record in the future I can use the same back-end scheduler for consumer control of heating or when the clothes dryer starts. With these tools, telcos are well equipped to be leaders in smart home.” All of this opens up a debate about how much data consumers may be willing to part with for perceived benefit or service discounts.
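Tomer's point about reusing the recording scheduler can be illustrated with a scheduler that stores a due time and a generic action, so it has no need to know whether the job is a recording or a thermostat change. The Python sketch below is hypothetical throughout: the class, the device functions and the times are invented for illustration and do not describe Ericsson's back end.

```python
import heapq
from datetime import datetime
from itertools import count


class Scheduler:
    """Runs arbitrary callbacks at a due time; it does not care what the job is."""

    def __init__(self):
        self._queue = []          # min-heap ordered by due time
        self._tiebreak = count()  # unique counter so callbacks are never compared

    def schedule(self, when: datetime, action, *args):
        heapq.heappush(self._queue, (when, next(self._tiebreak), action, args))

    def run_due(self, now: datetime):
        """Run every job due at or before `now` and return their results."""
        done = []
        while self._queue and self._queue[0][0] <= now:
            _, _, action, args = heapq.heappop(self._queue)
            done.append(action(*args))
        return done


# Hypothetical device actions; a real deployment would call the operator's systems.
def record(channel):
    return f"recording {channel}"


def set_heating(celsius):
    return f"heating set to {celsius}C"


scheduler = Scheduler()
scheduler.schedule(datetime(2017, 4, 20, 21, 0), record, "CNN")
scheduler.schedule(datetime(2017, 4, 20, 18, 30), set_heating, 21)
scheduler.schedule(datetime(2017, 4, 21, 7, 0), set_heating, 19)

due = scheduler.run_due(datetime(2017, 4, 20, 22, 0))
print(due)  # ['heating set to 21C', 'recording CNN']
```

The third job stays queued until a later call to `run_due`, exactly as a future recording would.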

Privacy concerns

Tomer doesn’t think subscribers will care too much. “If the TV knows by watching you when you fall asleep so it shuts down, or analyses your behaviour to ‘see’ when you are bored and suggests another programme, or automatically pauses if you step away from the couch - these could be perceived as beneficial in exchange.”

Indeed, Ericsson has developed such a feature but has yet to release it because of concerns about privacy invasion. “In 3-5 years the whole notion of violating privacy will fade,” Tomer predicts. The hold up? “Nobody wants to be the first to record constantly.”

Research from Parks Associates shows that a quarter of U.S. broadband households are willing to share data from a smart home product in exchange for technical support benefits, such as warranty information, product updates, or tips on how to better use the product. Nonetheless, it’s up to pay-TV providers to consider the legal ramifications and management of collected data to ensure it is stored and used correctly.

“When they can prove to consumers that the collection is safe and reasonable, and that voice control can deliver real benefits, we’ll likely see significant growth in take up of voice control services by subscribers,” says Smith-Chaigneau.

California-based audio recognition company SoundHound, which raised $75 million to build its AI-based speech recognition platform, Houndify, says it will not “own the users or the data” of developers that build apps or devices using its technology. “We don’t have an agenda to hijack your product,” says CEO Keyvan Mohajer. “If you use Amazon, you lose your brand, your users. You have to ask your user to log into their Amazon account, they have to call on Alexa, and all the data belongs to them.”

With speech set to become the primary communication interface between the user and connected devices, the wider business issue is ownership of the end-user. It’s a battle that is disruptive to the existing TV value chain, as much for TV OEMs as for MVPDs. Could content recommendation specialists, for example, be usurped by dominant virtual assistant / data aggregators like Amazon?

“In the short term it’s a threat because search and discovery is based on visual stimuli on mobile/PC devices (banners and EPGs),” finds Futuresource’s Bryant. “Longer term we expect voice and visual UIs to integrate.”

According to Maier, VAs aren’t replacing recommendation engines but collaborating with them. “VAs help the user to achieve a specific task in an effortless way,” he says. “The user asks the VA to recommend content, by talking to it and not using a graphical user interface.”

“Realistically, VAs will most likely need to operate alongside existing recommendation services so users can choose the way they access content in a manner that works best for them,” finds Smith-Chaigneau.

Who owns the data?

For example, [AT&T-owned] DirecTV is using Alexa as the UI while retaining the existing logic of recommendations based on their library from the ContentWise platform. What about TV makers whose margins are already as thin as their bezels? Sony for one is introducing Google Assistant into its new line of smart TVs.

“Sony knows very well that, in the long term, integrating Google Assistant into their TVs may be a deal with the devil,” says Joel Espelien, senior advisor for analysts The Diffusion Group. “Nevertheless, Sony’s business model (along with its competitive vulnerability in the TV market) leaves the company little choice.”

Espelien suggests that TV makers could coalesce around Google as the de facto default search engine, the benefit being that it creates a level playing field for search. However, such a move is unlikely given it would “essentially be handing Google the keys to the $70bn U.S. TV advertising market.”

They could integrate their own voice technology – such as using Nuance, which currently supports a number of pay-TV operators. However, it’s fair to assume the companies with access to the most data and best AI capabilities will improve more than those without. More likely is an approach which acknowledges that the big four (Alexa, Siri, Cortana, Google) will dominate the market and so OEMs and MVPDs will align with one or more of them accordingly – letting the consumer decide which VA they favour.

However, virtual assistants are a “threat and concern” for some service providers if they decide just to use Amazon’s [or Google’s] full vertical suite of services from voice recognition to active search and recommendation prioritising their Amazon video, according to Thevenot.

Amazon has thought ahead – ensuring that iPlayer, ITV Hub, All 4, My5 and other content providers with apps available on Amazon’s Fire TV can provide direct recommendations to users through an app- or service-specific recommendation carousel.

“The closed eco-system of pay-TV that’s been in place to protect content and control rights is transitioning,” says Bryant. This is particularly so as a rising number of operators no longer have exclusive content, instead partnering with best-in-class video services such as Netflix. This, says the analyst, has the potential to extend into voice. Android TV has already begun to appear in a number of operators’ systems, so migrating to third-party voice control is a viable option and would significantly reduce investment costs.

“Potentially this reduces the operator’s control of the consumer (and control of data) but is counterbalanced by the potential reduction in churn such a proposition would offer,” he says. “Both CE vendors and pay-TV providers considering whether to integrate Amazon/Google/Apple in their devices will not find it easy to give up a key component of their UI to a third party, especially if that company happens to be a competitor (as an OTT and/or device vendor). As VAs improve and become more life-like, voice will undoubtedly become an important brand differentiator for tech and non-tech brands.”

Battle for the end-user

Consumers will presumably prefer one rather than multiple virtual assistants listening to them, so an accommodation will have to be found between developers of VAs on the one hand and operators looking to move into home automation on the other. “I would assume it’s in the interests of all parties to concentrate on core competence and partnership,” says Tomer, pointing out that Amazon doesn’t have call centres should anything go wrong with Alexa but MSOs, of course, do.

With Comscore predicting that by 2020, 200 billion voice searches will be done per month, advances such as voice (and fingerprint) biometrics and improved contextual understanding will have huge implications.

“The holy grail is natural language and communication which requires context and retention,” says Bryant. “To have a ‘normal conversation’ the VA will need to know who they are speaking to, the history with that person, who, what, where, why and when.”

What interests ruwido’s Maier is the concept of continuous recognition. Based on the device used (e.g. the remote control), the system will be able to identify the user without requiring them to speak or sign in, he suggests. It will be done the moment the user is holding the remote or starts interacting with the system.

“From a technical viewpoint, AI platforms need to evolve to be able to support as much input data as possible. Beyond today’s standard information like text or voice, we need to incorporate electrical signals or more complex patterns like gestures, that can be handled within milliseconds (and do not require computation in the cloud). So, the next tendency in terms of technology will be decentralisation and computation on small devices that are connected via the IoT.”

Emotional intelligence

A variety of other measurements, such as pulse, skin conductance, humidity and temperature, could be used to gauge a consumer’s emotional response to content and products marketed to them. Understanding and influencing consumers’ emotion is increasingly considered a vital ingredient for business success. But according to Forrester, what is surprising is how poorly emotion has been measured and incorporated into experience design and core operations. Toyota’s new Concept-i concept car, with its AI agent Yui, will measure the driver’s emotions, ultimately building a relationship with them.

“It goes beyond just driving patterns and schedules, making use of multiple technologies to measure emotion, mapped against where and when the driver travels in the world,” says the car firm. Understanding and using this level of personal behaviour requires a greater degree of private data sharing and security.

“If a material of a device is ‘locally’ intelligent, like measuring your pulse (without the need to store or transmit that data in the cloud) privacy might be less of a concern than if a voice biometric identification is lost in the cloud and a user then loses the ability to use that form of identification,” says Maier. “Data has to be shared in a secure way and it has to be clear to the user if this data is shared, with whom, and how and where data is stored. Being honest about these strategies and making them transparent will help to develop trust towards the system. With trust, using biometric data becomes more sensible, especially if the user can measure what the immediate benefit is. Currently, though, the policies are not at a stage that allows that trading – it is more an all or nothing principle.”
