Tuesday, April 3, 2018

Always-listening Hotword Recognition without power drain may soon come to Android apps with TrulyHandsfree

Ever wonder how smartphones such as the Huawei Mate 9 respond to Amazon Alexa and Google Assistant commands ("Alexa", "OK, Google") even when the screen is switched off and locked? It's thanks to a hardware component called a DSP, or digital signal processor, a dedicated audio chip that handles low-power, always-on phrase detection (and other tasks). It's core to popular voice assistants' functionality. Silicon Valley-based company Sensory says its software-based alternative, TrulyHandsfree, gives DSPs a run for their money.

TrulyHandsfree, which the company claims is the "most widely deployed" speech recognition engine in the world, is a wake-word and speech recognition suite designed for low-power voice recognition in apps on Android, iOS, and other platforms. Sensory says the software has been "re-engineered" for increased accuracy, lower power consumption, and expanded device support.

"Hands-free operation for voice control has become the norm, and application developers are now looking to create hands-free wake words for their own apps," said Todd Mozer, CEO of Sensory, in a statement.

Development of the new and improved TrulyHandsfree began in 2017. Sensory teamed up with chip companies Qualcomm and ARM to figure out how to lower the power consumption of voice assistant wake words. The company implemented three techniques:

  • Sensory's "little-big" always-listening feature uses a small voice recognition model to identify potential wake words, then revalidates them on a larger model. The approach doesn't have demanding power requirements, and it improves accuracy without consuming significantly more power.
  • Frame stacking, a method of neural network training that leads to more accurate models and faster decoding, cuts certain wake word model processing functions' MIPS (million instructions per second, a measure of processing performance) in half without impacting accuracy.
  • Multithreading allows more efficient speech recognition processing and improves the execution time for larger wake word models.
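
The first two techniques can be sketched in a few lines of Python. This is only an illustration of the general ideas, not Sensory's implementation: the names `stack_frames` and `little_big_detect`, the thresholds, and the representation of a frame as a list of floats are all assumptions made for the example, and the frame stacking shown here is one common reading of the term (merging consecutive feature frames so the model runs less often).

```python
from typing import Callable, List, Sequence

# Hypothetical representation: one frame of acoustic features.
Frame = List[float]


def stack_frames(frames: Sequence[Frame], stack: int = 2) -> List[Frame]:
    """Concatenate every `stack` consecutive frames into one wider frame.

    The model then runs once per stacked frame, so it is evaluated
    roughly 1/stack as often. A trailing partial group is dropped.
    """
    stacked: List[Frame] = []
    for start in range(0, len(frames) - stack + 1, stack):
        merged: Frame = []
        for frame in frames[start:start + stack]:
            merged.extend(frame)
        stacked.append(merged)
    return stacked


def little_big_detect(
    window: Sequence[Frame],
    small_model: Callable[[Sequence[Frame]], float],
    big_model: Callable[[Sequence[Frame]], float],
    small_threshold: float = 0.5,  # assumed value, for illustration only
    big_threshold: float = 0.8,    # assumed value, for illustration only
) -> bool:
    """Two-stage check: cheap model first, expensive model only on candidates."""
    if small_model(window) < small_threshold:
        return False  # most audio is rejected here, at low cost
    # Only candidate windows reach the larger, more accurate model.
    return big_model(window) >= big_threshold
```

With a stack of 2, ten input frames become five model evaluations, which is in line with the halved processing load described above. In the cascade, the large model only runs on the rare windows that the small model flags, so its cost is paid occasionally rather than continuously.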

Sensory says together the enhancements reduce power consumption on mobile apps by more than 80%, which equates to 200mAh in a 12-hour day.
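
To put those figures in perspective, here is a rough back-of-the-envelope calculation. It rests on two assumptions the article does not spell out: that the reduction is exactly 80% (Sensory says "more than"), and that 200mAh is the charge saved over 12 hours of listening rather than the charge consumed.

```python
# Assumptions (not stated in the article): the reduction is exactly 80%,
# and 200 mAh is the charge saved over 12 hours of listening.
REDUCTION = 0.80
SAVED_MAH = 200.0
HOURS = 12.0

baseline_mah = SAVED_MAH / REDUCTION      # charge used before the update
remaining_mah = baseline_mah - SAVED_MAH  # charge used after the update

baseline_ma = baseline_mah / HOURS        # average current draw before
remaining_ma = remaining_mah / HOURS      # average current draw after

print(f"before: {baseline_mah:.0f} mAh over {HOURS:.0f} h ({baseline_ma:.1f} mA average)")
print(f"after:  {remaining_mah:.0f} mAh over {HOURS:.0f} h ({remaining_ma:.1f} mA average)")
```

Under those assumptions, always-on listening would have cost roughly 250mAh per 12-hour day before the update and about 50mAh after it.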

If you've used the latest version of navigation app Waze, you've already seen the new TrulyHandsfree in action. "We recently helped Google's Waze accept hands-free voice commands by supplying them with Sensory's 'OK Waze' wake word that runs when the app is open," Mr. Mozer said. "With previous versions of TrulyHandsfree, having our always-on wake word engine listening for the OK Waze wake word during a short trip would have had minimal effect on a smartphone's battery, but for longer trips, a more efficient [engine] was desired, so we created it."

The latest TrulyHandsfree ships with support for several types of wake words, including fixed words and user-defined wake words. It includes wake word models for Alexa, Siri, the Google Assistant, Microsoft's Cortana, and systems from Baidu, Alibaba, and Tencent. It also offers multi-wake word recognition and supports multiple languages, including English, Dutch, French, Italian, Japanese, Spanish, and Turkish.

Sensory says an updated SDK for Android and iOS will roll out before the end of Q2 2018.



from xda-developers https://ift.tt/2JhKv2s
via IFTTT
