Amazon and other leading technology companies today announced the Voice Interoperability Initiative, a new program to ensure voice-enabled products provide customers with choice and flexibility through multiple, interoperable voice services. The initiative aims to have multiple voice assistants work side by side on a single device, rather than locking customers into one voice ecosystem.
The initiative is built around a shared belief that voice services should work seamlessly alongside one another on a single device, and that voice-enabled products should be designed to support multiple simultaneous wake words.
More than 30 companies are supporting the effort, including global brands like Amazon, Baidu, BMW, Bose, Cerence, ecobee, Harman, Logitech, Microsoft, Salesforce, Sonos, Sound United, Sony Audio Group, Spotify and Tencent; telecommunications operators like Free, Orange, SFR and Verizon; hardware solutions providers like Amlogic, InnoMedia, Intel, MediaTek, NXP Semiconductors, Qualcomm, SGW Global and Tonly; and systems integrators like CommScope, DiscVision, Libre, Linkplay, MyBox, Sagemcom, StreamUnlimited and Sugr.
“Multiple simultaneous wake words provide the best option for customers,” said Jeff Bezos, Amazon founder and CEO.
“Utterance by utterance, customers can choose which voice service will best support a particular interaction. It’s exciting to see these companies come together in pursuit of that vision.”
Notably, Samsung, Google, and Apple are not part of the Voice Interoperability Initiative.
The Voice Interoperability Initiative is built around four priorities:
- Developing voice services that can work seamlessly with others, while protecting the privacy and security of customers
- Building voice-enabled devices that promote choice and flexibility through multiple, simultaneous wake words
- Releasing technologies and solutions that make it easier to integrate multiple voice services on a single product
- Accelerating machine learning and conversational AI research to improve the breadth, quality and interoperability of voice services
Companies participating in the Voice Interoperability Initiative will work with one another to ensure customers have the freedom to interact with multiple voice services on a single device. On products that support multiple voice services, the best way to promote customer choice is through multiple simultaneous wake words, so customers can access each service simply by saying the corresponding wake word. Customers get to enjoy the unique skills and capabilities of each service, from Alexa and Cortana to Djingo, Einstein, and any number of emerging voice services.
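As a rough illustration of the idea of one wake word per service, the sketch below routes an already-transcribed utterance to whichever service owns the wake word it starts with. It is a toy model under stated assumptions, not how any participating company implements wake word detection: real systems detect wake words in audio on the device, and the `WakeWordRouter` class, its methods, and the handler functions are hypothetical names invented for this example.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical handler type: receives the request text that follows the wake word.
Handler = Callable[[str], str]


class WakeWordRouter:
    """Toy router mapping wake words to voice-service handlers (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, wake_word: str, handler: Handler) -> None:
        # Store wake words lowercased so matching is case-insensitive.
        self._handlers[wake_word.lower()] = handler

    def route(self, utterance: str) -> Optional[Tuple[str, str]]:
        """Return (wake_word, response), or None if no registered wake word leads the utterance."""
        text = utterance.strip()
        lowered = text.lower()
        # Try longer wake words first so a short one cannot shadow a longer one.
        for wake_word in sorted(self._handlers, key=len, reverse=True):
            if not lowered.startswith(wake_word):
                continue
            rest = text[len(wake_word):]
            # Require a word boundary so "Alexandria" does not trigger "alexa".
            if rest and not (rest[0].isspace() or rest[0] in ",.!?"):
                continue
            request = rest.lstrip(" ,.!?").strip()
            return wake_word, self._handlers[wake_word](request)
        return None


# Assumed stand-ins for real voice services.
router = WakeWordRouter()
router.register("Alexa", lambda request: f"[Alexa] handling: {request}")
router.register("Cortana", lambda request: f"[Cortana] handling: {request}")

print(router.route("Alexa, play some jazz"))
print(router.route("Cortana what is on my calendar"))
print(router.route("play some jazz"))  # no wake word, so nothing is routed
```

In this sketch each utterance is routed independently, which mirrors the "utterance by utterance" choice described above. An utterance without a recognized wake word is simply ignored.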
Companies participating in the initiative are committed to adopting a similar technological approach, whether building voice-enabled products or developing voice services and assistants of their own.
Developers and device makers have a shared commitment to customer trust, and will work together to protect the security and privacy of customers interacting with multiple voice services. Companies participating in the initiative will work to ensure this commitment extends to products that support multiple, simultaneous wake words.
Making multiple, simultaneous wake words more accessible for developers and device makers
Alexa machine learning and speech science technology is designed to support multiple, simultaneous wake words. As a result, any device maker building with the Alexa Voice Service (AVS) can build powerful, differentiated products that feature Alexa alongside other voice services.
Still, device makers interested in supporting multiple, simultaneous wake words often face higher development costs and increased memory load on their devices. To address this, the Voice Interoperability Initiative will also include support from hardware providers like Amlogic, Intel, MediaTek, NXP Semiconductors and Qualcomm Technologies, Inc.; original design manufacturers (ODMs) like InnoMedia, Tonly and SGW Global; and systems integrators like CommScope, DiscVision, Libre, Linkplay, MyBox, Sagemcom, StreamUnlimited and Sugr.
As part of the initiative, these companies will develop products and services that make it easier and more affordable for OEMs to support multiple wake words on their devices.
Advancing the state of the art in machine learning and wake word technology
The academic community has played a vital role in advancing the core machine learning and conversational AI behind voice technology. Companies involved in the initiative will work with researchers and universities to further advance the state of the art in machine learning and wake word technology, from developing algorithms that allow wake words to run on portable, low-power devices to improving the encryption and APIs that ensure voice recordings are routed securely to the right destination.
This continued innovation will provide an important building block for long-term advancements that improve the quality, breadth and interoperability of voice services in the future.
Participating companies will have more detail to share on the initiative and compatible products in the coming months.