Camilla Mazzolini
Jan 14, 2020


Why we invested in Oto Systems — the voice recognition software that helps you understand intonation

While a lot of work has been done on speech recognition, with Amazon’s Alexa, Apple’s Siri, and Google’s Assistant leading the way, we have seen little research into voice intonation classification, even though intonation could reveal five times more information than words alone.

This is where Oto Systems comes in. Aiming to disrupt the $49 billion voice market, Oto raised a $3 million seed round in August 2019 (bringing its total to $5.2 million to date), which we led alongside Fusion Fund, Bleu Capital, SAP.iO, and SRI International.

What Oto does

Oto Systems provides software that lets call centre agents monitor their phone conversations in real time and helps unlock valuable experiential data from voice interactions. By bridging acoustic modalities (tone) and lexical modalities (words) to facilitate superior speech analysis, the technology captures how something is said, not just which words are uttered. Transcribed words without intonation are surprisingly ambiguous. See the Mehrabian formula below.

The Mehrabian Formula

As such, Oto Systems has positioned itself at the intersection of speech analytics and customer experience. They understand the power of using intonation to measure the quality of conversations and how that can be used more broadly to manage customer experience — a rising priority for companies.

The technology

Spun out of SRI International, best known for incubating Siri and Nuance, Oto is able to 1) coach call centre agents in real time to help them sound more energetic and win more sales thanks to improved intonation, and 2) analyse customers’ tone to predict their satisfaction on 100% of calls, enabling powerful retargeting opportunities.

This field is called Acoustic Language Processing, and is one of the next evolutions of Natural Language Processing. As SRI are world leaders in speech technology, this gave us great confidence in the founding team and their technology, especially as SRI granted them access to a rich proprietary data set of emotion-labelled conversations that has helped them with the cold-start problem of training their models. As such, Oto compiled one of the largest sets of emotionally tagged speech data, with 100,000 utterances from 3,000 speakers, and has received more than 2 million conversations from customers. By 2020 they aim to reach 1 million hours to build AI models that can be applied to any type of conversation.

In addition to their data moat, their technology is language agnostic, meaning it scales across the different regions that call centres operate in (they’re currently in the US, Germany, France, Switzerland, LATAM and the Netherlands). Finally, the technology is built to be plug and play, requiring little integration effort, and can also run entirely on the agent’s computer.

The products

The first product is a live-coaching tool with a widget that hovers on the call centre agents’ desktop. The widget uses red, blue and green to let the agent know whether the conversation is going poorly, ok or well, and helps them remain engaged during calls. For instance, the agents will see messages like “We’re noticing low energy levels — try to sound more engaged” when they’re becoming less engaged.

Oto’s blue widget — signalling the conversation is going ok

This has three main advantages for customers: 1) more engaged agents, 2) a higher sales conversion rate, and 3) better NPS results. In one deployment, Oto’s coaching tool increased engagement by 19%, leading to a 5% increase in the sales conversion rate. ACD Direct, one of their customers, has seen conversion rates increase by 18%. Other customers / partners include Direct Energie, Ströer, and Qualtrics.

Their second product is able to predict the actual NPS (a measure of customer satisfaction) from the customer’s intonation. This is transformational because only about 15% of customers respond to post-call surveys, which leaves 85% of interactions as a big unknown in terms of quality and experience. Oto is therefore able to score 100% of interactions, allowing companies to spot detractors and take action (in real time) before top customers churn.

Given that call centres generate huge amounts of proprietary voice data, we believe that access to this data could give Oto a huge advantage in building a best-in-class voice technology for broader use cases.

The Team

Finally, behind this ambitious project is an exceptional team.

The founding team, with Teo Borschberg as CEO and Nicolas Perony as CTO (previously at Hyperloop Transportation as their Tech AI team lead, after having finished his PhD in Complex Systems from ETH), form a complementary partnership. With a great balance between deep tech and go-to-market expertise, they can both build the product and sell it, while also driving longer-term research that will maintain their technology lead.

Founding team: Teo Borschberg (left) and Nicolas Perony (right)

Oto currently has 18 employees, mainly engineers, and offices in New York, Zurich and Lisbon.

Oto Systems are building the next generation of speech technology to humanise conversations by unlocking the trove of behavioural insights found in our daily communications. At firstminute, we are excited to be part of this adventure!
