Real Benefits from Artificial Intelligence

Bill von Hagen December 06, 2018

My previous blog post gave an overview of the transcription process and explained some buzzwords that you've probably already encountered. It also touched upon the use of AI (artificial intelligence) in Voci's general ASR (Automatic Speech Recognition) system, a specific STT (Speech-to-Text) product (V-Blaze), and the V-Cloud API and extensions. After that bowl of alphabet soup, let's take a closer look at some of the AI technologies that Voci uses to deliver impressively accurate transcriptions.

AI has fascinated computer scientists and business people since its birth in academic publications and science fiction novels. Today's AI techniques are light years beyond the early AI days of Eliza and LISP machines, especially in the area of speech recognition. One of the key AI technologies that Voci's speech recognition implements is deep learning, a machine learning technique that enables software to do what comes naturally to humans: learn by example. 

Voci's speech recognition software learns its vocabulary and the ways that terms can interact by analyzing thousands of hours of audio recordings within a target subject area (domain). The software uses this data to build and populate a neural net, a huge hierarchical structure composed of nodes containing words that have been encountered in the input and weighted connections to other words that have previously been used with that one. This dynamic network structure enables Voci's machine learning systems to keep teaching and improving themselves: words that have been identified in the input are integrated into the neural net, where they influence the future recognition and selection of other words from the input.
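To make the idea of words joined by weighted connections a little more concrete, here is a deliberately tiny Python sketch. It is not Voci's implementation and it is not deep learning; it simply counts which words follow which in a handful of made-up transcripts, then uses those counts to choose between two similar-sounding candidates. The class name `WordAssociationNet`, its methods, and the sample sentences are all hypothetical, invented for this illustration.

```python
from collections import defaultdict


class WordAssociationNet:
    """Toy word graph (hypothetical): nodes are words, and each edge weight
    counts how often one word directly followed another in the training text."""

    def __init__(self):
        # weights[previous_word][next_word] -> number of times seen together
        self.weights = defaultdict(lambda: defaultdict(int))

    def learn(self, transcript):
        """Strengthen the connection between each adjacent pair of words."""
        words = transcript.lower().split()
        for prev, nxt in zip(words, words[1:]):
            self.weights[prev][nxt] += 1

    def pick(self, previous_word, candidates):
        """Return the candidate most strongly connected to the previous word.
        Ties (including all-zero weights) go to the first candidate listed."""
        seen = self.weights.get(previous_word.lower(), {})
        return max(candidates, key=lambda w: seen.get(w.lower(), 0))


# Made-up call-center phrases standing in for domain training data.
net = WordAssociationNet()
for line in [
    "please reset my account password",
    "i forgot my account number",
    "my account balance is wrong",
]:
    net.learn(line)

# After "my", the net has seen "account" three times and "amount" never.
choice = net.pick("my", ["amount", "account"])
print(choice)
```

Every new transcript passed to `learn` nudges the weights, so later calls to `pick` reflect everything seen so far. That is the same feedback loop described above, in miniature.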

Speech recognition is complemented and enhanced by Natural Language Processing (NLP), an umbrella term for many different types of language-driven interaction between people and computers. In our case, natural language processing techniques are used to analyze the structure and flow of the output of the Voci STT software, further reinforcing accurate and meaningful transcriptions.

AI has come a long way from the early days of simple rule-based systems, in which everything had to be predefined and arcane computers were required. Today, almost everyone uses AI tools whether they know it or not, with Google, Apple's Siri, or Microsoft's Cortana helping them search personal systems or the Internet. Every time you interact with these tools, they collect relevant data that they can reuse, for example to augment a search index or to add to their deep learning training material. Home devices such as Amazon's Alexa and Google Home have commercialized voice recognition, especially if you'd like to buy something from one of their partners or top-ten vendors!

Voci's software brings focused, detailed AI to call centers out of the box, and is flexible and powerful enough to target any business domain. Even Eliza doesn't have to ask, "How do you feel about that?" The proof is in the software and the results.

In the next few blog posts, I'll look at specific ways that customers are using Voci's AI and other tools to build smart solutions that derive actionable insights from customer interactions.


Bill von Hagen

Bill von Hagen is a writer who loves Linux, spicy food, and classic AI and UNIX workstations, though not necessarily in that order.
