State-of-the-Art Voice Scrubbing Technology Can Help Speech Analytics Systems Avoid the GDPR Minefield

David Garrod December 04, 2018

Call center recordings contain a trove of valuable insights about customer satisfaction, service quality, agent performance, and customer churn. Speech analytics systems uncover these insights through speech-to-text conversion of the recorded calls, followed by analysis of the converted text – often using traditional text analytics techniques – to reveal patterns in a firm’s customer interactions.

Overlaying this analytics flow is a web of data privacy regulations, including the EU’s General Data Protection Regulation (GDPR), which took effect in the spring of 2018. The GDPR imposes strict rules on the storage and processing of personal information, such as customer names, phone numbers, credit card numbers, and the like. Because such personal information is not needed to support the text analytics process, best practices call for it to be redacted from the converted text, so that it is neither stored nor processed in violation of the GDPR. Additionally, if call recordings are to be retained, then best practices further require that the identified personal information be masked in the recorded audio.
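To make the text-level step concrete, here is a minimal sketch of transcript redaction. It assumes simple regular-expression patterns for card and phone numbers; the pattern names, the redact_transcript function, and the placeholder tags are illustrative only, and a production system would pair such patterns with trained entity recognition for names, addresses, and other personal data.

```python
import re

# Illustrative patterns only; a real deployment would combine these with
# trained named-entity recognition for names, addresses, and the like.
PII_PATTERNS = {
    "CARD_NUMBER": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}[-. ]?\d{3}[-. ]?\d{4}\b"),
}

def redact_transcript(text: str) -> str:
    """Replace recognizable personal data with placeholder tags."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact_transcript("My card is 4111 1111 1111 1111, call 412-555-0137."))
# -> My card is [CARD_NUMBER], call [PHONE_NUMBER].
```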

While such “audio redaction” processes represent the current state of the art in privacy compliance for stored audio, they still leave certain GDPR risks unaddressed. For example, any personal data that the speech-to-text engine misrecognizes (e.g., “Ty Jones” converted as “my phone”) will escape the audio redaction process. Moreover, because a person’s recorded voice can be used to create a “voiceprint” that later identifies that person, the recording itself qualifies as personal information under the GDPR, even if nothing of a personal nature was communicated.

One simple “solution” that avoids GDPR risk would be to discard the audio recordings after they are converted to text. But this impairs the usefulness of the overall system. For instance, a manager who receives an analytics alert that one of her agents is performing poorly would naturally want to listen to the recorded calls behind the alert. Finding that those calls had been discarded would leave the manager unable to verify whether the agent actually said the words that triggered the alert.

A better solution is to improve the traditional audio redaction process to meet the requirements of modern privacy laws such as the GDPR. Fortunately, technology exists that can address the shortcomings of traditional audio redaction. The problem of misrecognized words escaping the redaction process can be addressed by using a recognition confidence metric to identify additional spoken words that should be redacted: if the speech-to-text engine reports low confidence in its conversion of certain words, those “low confidence” words can also be automatically redacted from the recording, since there is no way to know whether they contain personal information that needs to be scrubbed. Furthermore, the “voiceprint” threat to personal privacy can be addressed by applying de-identification technology to the recorded audio, effectively making everybody’s recorded voice sound the same and thus rendering “voiceprints” useless. These improvements produce redacted call recordings that retain their value as a business tool, yet present far lower risk to the business under the GDPR.
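A minimal sketch of how confidence-gated redaction might be wired together is shown below. It assumes the speech-to-text engine returns word-level timestamps and confidence scores, and that an entity-recognition step has already flagged personal data; the Word fields, the 0.80 threshold, and the spans_to_mask and mask_audio helpers are illustrative assumptions, not any particular product’s API.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float       # start time in seconds
    end: float         # end time in seconds
    confidence: float  # engine's recognition confidence, 0.0 - 1.0
    is_pii: bool       # flagged as personal data by the text-analytics step

# Assumed cutoff; in practice this would be tuned per deployment.
CONFIDENCE_THRESHOLD = 0.80

def spans_to_mask(words: list[Word]) -> list[tuple[float, float]]:
    """Mask words flagged as personal data and any word the engine was
    unsure about, since a low-confidence word may hide personal data."""
    return [(w.start, w.end) for w in words
            if w.is_pii or w.confidence < CONFIDENCE_THRESHOLD]

def mask_audio(pcm: bytearray, sample_rate: int,
               spans: list[tuple[float, float]],
               bytes_per_sample: int = 2) -> None:
    """Overwrite the masked time spans with silence in a raw PCM buffer."""
    for start, end in spans:
        lo = int(start * sample_rate) * bytes_per_sample
        hi = min(int(end * sample_rate) * bytes_per_sample, len(pcm))
        pcm[lo:hi] = b"\x00" * (hi - lo)
```

Under this approach, the words “Ty Jones” from the earlier example would be silenced in the audio even though they were transcribed as “my phone,” provided the engine reported low confidence for that stretch of the conversion.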

David Garrod

David Garrod, M.S.E.E., Ph.D., J.D., is the General Counsel of Voci Technologies, Inc., as well as its designated GDPR Data Protection Officer.
