2Hz Voice Library: AI-powered voice audio enhancement

We are thrilled to announce 2Hz Voice Library – a software library in C implementing next-generation voice enhancement algorithms.

The library is designed to be integrated into a wide range of devices and applications: headsets, smartphones, mobile apps, media servers, cars, radio devices, and more. Effectively, anything that interacts with a human voice.

2Hz Voice Library implements two features: Voice Activity Detection and Noise Suppression.

Both algorithms are powered by 2Hz’s specially designed Deep Neural Network and significantly outperform existing solutions on the market.

Noise Suppression

2Hz noise suppression technology supports audio streams from any source, be it a single microphone or a stream arriving over the network.

Details can be found in our guest post on the NVIDIA Developer Blog.

Because the library is fully software-based and doesn’t require a multi-microphone system, it can suppress noise in both directions: outbound and inbound.

The technology works well on both stationary and non-stationary noise.

Technical Specs:

Min Requirements: runs real-time on 200MHz ARM CPU
Platforms: ARM, Intel x86, NVIDIA GPU, Intel GNA, Qualcomm Hexagon
I/O: receives a 30ms audio frame (PCM 16), returns processed 30ms frame
Algorithmic Latency: 15ms
Quality: Increases PESQ by 1.7 on average
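
The I/O contract above (a 30ms PCM 16 frame in, a processed 30ms frame out) can be sketched as a simple frame loop. This is only an illustration under stated assumptions, not the library's actual API: the 16 kHz mono sample rate is assumed (the specs do not state one), and `frame_fn`, `passthrough_frame` and `process_stream` are hypothetical names. The placeholder processor just copies samples where a real noise suppression call would go.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 16 kHz mono audio; the specs above do not state a sample rate. */
#define SAMPLE_RATE_HZ 16000
#define FRAME_MS       30
#define FRAME_SAMPLES  (SAMPLE_RATE_HZ * FRAME_MS / 1000) /* 480 samples */

/* Hypothetical callback type standing in for a per-frame processing call. */
typedef void (*frame_fn)(const int16_t *in, int16_t *out, size_t n, void *ctx);

/* Placeholder processor: copies the frame unchanged.
 * A real integration would call the noise suppression routine here. */
void passthrough_frame(const int16_t *in, int16_t *out, size_t n, void *ctx)
{
    (void)ctx;
    memcpy(out, in, n * sizeof *in);
}

/* Feeds every complete 30 ms frame of `in` to `fn`, writing results to `out`.
 * Returns the number of frames processed. A trailing partial frame is left
 * untouched so the caller can buffer it until more audio arrives. */
size_t process_stream(const int16_t *in, int16_t *out, size_t n_samples,
                      frame_fn fn, void *ctx)
{
    size_t frames = n_samples / FRAME_SAMPLES;
    for (size_t i = 0; i < frames; i++) {
        fn(in + i * FRAME_SAMPLES, out + i * FRAME_SAMPLES, FRAME_SAMPLES, ctx);
    }
    return frames;
}
```

With these assumptions, a buffer of 1000 samples yields two processed frames (960 samples), and the remaining 40 samples wait for the next read.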

This technology currently powers the Krisp app, which is used by tens of thousands of users worldwide.

Voice Activity Detection

Voice Activity Detection (VAD) has numerous use cases – wake word detection in Alexa-like applications, gaming, network bandwidth-sensitive voice apps, codecs, etc.

2Hz VAD has unmatched Precision and Recall scores in noisy environments. Its results are much better than those of the VAD found in WebRTC and other alternatives.
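
For readers less familiar with the two metrics: precision is the share of frames flagged as speech that really contain speech, and recall is the share of real speech frames that were flagged. The sketch below computes both from per-frame counts. The function names are hypothetical and this is not the evaluation code behind the published numbers.

```c
/* Precision: of all frames flagged as speech (true positives + false
 * positives), the fraction that really were speech. Returns 0 when
 * nothing was flagged, to avoid dividing by zero. */
double vad_precision(unsigned long true_pos, unsigned long false_pos)
{
    unsigned long flagged = true_pos + false_pos;
    return flagged == 0 ? 0.0 : (double)true_pos / (double)flagged;
}

/* Recall: of all frames that really were speech (true positives + false
 * negatives), the fraction the detector flagged. Returns 0 when the
 * input contained no speech frames. */
double vad_recall(unsigned long true_pos, unsigned long false_neg)
{
    unsigned long actual = true_pos + false_neg;
    return actual == 0 ? 0.0 : (double)true_pos / (double)actual;
}
```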

Technical Specs:

Min Requirements: runs real-time on 200MHz ARM CPU
Platforms: ARM, Intel x86, NVIDIA GPU, Intel GNA, Qualcomm Hexagon
I/O: receives a 10ms audio frame (PCM 16), returns a probability
Quality: Precision: 97%, Recall: 95% in low SNR (high noise) situations
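
Because the VAD returns a probability for each 10ms frame, the application decides how to turn it into a speech/non-speech decision. One common approach, sketched here under assumptions, is a threshold combined with a short hangover so that brief dips in probability do not cut off the ends of words. The `vad_gate` type, its functions and the example parameter values are hypothetical and are not part of the library.

```c
#include <stdbool.h>

/* Hypothetical helper that converts per-frame speech probabilities
 * into a steadier speech / non-speech decision. */
typedef struct {
    float threshold;       /* probability at or above which a frame is speech */
    int   hangover_frames; /* frames still treated as speech after a drop */
    int   remaining;       /* hangover frames left in the current run */
} vad_gate;

void vad_gate_init(vad_gate *g, float threshold, int hangover_frames)
{
    g->threshold = threshold;
    g->hangover_frames = hangover_frames;
    g->remaining = 0;
}

/* Call once per 10 ms frame with the probability reported for that frame.
 * Returns true if the frame should be treated as speech. */
bool vad_gate_update(vad_gate *g, float prob)
{
    if (prob >= g->threshold) {
        g->remaining = g->hangover_frames; /* speech seen: re-arm the hangover */
        return true;
    }
    if (g->remaining > 0) {
        g->remaining--;                    /* below threshold, still in hangover */
        return true;
    }
    return false;
}
```

A bandwidth-sensitive voice app could, for example, transmit only the frames for which `vad_gate_update` returns true. The threshold and hangover length would need tuning for each use case.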

Use Cases

Use cases vary widely. Below are some of them:

  • Better two-way call quality (Telephony, Conferencing, VoIP, …)
  • Noise removal for voice messages (WeChat, WhatsApp, …)
  • Simpler microphone placement for devices (Phones, Cars, …)
  • Flexible audio/video publishing experience (Podcasts, Live Streams, …)
  • Better push-to-talk experience (First responders, Police, …)

Start using the library

Please leave your details and we will contact you.

We would be thrilled to learn about your use case.