Announcing 2Hz Speech Enhancement API, now in Preview

We are excited to announce that the 2Hz Krisp API is now in Preview. Please apply for access if you would like to try it out.

This is the first Speech Enhancement API of its kind.

The API can make three enhancements to speech, all powered by state-of-the-art Deep Learning technologies.

/recover – fixes voice breakups in the audio caused by bad network conditions. Learn more about fixing voice breakups in this blog post.

/denoise – suppresses background noise and leaves only the human voice in the audio. Learn more about noise suppression in this blog post.

/expand – expands the audio resolution from lowband to wideband. Learn more about bandwidth expansion in this blog post.

An example request looks like this:

Continue reading Announcing 2Hz Speech Enhancement API, now in Preview

Meet Krisp: AI Powered Noise Suppression on your Laptop

At 2Hz we are building the world’s best Noise Suppression technology, and we’ve achieved incredible quality.

We use Deep Learning in our approach, and our latest algorithm increases the MOS (Mean Opinion Score) by 1.2 points on average on a large dataset of noisy speech. This is a remarkable result. So far we haven’t seen any other tech come close to this.

Meet Krisp

We’re happy to share today that we are bringing this technology to YOUR FINGERTIPS.

For the last 4 months we have been baking Krisp (krisp.ai), an app designed for laptops. Once installed, it upgrades your laptop’s microphone and speaker and adds a magical “Mute Noise” button to use during conference calls.

You can use Krisp with any Conferencing App you prefer, out of the box.

The cool thing is: you can both mute the noise going from you to the conference participants and also mute the noise coming from them to you. Bi-directional mute.

And it’s free.

Krisp entered Private Beta last week. Learn more about it in this blog post, and subscribe for the Beta here.

Fixing Voice Breakups with Deep Learning

At 2Hz we are continuously rethinking traditional approaches to known problems in voice processing and disrupting them by applying deep learning.

Our last article discussed the problem of bandwidth expansion in voice audio.

This time it’s PLC’s turn.

Packet loss, and the Packet Loss Concealment (PLC) that tries to hide it, is a well-known problem in voice communications, familiar to every telecommunication user in the world. Literally everyone who has used VoIP apps or a cellular phone has experienced “chopped voice”. When network conditions are bad, our voice cuts off and sounds annoying and funny. Early Skype users remember this very well. We sound like “he e ey how aaaaaare yoo ooo?”

In this article we demonstrate how our Deep Neural Network (DNN) powered PLC algorithm (krispNet-PLC) compares to existing state-of-the-art PLC technologies.
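To make the problem concrete, here is a minimal sketch of one of the simplest classical concealment strategies: when a frame is lost, repeat the previous frame at reduced volume. This is not krispNet-PLC, and it is not necessarily one of the technologies compared in the article. The function name `conceal_packet_loss`, the `decay` parameter, and the tiny two-sample frames are assumptions made purely for illustration.

```python
from typing import List, Optional

Frame = List[float]


def conceal_packet_loss(
    frames: List[Optional[Frame]], frame_len: int, decay: float = 0.5
) -> List[Frame]:
    """Fill lost frames (None) with an attenuated copy of the previous output frame.

    Hypothetical baseline for illustration only, not the krispNet-PLC algorithm.
    """
    output: List[Frame] = []
    last: Frame = [0.0] * frame_len  # silence until the first frame arrives
    for frame in frames:
        if frame is None:
            # Lost packet: reuse the previous output frame, scaled down so that
            # a long run of losses fades towards silence instead of buzzing.
            frame = [sample * decay for sample in last]
        output.append(frame)
        last = frame
    return output


# Four frames of two samples each; the second and third were lost in transit.
received = [[1.0, 1.0], None, None, [0.5, -0.5]]
concealed = conceal_packet_loss(received, frame_len=2)
print(concealed)
# → [[1.0, 1.0], [0.5, 0.5], [0.25, 0.25], [0.5, -0.5]]
```

Repetition like this hides short gaps but cannot predict how the voice would actually have continued, which is the part a learned model aims to improve.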

The full article can be found here:

https://2hz.ai/blog/fixing-voice-breakups/index.html

HD Voice Playback with Deep Learning

2Hz is committed to developing technologies which improve Voice Audio Quality in Real Time Communications.

One contributor to poor voice quality is the legacy infrastructure powered by the G.711 codec, which is based on 8kHz sampling. While most of our phones can capture wideband audio (up to 48kHz), the codecs used by cellular networks downsample audio to 8kHz (lowband audio).

8kHz-sampled audio captures the frequency range to which the human ear is most sensitive; however, our voice still sounds like it is “coming from a tunnel” and is not pleasant enough. This is because the higher frequencies of our voice are absent from the audio.

Artificial Bandwidth Expansion (we call it HD Voice Playback) refers to upsampling lowband audio to wideband audio in a way that improves voice quality. This technique has been around for many years. For example, you can use the open source ffmpeg tool to perform artificial expansion. ffmpeg upsamples the audio to 16kHz; however, it doesn’t enrich it. The end result still sounds like it is coming from a tunnel.
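Why can’t plain upsampling enrich the audio? Audio sampled at 8kHz can only represent frequencies up to half that rate (4kHz, the Nyquist limit), so anything above it is simply not in the samples to begin with. The small sketch below illustrates this under stated assumptions: the helper `sample_cosine` and the 1kHz/7kHz test tones are chosen only for illustration. At an 8kHz sample rate, the two tones produce the same samples, so no resampler could tell afterwards which one was originally spoken.

```python
import math


def sample_cosine(freq_hz: float, sample_rate_hz: int, num_samples: int) -> list:
    """Return num_samples samples of a cosine tone at the given sample rate."""
    return [
        math.cos(2.0 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


SAMPLE_RATE_HZ = 8000              # lowband telephony sample rate
nyquist_hz = SAMPLE_RATE_HZ / 2    # highest frequency the samples can represent

low_tone = sample_cosine(1000, SAMPLE_RATE_HZ, 64)   # below the Nyquist limit
high_tone = sample_cosine(7000, SAMPLE_RATE_HZ, 64)  # above the Nyquist limit

# Largest difference between the two sampled tones (rounding error only).
max_diff = max(abs(a - b) for a, b in zip(low_tone, high_tone))

print(f"Nyquist limit: {nyquist_hz:.0f} Hz")
print(f"Largest sample difference: {max_diff:.2e}")
```

The difference comes out as floating point rounding noise. Since the high-frequency content is not recoverable from the samples, restoring it means predicting it, which is where a trained model comes in.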

In this article we describe a Deep Learning based approach to HD Voice Playback. We call the DNN we designed krispNet. The full article can be found here:

http://2hz.ai/blog/hd-voice-playback/index.html

Noise Cancellation: Server-Side or Device-Side

We discussed traditional multi-mic based noise cancellation in the previous post. Such technologies can be applied only on the user’s device (phone, laptop), where multiple mics are available.

In this post we will discuss the challenges of running noise cancellation technology on the server side.

When we built a fully software based noise cancellation technology at 2hz.ai, a profound question came up: why can’t we run this technology on the server side rather than on phones or laptops?

There is a big value proposition for Communications Service Provider companies here: regardless of which devices their users are using, all of these conversations can be noise cancelled at the backend.

See, when a new iPhoneX with better noise cancellation comes out, it doesn’t have much impact on a Service Provider such as Twilio, RingCentral, Fuze or WebEx. This is because iPhoneX is only a fraction of their overall device population. But if they could noise cancel (denoise) every communication regardless of the user’s device, there would be big value in it.

Even better: when you are in the backend, you have access to both legs of a call and can denoise both. So you make life “noise-free” not only for your own users but potentially also for everyone they are talking to (users outside your network).

Sounds like a no-brainer. However, it isn’t as simple as it sounds. Let’s now talk about some of the challenges.

Continue reading Noise Cancellation: Server-Side or Device-Side

Noise Cancellation: State of the Art

At 2Hz we’ve spent the last 1.5 years building a disruptive Noise Cancellation technology powered by Deep Neural Networks. And I must say it has been a tough road.

In this article I’ll share with you the current state of the art for Noise Cancellation.

Before I start I want to clarify what exactly I mean by Noise Cancellation.

When I say Noise Cancellation I mean suppressing both the noise going from the caller to the other end and the noise coming to the caller from the other end. Imagine you are in a subway and you call a friend who is at the airport. By Noise Cancellation we mean suppressing the subway noise before it is sent to your friend (while you might still hear it) and also suppressing the airport noise coming from their environment to you.

Active Noise Cancellation (ANC) refers to suppressing unwanted noise coming to your ears from the external environment surrounding you. For Active Noise Cancellation you typically need headphones (such as Bose QuietComfort).

Active Noise Cancellation

In this article we will focus only on Noise Cancellation and not Active Noise Cancellation.

Continue reading Noise Cancellation: State of the Art