We previously wrote about our work on deep neural networks for speech enhancement. In late August, we presented our newest results as a paper and a poster at the speech technology conference Interspeech 2017 in Stockholm, Sweden.
A lot of people could use some help with their hearing, but getting a hearing aid has traditionally been a big, time-consuming, and expensive step. As we reported earlier, the Oslo-based company Listen has therefore been developing an app in collaboration with SINTEF that turns your iPhone into a hearing aid.
Using deep learning to improve the intelligibility of noise-corrupted speech signals
Speech is key to our ability to communicate. We already use recorded speech to communicate remotely with other humans, and we will get more and more used to machines that simply ‘listen’ to us. However, we want our phones, laptops, hearing aids, and voice-controlled or Internet of Things (IoT) devices to work in every environment, and most environments are noisy.
This creates the need for speech enhancement techniques that remove noise from recorded speech signals. Yet, as of today, there are no noise-filtering strategies that significantly help people understand single-channel noisy speech, and even state-of-the-art voice assistants fail miserably in noisy environments. Some recent publications on speech enhancement suggest that deep learning, a machine learning subfield based on deep neural networks (DNNs), could become a game-changer in the field. See for example reference  below.
In this blog post we will go through a relatively simple application of deep learning to speech enhancement. Scroll down to the end of this post if you just want to hear what the resulting enhanced samples can sound like.
Sharleen has a hearing impairment, and couldn’t understand what the teacher was saying. Her father thought he needed her at home to look after the goats. Unfortunately, Sharleen is just one among many children lacking help.
WHO has estimated that over 5% of the world population – 360 million people – has a hearing impairment (328 million adults and 32 million children), and the majority of children with hearing impairment live in low-income countries. In contrast, less than 2% of the hearing aids produced in 2005 went to low-income countries.
Traditional hearing devices are advanced equipment: expensive, fragile and not developed for the Third World. The individual fitting process requires specialised personnel and complex infrastructure, which reduces the usefulness of such complex hearing aids to a minimum in low-income countries, where trained people and specialists are scarce.
With funding from the Norwegian Research Council, SINTEF’s project “I Hear You”, which started in early 2017, aims to help children like Sharleen by ensuring access to education for the hearing impaired.
Horns are used in many fields, including musical wind instruments and loudspeakers. The physics in the two cases is of course the same: sound propagation in a flaring duct open at one end. Therefore we can in principle use the same simulation methods for both cases. But what we want to obtain from the horn simulation can be very different.
A very important requirement for horn loudspeakers is directivity control. This entails directing sound into a specific region in front of the horn, giving the same frequency response inside that region and little sound outside it. Any simulation method for horn speakers must be capable of predicting directivity. Horn speakers should not be resonant, but should present a constant and smooth acoustic load to the driving unit, so this is also an important, but somewhat less critical, factor in the design.
For wind instruments, we are usually interested in the resonance frequencies. These are important for the tone, intonation and playability, and it is useful if we can predict them when designing the instrument. Any simulation method must therefore be able to predict these frequencies accurately. Or we may have a valuable old instrument and want to find the internal shape without cutting it into pieces. Then we can use an optimization algorithm to solve the inverse problem of finding the internal shape from measured resonance frequencies. For this, the simulation method must be fast, because the optimizer has to run it many times.
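To make the inverse problem concrete, here is a minimal Python sketch. It is not the simulation method discussed in this post: it assumes an idealised cylindrical tube closed at one end, whose resonances are f_n = (2n - 1) c / (4 L), and it recovers a single unknown, the length L, with a golden-section search. The function names, the speed of sound, and the search bounds are all illustrative assumptions. A real instrument has a flaring bore described by many parameters, but the structure is the same: a fast forward model inside an optimization loop.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def resonances(length, count):
    """Forward model (assumed): ideal cylinder closed at one end, open at the other."""
    return [(2 * n - 1) * SPEED_OF_SOUND / (4.0 * length)
            for n in range(1, count + 1)]


def misfit(length, measured):
    """Sum of squared differences between predicted and measured frequencies."""
    predicted = resonances(length, len(measured))
    return sum((p - m) ** 2 for p, m in zip(predicted, measured))


def fit_length(measured, lo=0.1, hi=3.0, tol=1e-6):
    """Golden-section search for the tube length that minimises the misfit.

    The bounds lo and hi (in metres) are illustrative and must bracket the
    true length.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        # The forward model is evaluated at every step, which is why it
        # has to be fast.
        if misfit(c, measured) < misfit(d, measured):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0


# Synthetic "measurements" generated from a 0.6 m tube (not real data).
measured = resonances(0.6, 4)
estimated = fit_length(measured)
print(f"estimated length: {estimated:.4f} m")
```

The search works here because the misfit has a single minimum in the length. With a full bore profile the misfit generally has many parameters and possibly several local minima, so a more capable optimizer would be needed.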
Noise-induced hearing loss (NIHL) is one of the most common occupational diseases. This is the case even though most countries have legislation specifying how much sound employees may be exposed to. New models for NIHL therefore seem necessary to reduce the risk of developing hearing disorders.
In the Norwegian petroleum industry much attention has been paid to occupational noise and hearing damage in the last decade. Statoil ASA has, in collaboration with Honeywell, been involved in several projects at SINTEF with this in mind. The current ongoing project is called Next Step (Noise Exposure Tackled Safely Through Ear Protection).
The 1920s saw much development in horn loudspeakers, and loudspeakers in general. Western Electric already had their microphones, amplifiers, straight exponential horns, and very good balanced armature transducers. At this time, much research was also put into disc recording and reproduction at the Western Electric Engineering Department, and simultaneously, optical recording of sound was also in progress, using Wente’s Light Valve. The time seemed ripe to attempt sound film. The story has been told elsewhere, but in short, most of the industry turned down Western Electric’s offer. They “knew” sound film would not work. But Warner Bros. found in the WE system something that could help them beat the big guys in the industry, and after the success of their first sound film, the rest is history.
It is hard to tell when horns were first used. They have been in use for thousands of years as instruments, and man must early have discovered the amplifying effect of a pair of cupped hands in front of his mouth, or behind his ears. Ear trumpets were early implementations of this, and the first hearing aids.
Horns were used on phonographs and gramophones from the start. This was the only way to get the required volume from the tiny motions of the needle. The theoretical understanding of horns was still limited, though, and most of the work was experimental. Early models used conical horns, but as theory progressed, the superiority of the exponential horn was recognized.
Reis’ telephone was perhaps the first loudspeaker of any kind, as it employed a magnetostriction driver mounted in a resonating box. But it would still take many years before inventors discovered the virtues of baffles and enclosures. As Hunt puts it, the baffle is probably the most frequently rediscovered feature of loudspeaker art. Stokes, in 1868, pointed out that the radiation efficiency could be improved by preventing air circulation around the edges of a vibrating surface (the acoustic short-circuit). Rayleigh, a few years later, gave the now classic analysis of the radiation from a piston in an infinite baffle. But by the time loudspeakers were being produced in great numbers, Rayleigh’s Theory of Sound had been out of print for more than two decades, and many inventors discovered the baffle before they discovered Rayleigh.
The invention of the telephone set off a wave of creativity, and almost all conceivable transducer mechanisms were tried out in the 1870s and 80s. Some of them developed into usable devices, others serve mainly as illustrations of man’s creativity. In this part, some of them will be presented, ranging from useful, mainstream designs to the downright bizarre.