MIP3 Modelling speech intelligibility based on the signal-to-noise ratio in the modulation domain


The main goal of this project is to find explanations for the consonant confusions typically made by listeners in difficult listening situations. This is approached from a modelling perspective: a model will be developed that predicts the average pattern of consonant confusions made by normal-hearing and hearing-impaired listeners when presented with nonsense syllables.

Background: This project is concerned with “microscopic” speech intelligibility, meaning that the focus is on the fundamental building blocks of speech, i.e., the phonemes ("p", "t", "i", etc.). Phonemes tend to be confused with one another in adverse conditions, often following a specific confusion pattern; for example, “b” may be heard as “n” in a certain amount of background noise. In the approach used here, additional “high-level” information - such as vocabulary and the syntactic structure of the language - is intentionally neglected in order to investigate how the acoustic information is decoded by the auditory system, leading to our perception of speech.

Approach: The confusions will be investigated using an auditory signal processing model that receives the same acoustic signal as the human listener. To obtain a successful model, much attention must be paid to which acoustic characteristics the model considers. In this project, one of these characteristics is the slow fluctuations in the level of the signal, which have been shown to be crucial for “macroscopic” speech intelligibility, i.e., the intelligibility of meaningful sentences. The goal is to investigate to what extent these level fluctuations also matter at the microscopic level, i.e., for the recognition and confusion of individual phonemes.
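The idea of comparing the slow level fluctuations of speech and noise can be illustrated with a toy calculation. The sketch below is a simplified illustration of a modulation-domain SNR, not the project's actual model, and all function names are hypothetical: it extracts a signal's temporal envelope via the Hilbert transform, measures the power of the low-frequency envelope fluctuations, and expresses the ratio between a speech-like, amplitude-modulated tone and steady noise in dB.

```python
import numpy as np
from scipy.signal import hilbert

def envelope(x):
    """Temporal envelope as the magnitude of the analytic signal."""
    return np.abs(hilbert(x))

def modulation_power(env, fs, fmax=16.0):
    """Power of envelope fluctuations below fmax Hz (DC removed).

    Slow modulations (roughly 1-16 Hz) carry much of the
    information relevant to speech intelligibility.
    """
    env = env - env.mean()
    spec = np.fft.rfft(env)
    freqs = np.fft.rfftfreq(len(env), 1.0 / fs)
    band = (freqs > 0) & (freqs <= fmax)
    return np.sum(np.abs(spec[band]) ** 2) / len(env)

def envelope_snr_db(speech, noise, fs):
    """Hypothetical modulation-domain SNR: ratio of the
    low-frequency envelope modulation power of the speech
    signal to that of the noise, in dB."""
    p_speech = modulation_power(envelope(speech), fs)
    p_noise = modulation_power(envelope(noise), fs)
    return 10.0 * np.log10(p_speech / p_noise)

# Demo: a tone amplitude-modulated at 4 Hz (a typical syllable
# rate) versus steady Gaussian noise of the same duration.
fs = 8000
t = np.arange(int(fs * 1.0)) / fs
speech_like = (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)
noise = np.random.default_rng(0).normal(size=t.size)
print(envelope_snr_db(speech_like, noise, fs))
```

Because the modulated tone concentrates its envelope power at 4 Hz while the noise envelope fluctuates broadly across modulation frequencies, the resulting modulation-domain SNR is strongly positive, even though the two waveforms may have similar overall levels.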

Relevance: This research is highly relevant for applications such as hearing aid development, automatic speech recognition, and speech synthesis. By imitating the excellent speech recognition capabilities of the human auditory system, one may help recover the information lost to hearing impairment, making speech intelligible again for hearing-impaired individuals. Furthermore, this knowledge can help make computer applications understand speech commands more accurately and produce more intelligible speech.

Main host institution: Technical University of Denmark

Second host institution: University College London

Industry partner: Phonak AG




