Interspeech 2007
August 27-31, 2007

Antwerp, Belgium

ThD.SS - Novel techniques for the NATO non-native Air Traffic Control and HIWIRE cockpit databases

Thursday, August 30, 2007, Room Astrid Scala 1

Session Chairs: David van Leeuwen, TNO Human Factors and Alex Potamianos, Technical University of Crete

After having studied various aspects of speech in noise, speech under stress, and non-native speech, the NATO research task group on speech and language technology, IST-RTG013, has been studying the effects of all of these factors on various speech technologies. To this end, the task group has collected a corpus of military Air Traffic Control communication in Belgian air space. This speech material consists predominantly of non-native English speech, under varying noise and channel conditions. The database consists of 16 hours of training speech, plus one hour of development and evaluation test sets, and has been annotated at several levels:

  • Word transcriptions, which allow research to be conducted on automatic speech recognition and named entity extraction,
  • Speaker turns, identified by call signs, allowing for research in speaker recognition and clustering and tracking of conversations.

The NATO research task group is making this annotated speech database available to speech researchers who want to develop novel algorithms for this challenging material. These new algorithms could include noise-robust speaker recognition, robust speaker and accent adaptation for ASR, and context-driven named entity detection. To facilitate a common task, we have written a suggested test and evaluation plan to guide researchers. The NATO group invites speech technology researchers to report experiments on this database at this special session.

New: The HIWIRE cockpit database consists of English utterances (Direct Voice Input domain) spoken by non-native speakers. Various noise conditions are applied, and two research tracks (robust non-native and non-native adaptation) are defined. Training and testing scripts are provided for each research track. We encourage speech researchers to submit original work in the areas of feature extraction, feature normalization, pronunciation modeling, acoustic modeling, and adaptation.

Data order form

Research groups interested in obtaining the speech data can fill out this form, which on completion will be sent to the NATO research group. We will then ship the speech database together with the annotations at no cost.
New: To obtain a copy of the HIWIRE database please contact or and a copy will be mailed to you free of charge.

Program - oral session

16:00 - Introduction, welcome

16:05 - Presentation of the databases:

  • Description of the nn-MATC and HIWIRE databases, David van Leeuwen and Alexandros Potamianos, TNO Human Factors and Technical University of Crete
  • Design and characterization of the Non-native Military Air Traffic Communications database (nnMATC), Stephane Pigeon (Royal Military School), Wade Shen (MIT Lincoln Laboratory), Aaron Lawson (Air Force Research Laboratory), David van Leeuwen (TNO Human Factors)

16:20 - A Comparison of Speaker Clustering and Speech Recognition Techniques for Air Situational Awareness, Wade Shen and Douglas Reynolds, MIT Lincoln Labs

16:35 - Advanced Front-end for Robust Speech Recognition in Extremely Adverse Environments, Dimitrios Dimitriadis, Jose C. Segura, Luz Garcia, Vassilis Pitsikalis, Petros Maragos and Alexandros Potamianos, National Technical University of Athens and University of Granada

16:50 - Experiments on Hiwire database using Denoising and Adaptation with an hybrid HMM-ANN Model, Roberto Gemello, Franco Mana, and Stefano Scanzio, Loquendo and Politecnico di Torino

17:05 - Detection and Removal of Switching Noise in Push-to-Talk (PTT) and Voice Operated eXchange (VOX) Communications Systems, Brett Smolenski, Research Associates for Defense Conversion

17:20 - Evaluation of the Combined Use of MEMLIN and MLLR on the Nonnative Adaptation Task of Hiwire Project Database, Luis Buera, Antonio Miguel, Oscar Saz, Eduardo Lleida, Alfonso Ortega, University of Zaragoza

17:35 - Panel Discussion: "Effectiveness of sharing project-related speech databases amongst the research community", panelists: Roger Moore (Univ. Sheffield, UK), Alex Potamianos (Tech. Univ. of Crete, Greece), Jose Segura (University of Granada, Spain), Wade Shen (MIT Lincoln Laboratory, US), David van Leeuwen (TNO Human Factors, The Netherlands)

18:00 - End

Contact

Special Session's own web page: http://speech.tm.tno.nl/nn-matc/joint.html
Website for the NATO non-native Air Traffic Control task: http://speech.tm.tno.nl/nn-matc/
New: Website for the HIWIRE non-native Cockpit Communication task: http://speech.tm.tno.nl/nn-matc/hiwire.html

Session organizers:

David van Leeuwen
TNO Human Factors
P. O. Box 23
3769 ZG Soesterberg
The Netherlands

Alex Potamianos
Department of Electronic and Computer Engineering
Technical University of Crete
73100 Chania
Greece
