Interspeech 2007
August 27-31, 2007

Antwerp, Belgium

TuD.SS - Speech and Audio Processing for Intelligent Environments

Tuesday, August 28, 2007, Astrid Plaza hotel, Room Scala 1

Session chairs: Reinhold Haeb-Umbach, University of Paderborn, Germany and Zheng-Hua Tan, Aalborg University, Denmark

Ambient Intelligence (AmI) describes the vision of technology that is invisible, embedded in our surroundings and present whenever we need it. Interacting with it should be simple and effortless. The systems can think on their own and can make our lives easier with subtle or no direction.

Since the early days of this computing and interaction paradigm, speech has been considered a major building block of AmI. The purpose of speech and audio processing is twofold:

  • Support of explicit interaction: Speech as an input/output modality that facilitates the aforementioned simple and effortless interaction, preferably in cooperation with other modalities like gesture.
  • Support of implicit interaction: Speech, and acoustic signals in general, as a source of context information, such as “who speaks when and where”, to be utilized in systems that are context-aware, personalized, adaptive, or even anticipatory.

The goal of this special session is to give an overview of major achievements, but also to highlight major challenges. Does the state of the art in speech and audio processing meet the high expectations expressed in the scenarios of AmI, and will it ever? We would also like to address the perspectives and promising concepts for the future.

Program

16:00 – 16:20 Introduction to poster session

The poster session will start with a brief introduction to each contribution. Each poster presenter shall prepare a two-minute oral presentation, possibly supported by slides, to give the audience an overview of the poster. Slides must be prepared and uploaded according to the guidelines for oral presentations in regular sessions.

16:20 – 16:40 Invited Tutorial

Aki Härmä (Philips Research, Eindhoven): Ambient Telephony: scenarios and research challenges

Telecommunications at home is changing rapidly. Many people have moved from the traditional PSTN phone to the mobile phone. For a growing number of people, Voice-over-IP telephony on a PC platform is now becoming the primary technology for voice communications. In this tutorial paper we give an overview of some of the current trends and try to characterize the next generation of home telephony, in particular the concept of ambient telephony. We give an overview of the research challenges in the development of ambient telephone systems and introduce some potential solutions and scenarios.

16:40 – 18:00 Poster Session

  • Joint Speaker Segmentation, Localization and Identification for Streaming Audio, Joerg Schmalenstroeer and Reinhold Haeb-Umbach, University of Paderborn, Germany
  • Active binaural distance estimation for dynamic sources, Yan-Chen Lu, Martin Cooke and Heidi Christensen, University of Sheffield, UK
  • Always Listening to You: Creating Exhaustive Audio Database in Home Environments, Yasunari Obuchi and Akio Amano, Hitachi Ltd., Japan
  • A Packetization and Variable Bitrate Interframe Compression Scheme For Vector Quantizer-Based Distributed Speech Recognition, Bengt J. Borgström and Abeer Alwan, University of California, Los Angeles, USA
  • Channel Selection by Class Separability Measures for Automatic Transcriptions on Distant Microphones, Matthias Wölfel, Universität Karlsruhe, Germany
  • Conversation Detection and Speaker Segmentation in Privacy-Sensitive Situated Speech Data, Danny Wyatt, Tanzeem Choudhury and Jeff Bilmes, University of Washington and Intel Research, USA
  • Audio-based approaches to head orientation estimation in a smart-room, Alberto Abad, Carlos Segura, Climent Nadeu and Javier Hernando, Universitat Politècnica de Catalunya, Spain
  • Multi-Resolution Soft Features for Channel-Robust Distributed Speech Recognition, Valentin Ion and Reinhold Haeb-Umbach, University of Paderborn, Germany

Contact

Session organizers:
Prof. Dr. Reinhold Haeb-Umbach
Department of Communications Engineering
University of Paderborn, Germany

Prof. Dr. Zheng-Hua Tan
Department of Electronic Systems
Aalborg University, Denmark
