WeD.SS - Speech and language technology for less-resourced languages
Wednesday, August 29, 2007, Astrid Plaza hotel, Room Scala 1
Session Chairs: Briony Williams, University of Wales, UK; Mikel Forcada, Universitat d'Alacant, Spain; and Kepa Sarasola, University of the Basque Country, Spain
Speech and language technology researchers who work on less-resourced languages often have very limited access to funding, equipment and software. This makes it all the more important for them to come together to share best practice and avoid duplication of effort. This special session will therefore be devoted to speech and language technology for less-resourced languages.
In view of the limited resources available to the targeted researchers, there will be a particular emphasis on "free" software, which may be either open-source or closed-source. However, submissions are also invited from those using commercial software.
Program
16:00 – 18:00 Poster Session
A Morpho-graphemic Approach for the Recognition of Spontaneous Speech in Agglutinative Languages – like Hungarian, Péter Mihajlik, Tibor Fegyó, Zoltán Tüske, and Pavel Ircing, Budapest University of Technology and Economics, AITIA International (Hungary), and University of West Bohemia, Plzen (Czech Republic)
A Semi-Supervised Learning Approach for Morpheme Segmentation for An Arabic Dialect, Mei Yang, Jing Zheng, and Andreas Kathol, University of Washington and SRI International (USA)
Accelerating the Annotation of Lexical Data for Less-Resourced Languages, Gerhard Beukes van Huyssteen and Martin Johannes Puttkammer, Centre for Text Technology (CTexT), North-West University (South Africa)
On Web-based Speech Resource Creation for Less-Resourced Languages, Christoph Draxler, Institut für Phonetik und Sprachverarbeitung (Germany)
Building an Information Retrieval System for Serbian - Challenges and Solutions, Miroslav Martinovic, Srdan Vesic and Goran Rakic, College of New Jersey (USA) and University of Belgrade (Serbia)
Bootstrapping Morphological Analysis of Gĩkũyũ Using Unsupervised Maximum Entropy Learning, Guy De Pauw and Peter Waiganjo Wagacha, University of Antwerp (Belgium) and University of Nairobi (Kenya)
The VoiceTRAN Machine Translation System, Jerneja Zganec Gros and Stanislav Gruden, Alpineon Research (Slovenia)
MuLAS: A Framework For Automatically Building Multi-Tier Corpora, Sergio Paulo and Luis C. Oliveira, INESC-ID/IST (Portugal)
Creating multimedia dictionaries of endangered languages using LEXUS, Jacquelijn Ringersma and Marc Kemps-Snijders, Max Planck Institute for Psycholinguistics (The Netherlands)
IceNLP: A Natural Language Processing Toolkit for Icelandic, Hrafn Loftsson and Eiríkur Rögnvaldsson, Reykjavik University and University of Iceland (Iceland)
Phonotactic spoken language identification with limited training data, Marius Peche, Marelie Davel and Etienne Barnard, Meraka Institute (South Africa)
Automatic Speech Recognition for an Under-Resourced Language - Amharic, Solomon Teferra Abate and Wolfgang Menzel, University of Hamburg (Germany)
Information Retrieval Strategies for Accessing African Audio Corpora, Abdillahi Nimaan, Pascal Nocera, Frédéric Bechet, and Jean-François Bonastre, Laboratoire Informatique d’Avignon - UAPV (France) and Institut des Sciences et des Nouvelles Technologies - CERD (Djibouti)
Morfessor and VariKN machine learning tools for speech and language technology, Vesa Siivola, Mathias Creutz, and Mikko Kurimo, Helsinki University of Technology (Finland)
Towards Better Language Modeling for Thai LVCSR, Markpong Jongtaveesataporn, Issara Thienlikit, and Chai Wutiwiwatchai, Tokyo Institute of Technology (Japan) and National Electronics and Computer Technology Center (Thailand)
16:00 – 18:00 Extra demos (not in IS2007 proceedings)
SpeechIndexer in Action: Managing Endangered Formosan Languages, Jozsef Szakos and Ulrike Glavitsch, National Dong Hua University (Taiwan) and ETH Zurich (Switzerland)
A Portable Record Player for Wax Cylinders using a Laser-beam Reflection Method, Tohru Ifukube and Yasuyuki Shimizu, University of Tokyo and Japan Women's University (Japan)
ELAN: a Free and Open Source Multimedia Annotation Tool, Han Sloetjes, Albert Russel, and Alexander Klassmann, Max Planck Institute for Psycholinguistics (The Netherlands)