# Automatic Speech Recognition

Speech is the most natural and most important form of human communication. Speech recognition is therefore a natural interface for applications, and it makes possible applications that support human communication across language and cultural barriers.

Our research concentrates on techniques and algorithms that make speech and text applications robust in multilingual environments. This includes the rapid development of speech recognizers for new tasks and languages, where the time and cost required must be reduced significantly.

We believe this is an essential prerequisite for making voice-driven applications attractive to users and for spreading them to language regions for which few or no resources currently exist.

### Relevant Publications

2016
[53] Towards Automatic Transcription of ILSE – an Interdisciplinary Longitudinal Study of Adult Development and Aging. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16).
[52] Speech-Based Detection of Alzheimer's Disease in Conversational German. In INTERSPEECH 2016 – 17th Annual Conference of the International Speech Communication Association.
[51] Detection of Intra-Personal Development of Cognitive Impairment From Conversational Speech. In 12th ITG Conference on Speech Communication.
2015
[50] Error Signatures to identify Errors in ASR in an unsupervised fashion. In Proceedings of the Errare Workshop (ERRARE 2015).
[49] Cross-lingual Lexical Language Discovery from Audio Data Using Multiple Translations. In The 40th International Conference on Acoustics, Speech, and Signal Processing, Brisbane, Australia (ICASSP 2015).
[48] Continuous Speech Recognition from ECoG. In Sixteenth Annual Conference of the International Speech Communication Association.
[47] Syntactic and Semantic Features For Code-Switching Factored Language Models. In IEEE/ACM Transactions on Audio, Speech and Language Processing, volume 23.
[46] Brain-to-text: Decoding spoken phrases from phone representations in the brain. In Frontiers in Neuroscience, volume 9.
2014
[45] Investigating the Learning Effect of Multilingual Bottle-Neck Features for ASR. In The 15th Annual Conference of the International Speech Communication Association, Singapore (Interspeech 2014).
[44] Improving ASR Performance On Non-native Speech Using Multilingual and Crosslingual Information. In The 15th Annual Conference of the International Speech Communication Association, Singapore (Interspeech 2014).
[43] BioKIT - Real-time Decoder For Biosignal Processing. In The 15th Annual Conference of the International Speech Communication Association, Singapore (Interspeech 2014).
[42] Word Segmentation and Pronunciation Extraction from Phoneme Sequences Through Cross-Lingual Word-to-Phoneme Alignment. In Computer Speech & Language, Elsevier.
[41] Towards Automatic Speech Recognition without Pronunciation Dictionary, Transcribed Speech and Text Resources in the Target Language using Cross-Lingual Word-to-Phoneme Alignment. In The 4th Workshop on Spoken Language Technologies for Under-resourced Languages, St. Petersburg, Russia (SLTU 2014).
[40] GlobalPhone: Pronunciation Dictionaries in 20 Languages. In The 9th edition of the Language Resources and Evaluation Conference, Reykjavik, Iceland (LREC 2014).
[39] Web-based Tools and Methods for Rapid Pronunciation Dictionary Creation. In Speech Communication, volume 56, January 2014.
[38] Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping. In The 15th Annual Conference of the International Speech Communication Association, Singapore (Interspeech 2014).
[37] Combining Grapheme-to-Phoneme Converter Outputs for Enhanced Pronunciation Generation in Low-Resource Scenarios. In The 4th Workshop on Spoken Language Technologies for Under-resourced Languages, St. Petersburg, Russia (SLTU 2014).
[36] AKTIV: Multimodal Interaction System to Engage Patients with Dementia. Chapter in Technische Unterstützung für Menschen mit Demenz, KIT Scientific Publishing.
[35] Automatic Detection of Anglicisms for the Pronunciation Dictionary Generation: A Case Study on our German IT Corpus. In The 4th Workshop on Spoken Language Technologies for Under-resourced Languages, St. Petersburg, Russia (SLTU 2014).
[34] Features for Factored Language Models for Code-Switching Speech. In The 4th Workshop on Spoken Language Technologies for Under-resourced Languages, St. Petersburg, Russia (SLTU 2014).
[33] Comparing Approaches to Convert Recurrent Neural Networks into Backoff Language Models For Efficient Decoding. In The 15th Annual Conference of the International Speech Communication Association, Singapore (Interspeech 2014).
[32] Combining Recurrent Neural Networks and Factored Language Models During Decoding of Code-Switching Speech. In The 15th Annual Conference of the International Speech Communication Association, Singapore (Interspeech 2014).
2013
[31] Multilingual Multilayer Perceptron For Rapid Language Adaptation Between and Across Language Families. In 14th Annual Conference of the International Speech Communication Association, Lyon, France (Interspeech 2013).
[30] An Investigation of Code-Switching Attitude Dependent Language Modeling. In The 1st International Conference on Statistical Language and Speech Processing (SLSP 2013).
[29] Accent- and Speaker-Specific Polyphone Decision Trees for Non-Native Speech Recognition. In 14th Annual Conference of the International Speech Communication Association, Lyon, France.
[28] Pronunciation Extraction from Phoneme Sequences through Cross-Lingual Word-to-Phoneme Alignment. In The 1st International Conference on Statistical Language and Speech Processing (SLSP 2013).
[27] GlobalPhone: A Multilingual Text & Speech Database in 20 Languages. In The 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013).
[26] Unsupervised Language Model Adaptation for Automatic Speech Recognition of Broadcast News Using Web 2.0. In 14th Annual Conference of the International Speech Communication Association, Lyon, France (Interspeech 2013).
[25] Statistical Machine Translation based Text Normalization with Crowdsourcing. In The 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013).
[24] Rapid Bootstrapping of a Ukrainian Large Vocabulary Continuous Speech Recognition System. In The 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013).
[23] Experiments towards a better LVCSR System for Tamil. In 14th Annual Conference of the International Speech Communication Association, Lyon, France (Interspeech 2013).
[22] Recurrent Neural Network Language Modeling for Code Switching Conversational Speech. In The 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013).
[21] Combination of Recurrent Neural Networks and Factored Language Models for Code-Switching Language Modeling. In The 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria (ACL 2013).
2012
[20] Integration Of Language Identification Into A Recognition System For Spoken Conversations Containing Code-Switches. In The third International Workshop on Spoken Languages Technologies for Under-resourced Languages (SLTU'12).
[19] Multilingual Bottleneck Features and Its Application for Under-resourced Languages. In The third International Workshop on Spoken Languages Technologies for Under-resourced Languages, Cape Town, South Africa (SLTU'12).
[18] Initialization Schemes for Multilayer Perceptron Training and Their Impact on ASR Performance using Multilingual Data. In 13th Annual Conference of the International Speech Communication Association, Portland, Oregon (Interspeech 2012).
[17] A First Speech Recognition System For Mandarin-English Code-Switch Conversational Speech. In 37th International Conference on Acoustics, Speech, and Signal Processing, Kyoto, Japan (ICASSP 2012).
[16] Word Segmentation through Cross-Lingual Word-to-Phoneme Alignment. In The Fourth IEEE Workshop on Spoken Language Technology (SLT 2012).
[15] Hausa Large Vocabulary Continuous Speech Recognition. In The third International Workshop on Spoken Languages Technologies for Under-resourced Languages, Cape Town, South Africa (SLTU'12).
[14] Grapheme-to-Phoneme Model Generation for Indo-European Languages. In 37th International Conference on Acoustics, Speech, and Signal Processing, Kyoto, Japan (ICASSP 2012).
[13] Automatic Error Recovery for Pronunciation Dictionaries. In 13th Annual Conference of the International Speech Communication Association, Portland, Oregon (Interspeech 2012).
[12] Generating Exact Lattices in the WFST Framework. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Initial Experiments with Tamil LVCSR. In The International Conference on Asian Language Processing, Hanoi, Vietnam (IALP).
2011
[10] Rapid building of an ASR system for Under-Resourced Languages based on Multilingual Unsupervised Training. In 12th Annual Conference of the International Speech Communication Association (Interspeech 2011).
[9] Cross-language bootstrapping based on completely unsupervised training using multilingual A-stabil. In IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] Speech Recognition for Machine Translation in Quaero. In The International Workshop on Spoken Language Translation, San Francisco, USA (IWSLT 2011).
2010
[7] Rapid Bootstrapping of five Eastern European Languages using the Rapid Language Adaptation Toolkit. In 11th Annual Conference of the International Speech Communication Association, Makuhari, Japan (Interspeech 2010).
[6] Optimization On Vietnamese Large Vocabulary Speech Recognition. In 2nd Workshop on Spoken Languages Technologies for Under-resourced Languages.
[5] Multilingual A-stabil: A new confidence score for multilingual unsupervised training. In IEEE Workshop on Spoken Language Technology.
[4] Wiktionary as a Source for Automatic Pronunciation Extraction. In 11th Annual Conference of the International Speech Communication Association, Makuhari, Japan (Interspeech 2010).
[3] Text Normalization based on Statistical Machine Translation and Internet User Support. In 11th Annual Conference of the International Speech Communication Association, Makuhari, Japan (Interspeech 2010).
2009
[2] Vietnamese Large Vocabulary Continuous Speech Recognition. In Automatic Speech Recognition and Understanding.
2008
[1] Diacritization as a Translation Problem and as a Sequence Labeling Problem. In Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai'i (AMTA 2008).

Contact
Cognitive Systems Lab
Prof. Dr.-Ing. Tanja Schultz

Enrique-Schmidt-Str. 5
28359 Bremen
Germany

Phone: +49 (0) 421 218 64270