Detail of publication

Citation

Ircing, P. and Hoidekr, J. and Psutka, J.: Exploiting linguistic knowledge in language modeling of Czech spontaneous speech. Proceedings of LREC 2006, p. 2600-2603, ELRA, Paris, 2006.

Abstract

In this paper, we present a method for incorporating available linguistic information into a statistical language model used in an ASR system for transcribing spontaneous speech. We employ the class-based language model paradigm and use morphological tags as the basis for the word-to-class mapping. Since the number of distinct tags is at least one order of magnitude lower than the number of words, even in tasks with moderately sized vocabularies, the tag-based model can be estimated rather robustly even from relatively small text corpora. Unfortunately, this robustness goes hand in hand with the restricted predictive ability of the class-based model. Hence we apply a two-pass recognition strategy: the first pass is performed with a standard word-based n-gram model, and the resulting lattices are rescored in the second pass using the aforementioned class-based model.
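
The following is a minimal sketch of the idea described in the abstract, not the authors' implementation. It assumes a class-based bigram of the common form P(w_i | w_{i-1}) ≈ P(w_i | c_i) · P(c_i | c_{i-1}), with each word mapped to a single tag, and it replaces lattice rescoring with rescoring of an N-best list for brevity. The toy English corpus, the tag names, the probability floor, and the interpolation weight are all hypothetical.

```python
import math
from collections import Counter

# Toy tagged corpus; the tags are hypothetical stand-ins for morphological tags.
tagged_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

def train_class_bigram(sentences):
    """Collect counts for P(word | tag) and P(tag | previous tag)."""
    emit = Counter()        # (tag, word) counts
    tag_count = Counter()   # tag counts
    trans = Counter()       # (previous tag, tag) counts
    prev_count = Counter()  # counts of a tag appearing as history
    word2tag = {}
    for sent in sentences:
        prev = "<s>"
        for word, tag in sent:
            emit[(tag, word)] += 1
            tag_count[tag] += 1
            trans[(prev, tag)] += 1
            prev_count[prev] += 1
            word2tag[word] = tag  # assumption: one tag per word
            prev = tag
    return emit, tag_count, trans, prev_count, word2tag

def class_logprob(words, model, floor=1e-6):
    """Log-probability of a word sequence under the class-based bigram.

    Unseen events are floored at `floor` instead of being properly smoothed.
    """
    emit, tag_count, trans, prev_count, word2tag = model
    logp, prev = 0.0, "<s>"
    for w in words:
        tag = word2tag.get(w)
        if tag is None:  # out-of-vocabulary word
            logp += math.log(floor)
            prev = "<unk>"
            continue
        p_emit = emit[(tag, w)] / tag_count[tag]
        p_trans = trans[(prev, tag)] / prev_count[prev] if prev_count[prev] else 0.0
        logp += math.log(max(p_emit * p_trans, floor))
        prev = tag
    return logp

def rescore(nbest, model, weight=0.5):
    """Second pass: add the weighted class-model score to the first-pass score."""
    scored = [(fp + weight * class_logprob(words, model), words)
              for words, fp in nbest]
    return [words for _, words in sorted(scored, key=lambda x: x[0], reverse=True)]

model = train_class_bigram(tagged_corpus)

# Hypothetical first-pass output: (hypothesis, word n-gram log score).
nbest = [
    (["the", "barks", "dog"], -4.8),
    (["the", "dog", "barks"], -5.0),
]
best = rescore(nbest, model)[0]
```

In this toy case the first pass slightly prefers the ungrammatical hypothesis, while the tag-sequence model penalizes the unseen DET-VERB transition, so the second pass reorders the list. A real system would use proper smoothing and operate on lattices rather than an N-best list.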

Detail of publication

Title: Exploiting linguistic knowledge in language modeling of Czech spontaneous speech
Author: Ircing, P. ; Hoidekr, J. ; Psutka, J.
Language: English
Date of publication: 22 May 2006
Year: 2006
Type of publication: Papers in proceedings of reviewed conferences
Title of journal or book: Proceedings of LREC 2006
Page: 2600 - 2603
ISBN: 2-9517408-2-4
Publisher: ELRA
Address: Paris
Date: 22 May 2006 - 28 May 2006

Keywords

speech recognition, language modeling, class-based language models

BibTeX

@INPROCEEDINGS{IrcingP_2006_Exploitinglinguistic,
 author = {Ircing, P. and Hoidekr, J. and Psutka, J.},
 title = {Exploiting linguistic knowledge in language modeling of Czech spontaneous speech},
 year = {2006},
 publisher = {ELRA},
 booktitle = {Proceedings of LREC 2006},
 address = {Paris},
 pages = {2600--2603},
 ISBN = {2-9517408-2-4},
 url = {http://www.kky.zcu.cz/en/publications/IrcingP_2006_Exploitinglinguistic},
}