Detail of publication
Citation
Campr, P., Pražák, A., Psutka, J. V., Psutka, J.: Online Speaker Adaptation of an Acoustic Model using Face Recognition. Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013, Lecture Notes in Artificial Intelligence, vol. 8082, p. 378-385, Springer Berlin Heidelberg, 2013.
Abstract
We have proposed and evaluated a novel approach to online speaker adaptation of an acoustic model based on face recognition. Instead of the traditionally used audio-based speaker identification, we investigated the video modality for the task of speaker detection. A simulated online transcription created by a large-vocabulary continuous speech recognition (LVCSR) system for online subtitling is evaluated using speaker-independent acoustic models, gender-dependent models, and models of particular speakers. In the experiment, the speaker-dependent acoustic models were trained offline and switched online based on the decision of a face recognizer, which reduced the word error rate (WER) by 12% relative to the speaker-independent baseline system.
Title: | Online Speaker Adaptation of an Acoustic Model using Face Recognition |
---|---|
Author: | Campr Pavel ; Pražák Aleš ; Psutka Josef V. ; Psutka Josef |
Language: | English |
Year: | 2013 |
Type of publication: | Papers in journals |
Book title: | Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013 |
Series: | Lecture Notes in Artificial Intelligence |
Volume: | 8082 |
Page: | 378 - 385 |
DOI: | 10.1007/978-3-642-40585-3_48 |
ISBN: | 978-3-642-40584-6 |
ISSN: | 0302-9743 |
Publisher: | Springer Berlin Heidelberg |
Keywords
acoustic model, speaker adaptation, face recognition, multimodal processing, automatic speech recognition
BibTeX
@INCOLLECTION{CamprPavel_2013_OnlineSpeaker,
  author = {Campr, Pavel and Pra\v{z}\'{a}k, Ale\v{s} and Psutka, Josef V. and Psutka, Josef},
  title = {Online Speaker Adaptation of an Acoustic Model using Face Recognition},
  year = {2013},
  publisher = {Springer Berlin Heidelberg},
  volume = {8082},
  pages = {378--385},
  booktitle = {Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013},
  series = {Lecture Notes in Artificial Intelligence},
  ISBN = {978-3-642-40584-6},
  ISSN = {0302-9743},
  doi = {10.1007/978-3-642-40585-3_48},
  url = {http://www.kky.zcu.cz/en/publications/CamprPavel_2013_OnlineSpeaker},
}