Publications
Publication details
Citation
Campr Pavel; Pražák Aleš; Psutka Josef V.; Psutka Josef: Online Speaker Adaptation of an Acoustic Model using Face Recognition. Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013, Lecture Notes in Artificial Intelligence, vol. 8082, p. 378-385, Springer Berlin Heidelberg, 2013.
Abstract
We have proposed and evaluated a novel approach to online speaker adaptation of an acoustic model based on face recognition. Instead of the traditionally used audio-based speaker identification, we investigated the video modality for the task of speaker detection. A simulated online transcription produced by a large-vocabulary continuous speech recognition (LVCSR) system for online subtitling is evaluated using speaker-independent acoustic models, gender-dependent models, and models of particular speakers. In the experiment, the speaker-dependent acoustic models were trained offline and switched online based on the decision of a face recognizer, which reduced the word error rate (WER) by 12% relative to the speaker-independent baseline system.
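The model switching described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `FaceDecision` type, the confidence threshold, and the fallback order (speaker-dependent, then gender-dependent, then speaker-independent) are assumptions made for illustration only. The helper `relative_wer_reduction` shows how a relative WER improvement such as the reported 12% is computed from two absolute WER values.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FaceDecision:
    """Hypothetical face recognizer output for the current video segment."""
    speaker_id: Optional[str]  # None when no known face is recognized
    gender: Optional[str]      # e.g. "male" / "female", or None if unknown
    confidence: float          # assumed to lie in [0, 1]


def select_acoustic_model(
    decision: FaceDecision,
    speaker_models: Dict[str, str],
    gender_models: Dict[str, str],
    si_model: str,
    threshold: float = 0.8,    # assumed value, not taken from the paper
) -> str:
    """Pick the most specific pre-trained model the decision supports."""
    if decision.confidence >= threshold:
        # Prefer a speaker-dependent model trained offline for this speaker.
        if decision.speaker_id in speaker_models:
            return speaker_models[decision.speaker_id]
        # Otherwise fall back to a gender-dependent model, if available.
        if decision.gender in gender_models:
            return gender_models[decision.gender]
    # Low confidence or unknown speaker: use the speaker-independent model.
    return si_model


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Relative WER reduction, e.g. 0.12 for a 12% relative improvement."""
    return (baseline_wer - adapted_wer) / baseline_wer
```

In this sketch the models are represented by plain string identifiers. In a real LVCSR decoder the returned value would instead select which acoustic model parameters are used for the following audio.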
Publication details
Title: | Online Speaker Adaptation of an Acoustic Model using Face Recognition |
---|---|
Authors: | Campr Pavel ; Pražák Aleš ; Psutka Josef V. ; Psutka Josef |
Title (Czech): | Online adaptace akustického modelu na řečníka s využitím systému pro rozpoznávání obličejů |
Publication language: | English |
Year of publication: | 2013 |
Publication type: | Journal article |
Book title: | Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013 |
Volume: | Lecture Notes in Artificial Intelligence |
Issue number: | 8082 |
Pages: | 378-385 |
DOI: | 10.1007/978-3-642-40585-3_48 |
ISBN: | 978-3-642-40584-6 |
ISSN: | 0302-9743 |
Publisher: | Springer Berlin Heidelberg |
Keywords
acoustic model, speaker adaptation, face recognition, multimodal processing, automatic speech recognition
BibTeX
@INCOLLECTION{CamprPavel_2013_OnlineSpeaker,
  author = {Campr, Pavel and Pra\v{z}\'{a}k, Ale\v{s} and Psutka, Josef V. and Psutka, Josef},
  title = {Online Speaker Adaptation of an Acoustic Model using Face Recognition},
  year = {2013},
  publisher = {Springer Berlin Heidelberg},
  volume = {8082},
  pages = {378-385},
  booktitle = {Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013},
  series = {Lecture Notes in Artificial Intelligence},
  ISBN = {978-3-642-40584-6},
  ISSN = {0302-9743},
  doi = {10.1007/978-3-642-40585-3_48},
  url = {http://www.kky.zcu.cz/en/publications/CamprPavel_2013_OnlineSpeaker},
}