Publication details

Citation

Campr Pavel and Pražák Aleš and Psutka Josef V. and Psutka Josef: Online Speaker Adaptation of an Acoustic Model using Face Recognition. Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013, Lecture Notes in Artificial Intelligence, vol. 8082, pp. 378-385, Springer Berlin Heidelberg, 2013.

Abstract

We have proposed and evaluated a novel approach to online speaker adaptation of an acoustic model based on face recognition. Instead of the traditionally used audio-based speaker identification, we investigated the video modality for the task of speaker detection. A simulated online transcription produced by a large-vocabulary continuous speech recognition (LVCSR) system for online subtitling is evaluated using speaker-independent acoustic models, gender-dependent models, and models of particular speakers. In the experiment, the speaker-dependent acoustic models were trained offline and switched online based on the decision of a face recognizer, which reduced the word error rate (WER) by 12% relative to the speaker-independent baseline system.
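
The model-switching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `FaceDecision` structure, the confidence threshold, and the fallback order (speaker-dependent, then gender-dependent, then speaker-independent) are assumptions made for illustration, and the WER values in the usage note are invented numbers chosen only to show how a relative reduction is computed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FaceDecision:
    """Hypothetical face-recognizer output for the current video segment."""
    speaker_id: Optional[str]  # None when no known face is recognized
    gender: Optional[str]      # "male", "female", or None when unknown
    confidence: float          # recognizer score in [0, 1]


def select_acoustic_model(
    decision: FaceDecision,
    speaker_models: Dict[str, str],
    gender_models: Dict[str, str],
    si_model: str,
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> str:
    """Return the most specific model the face decision supports."""
    if decision.confidence >= threshold:
        # Prefer a speaker-dependent model trained offline for this speaker.
        if decision.speaker_id in speaker_models:
            return speaker_models[decision.speaker_id]
        # Otherwise fall back to a gender-dependent model (assumed fallback).
        if decision.gender in gender_models:
            return gender_models[decision.gender]
    # Low confidence or unknown speaker: keep the speaker-independent model.
    return si_model


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Relative WER reduction: (baseline - adapted) / baseline."""
    return (baseline_wer - adapted_wer) / baseline_wer
```

With illustrative values, `relative_wer_reduction(25.0, 22.0)` gives about 0.12, which is what a 12% relative reduction means: the absolute WER drops by 12% of the baseline value, not by 12 percentage points.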

Publication details

Title: Online Speaker Adaptation of an Acoustic Model using Face Recognition
Authors: Campr Pavel; Pražák Aleš; Psutka Josef V.; Psutka Josef
Title (Czech): Online adaptace akustického modelu na řečníka s využitím systému pro rozpoznávání obličejů
Publication language: English
Year of publication: 2013
Publication type: Journal article
Book title: Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013
Series: Lecture Notes in Artificial Intelligence
Volume: 8082
Pages: 378-385
DOI: 10.1007/978-3-642-40585-3_48
ISBN: 978-3-642-40584-6
ISSN: 0302-9743
Publisher: Springer Berlin Heidelberg
/ 2013-09-13 13:07:19 /

Keywords

acoustic model, speaker adaptation, face recognition, multimodal processing, automatic speech recognition

BibTeX

@INCOLLECTION{CamprPavel_2013_OnlineSpeaker,
 author = {Campr, Pavel and Pra\v{z}\'{a}k, Ale\v{s} and Psutka, Josef V. and Psutka, Josef},
 title = {Online Speaker Adaptation of an Acoustic Model using Face Recognition},
 year = {2013},
 publisher = {Springer Berlin Heidelberg},
 volume = {8082},
 pages = {378--385},
 booktitle = {Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013},
 series = {Lecture Notes in Artificial Intelligence},
 ISBN = {978-3-642-40584-6},
 ISSN = {0302-9743},
 doi = {10.1007/978-3-642-40585-3_48},
 url = {http://www.kky.zcu.cz/en/publications/CamprPavel_2013_OnlineSpeaker},
}