Audio-Video Speaker Diarization for Unsupervised Speaker and Face Model Creation

Pavel Campr; Marie Kunešová; Jan Vaněk; Jan Čech; Josef Psutka

Citation

Pavel Campr, Marie Kunešová, Jan Vaněk, Jan Čech, and Josef Psutka: Audio-Video Speaker Diarization for Unsupervised Speaker and Face Model Creation. Text, Speech and Dialogue, Proceedings of the 17th International Conference TSD 2014, Lecture Notes in Artificial Intelligence, 2014.

Abstract

Our goal is to create speaker models in audio domain and face models in video domain from a set of videos in an unsupervised manner. Such models can be used later for speaker identification in audio domain (answering the question "Who was speaking and when") and/or for face recognition ("Who was seen and when") for given videos that contain speaking persons. The proposed system is based on an audio-video diarization system that tries to resolve the disadvantages of the individual modalities. Experiments on broadcasts of Czech parliament meetings show that the proposed combination of individual audio and video diarization systems yields an improvement of the diarization error rate (DER).

Publication details

Title:	Audio-Video Speaker Diarization for Unsupervised Speaker and Face Model Creation
Authors:	Pavel Campr; Marie Kunešová; Jan Vaněk; Jan Čech; Josef Psutka
Publication language:	English
Publication date:	1 September 2014
Year of publication:	2014
Publication type:	Paper in proceedings
Book title:	Text, Speech and Dialogue, Proceedings of the 17th International Conference TSD 2014
Volume:	Lecture Notes in Artificial Intelligence
DOI:	10.1007/978-3-319-10816-2_56

Keywords

audio-video speaker diarization, audio speaker recognition, face recognition

BibTeX

@INCOLLECTION{PavelCampr_2014_Audio-VideoSpeaker,
 author = {Pavel Campr and Marie Kune\v{s}ov\'{a} and Jan Van\v{e}k and Jan \v{C}ech and Josef Psutka},
 title = {Audio-Video Speaker Diarization for Unsupervised Speaker and Face Model Creation},
 year = {2014},
 booktitle = {Text, Speech and Dialogue, Proceedings of the 17th International Conference TSD 2014},
 series = {Lecture Notes in Artificial Intelligence},
 doi = {10.1007/978-3-319-10816-2_56},
 url = {http://www.kky.zcu.cz/en/publications/PavelCampr_2014_Audio-VideoSpeaker},
}
