Detail of publication

Citation

Zajic Zbynek, Hruz Marek, and Muller Ludek: Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement. Interspeech, 18th Annual Conference of the International Speech Communication Association, pp. 3562-3566, 2017.

Abstract

The aim of this paper is to investigate the benefit of information from a speaker change detection system based on a Convolutional Neural Network (CNN) when applied to the accumulation of statistics for i-vector generation. The investigation is carried out on the problem of diarization. In our system, the output of the CNN is the probability of a speaker change in a conversation for a given time segment. According to this probability, we cut the conversation into short segments, each of which is then represented by an i-vector that describes the speaker in it. We propose a technique that uses the information from the CNN to weight the acoustic data in a segment and so refine the statistics accumulation process. This technique enables us to represent the speaker better in the final i-vector. Experiments on the English part of the CallHome corpus show that the proposed refinement of the statistics accumulation is beneficial, with a relative improvement in Diarization Error Rate of almost 16 % compared to the speaker diarization system without statistics refinement.
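
The two steps described in the abstract (cutting at likely speaker changes, then down-weighting frames near a change while accumulating statistics) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the weighting function, so the frame weight `1 - p_change` is assumed here, as are the peak-picking rule, the `threshold` parameter, and the function names `cut_at_changes` and `weighted_stats`. The per-frame component posteriors would normally come from a universal background model; here they are simply passed in.

```python
from typing import List, Sequence, Tuple


def cut_at_changes(p_change: Sequence[float],
                   threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Split frame indices into segments at local peaks of the
    speaker-change probability (assumed rule, not from the paper)."""
    bounds = [0]
    for t in range(1, len(p_change) - 1):
        is_peak = p_change[t] >= p_change[t - 1] and p_change[t] > p_change[t + 1]
        if is_peak and p_change[t] >= threshold:
            bounds.append(t)
    bounds.append(len(p_change))
    # Return half-open (start, end) frame ranges, skipping empty ones.
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def weighted_stats(features: Sequence[Sequence[float]],
                   posteriors: Sequence[Sequence[float]],
                   p_change: Sequence[float]):
    """Accumulate zero-order (N) and first-order (F) statistics with each
    frame scaled by w = 1 - p_change (assumed weighting)."""
    n_comp = len(posteriors[0])
    dim = len(features[0])
    N = [0.0] * n_comp
    F = [[0.0] * dim for _ in range(n_comp)]
    for x, gamma, p in zip(features, posteriors, p_change):
        w = 1.0 - p  # frames close to a likely speaker change count less
        for c in range(n_comp):
            g = w * gamma[c]
            N[c] += g
            for d in range(dim):
                F[c][d] += g * x[d]
    return N, F


if __name__ == "__main__":
    probs = [0.0, 0.1, 0.9, 0.1, 0.0, 0.0]
    print(cut_at_changes(probs))
    N, F = weighted_stats([[1.0], [3.0]], [[1.0, 0.0], [0.5, 0.5]], [0.0, 0.5])
    print(N, F)
```

With all weights equal to 1 (that is, `p_change` all zeros), `weighted_stats` reduces to the ordinary unweighted accumulation, which corresponds to the baseline without refinement.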

Detail of publication

Title: Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement
Author: Zajic Zbynek ; Hruz Marek ; Muller Ludek
Language: English
Year: 2017
Type of publication: Conference presentations outside the Czech Republic
Title of journal or book: Interspeech, 18th Annual Conference of the International Speech Communication Association
Page: 3562 - 3566
DOI: 10.21437/Interspeech.2017-51

Keywords

Convolutional Neural Network, Speaker Change Detection, Speaker Diarization, i-vector, Statistics Accumulation

BibTeX

@INPROCEEDINGS{ZajicZbynek_2017_SpeakerDiarization,
 author = {Zajic, Zbynek and Hruz, Marek and Muller, Ludek},
 title = {Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement},
 year = {2017},
 booktitle = {Interspeech, 18th Annual Conference of the International Speech Communication Association},
 pages = {3562--3566},
 doi = {10.21437/Interspeech.2017-51},
 url = {http://www.kky.zcu.cz/en/publications/ZajicZbynek_2017_SpeakerDiarization},
}