Publications
Detail of publication
Citation
Zajic Zbynek ; Hruz Marek ; Muller Ludek: Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement. Interspeech, 18th Annual Conference of the International Speech Communication Association, p. 3562-3566, 2017.
Abstract
The aim of this paper is to investigate the benefit of information from a speaker change detection system based on Convolutional Neural Network (CNN) when applied to the process of accumulation of statistics for i-vector generation. The investigation is carried out on the problem of diarization. In our system, the output of the CNN is a probability value of a speaker change in a conversation for a given time segment. According to this probability, we cut the conversation into short segments that are then represented by the i-vector (to describe a speaker in it). We propose a technique to utilize the information from the CNN for the weighting of the acoustic data in a segment to refine the statistics accumulation process. This technique enables us to represent the speaker better in the final i-vector. The experiments on the English part of the CallHome corpus show that our proposed refinement of the statistics accumulation is beneficial, with a relative improvement in Diarization Error Rate of almost 16 % when compared to the speaker diarization system without statistics refinement.
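The two steps described in the abstract, cutting the conversation at likely speaker changes and weighting the acoustic data during statistics accumulation, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The peak-picking rule, the threshold of 0.5, the weight `1 - p` (frames near a likely speaker change count less), and all function names are assumptions made for illustration. The statistics are the usual zero-order and first-order statistics over the components of a background model, with each frame's contribution scaled by its weight.

```python
from typing import List, Sequence, Tuple


def split_at_changes(change_prob: Sequence[float],
                     threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Cut frames into [start, end) segments at local peaks of the
    speaker-change probability that exceed `threshold` (assumed rule)."""
    boundaries = [0]
    for t in range(1, len(change_prob) - 1):
        is_peak = (change_prob[t] >= change_prob[t - 1]
                   and change_prob[t] > change_prob[t + 1])
        if is_peak and change_prob[t] > threshold:
            boundaries.append(t)
    boundaries.append(len(change_prob))
    return list(zip(boundaries[:-1], boundaries[1:]))


def frame_weights(change_prob: Sequence[float]) -> List[float]:
    """Assumed weighting: a frame with a high speaker-change
    probability contributes less to the segment statistics."""
    return [1.0 - p for p in change_prob]


def accumulate_weighted_stats(
    frames: Sequence[Sequence[float]],
    posteriors: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[List[float], List[List[float]]]:
    """Weighted zero-order (N_c) and first-order (F_c) statistics.

    frames[t]     : feature vector of frame t
    posteriors[t] : occupation probabilities of the model components
    weights[t]    : per-frame weight in [0, 1]
    """
    num_comp = len(posteriors[0])
    dim = len(frames[0])
    zero = [0.0] * num_comp
    first = [[0.0] * dim for _ in range(num_comp)]
    for x, gamma, w in zip(frames, posteriors, weights):
        for c in range(num_comp):
            occ = w * gamma[c]          # weighted occupation of component c
            zero[c] += occ
            for d in range(dim):
                first[c][d] += occ * x[d]
    return zero, first
```

With all weights equal to 1 the function reduces to ordinary, unweighted accumulation, which corresponds to the baseline without statistics refinement mentioned in the abstract.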
Detail of publication
Title: | Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement |
---|---|
Author: | Zajic Zbynek ; Hruz Marek ; Muller Ludek |
Language: | English |
Year: | 2017 |
Type of publication: | Conferences presentations outside the Czech Republic |
Title of journal or book: | Interspeech, 18th Annual Conference of the International Speech Communication Association |
Page: | 3562 - 3566 |
DOI: | 10.21437/Interspeech.2017-51 |
Keywords
Convolutional Neural Network, Speaker Change Detection, Speaker Diarization, i-vector, Statistics Accumulation
BibTeX
@INPROCEEDINGS{ZajicZbynek_2017_SpeakerDiarization,
  author = {Zajic Zbynek and Hruz Marek and Muller Ludek},
  title = {Speaker Diarization Using Convolutional Neural Network for Statistics Accumulation Refinement},
  year = {2017},
  booktitle = {Interspeech, 18th Annual Conference of the International Speech Communication Association},
  pages = {3562-3566},
  doi = {10.21437/Interspeech.2017-51},
  url = {http://www.kky.zcu.cz/en/publications/ZajicZbynek_2017_SpeakerDiarization}
}