Detail of publication

Citation

Lucie Skorkovská and Zbyněk Zajíc: Score Normalization Methods Applied to Topic Identification. Text, Speech, and Dialogue, 17th International Conference, TSD 2014, Lecture Notes in Artificial Intelligence, vol. 8655, pp. 133-140, Springer, 2014.

Abstract

Multi-label classification plays a key role in modern categorization systems. Its goal is to find the set of labels belonging to each data item. In multi-label document classification, unlike in multi-class classification, where only the best topic is chosen, the classifier must decide whether a document does or does not belong to each topic from a predefined topic set. We use a generative classifier to tackle this task, but the problem with this approach is that a threshold for positive classification must be set. This threshold can vary for each document depending on its content (words used, length of the document, ...). In this paper we use Unconstrained Cohort Normalization, originally proposed for the speaker identification/verification task, to robustly find the threshold defining the boundary between the correct and the incorrect topics of a document. In our earlier experiments we proposed a method for finding this threshold inspired by another normalization technique called World Model score normalization. A comparison of these normalization methods shows that better results can be achieved with Unconstrained Cohort Normalization.
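
The idea of normalizing per-topic scores before thresholding can be illustrated with a minimal sketch. It assumes one common formulation of cohort normalization from the speaker verification literature: each topic's log-score is reduced by the mean log-score of the best-scoring competing topics (the cohort). The abstract does not give the paper's exact formula, cohort size, or threshold, so the function names, the default `cohort_size`, the default `threshold`, and the example scores below are all hypothetical.

```python
def ucn_normalize(log_scores, cohort_size):
    """Subtract from each topic's log-score the mean log-score of its cohort.

    Assumption: the cohort of a topic is the `cohort_size` best-scoring
    *other* topics for the same document (an unconstrained cohort).
    """
    if cohort_size < 1 or len(log_scores) < 2:
        raise ValueError("need at least two topics and cohort_size >= 1")
    normalized = {}
    for topic, score in log_scores.items():
        # Scores of all competing topics, best first.
        others = sorted(
            (s for t, s in log_scores.items() if t != topic), reverse=True
        )
        cohort = others[:cohort_size]
        normalized[topic] = score - sum(cohort) / len(cohort)
    return normalized


def select_topics(log_scores, cohort_size=2, threshold=0.0):
    """Return the topics whose normalized score exceeds the threshold."""
    normalized = ucn_normalize(log_scores, cohort_size)
    return sorted(t for t, s in normalized.items() if s > threshold)


# Hypothetical log-likelihoods of one document under five topic models.
scores = {
    "politics": -100.0,
    "economy": -102.0,
    "sports": -140.0,
    "culture": -150.0,
    "weather": -155.0,
}
print(select_topics(scores))  # ['economy', 'politics']
```

Because the normalization is relative to the document's own competing scores, a single threshold can be shared by documents whose raw scores differ widely, for example because of their length.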

Detail of publication

Title: Score Normalization Methods Applied to Topic Identification
Author: Lucie Skorkovská; Zbyněk Zajíc
Language: English
Year: 2014
Type of publication: Papers in proceedings of reviewed conferences
Title of journal or book: Text, Speech, and Dialogue, 17th International Conference, TSD 2014
Series: Lecture Notes in Artificial Intelligence
Volume: 8655
Pages: 133-140
DOI: 10.1007/978-3-319-10816-2_17
ISBN: 978-3-319-10815-5
ISSN: 0302-9743
Publisher: Springer
Date: 8 Sep 2014 - 12 Sep 2014

Keywords

topic identification, multi-label text classification, Naive Bayes classification, score normalization

BibTeX

@INPROCEEDINGS{LucieSkorkovska_2014_ScoreNormalization,
 author = {Lucie Skorkovsk\'{a} and Zbyn\v{e}k Zaj\'{i}c},
 title = {Score Normalization Methods Applied to Topic Identification},
 year = {2014},
 publisher = {Springer},
 booktitle = {Text, Speech, and Dialogue, 17th International Conference, TSD 2014},
 volume = {8655},
 pages = {133--140},
 series = {Lecture Notes in Artificial Intelligence},
 ISBN = {978-3-319-10815-5},
 ISSN = {0302-9743},
 doi = {10.1007/978-3-319-10816-2_17},
 url = {http://www.kky.zcu.cz/en/publications/LucieSkorkovska_2014_ScoreNormalization},
}