Publications
Publication details
Citation
Skorkovská, L.: Dynamic Threshold Selection Method for Multi-label Newspaper Topic Identification. Text, Speech and Dialogue, Lecture Notes in Computer Science, vol. 8082, p. 209-216, Springer, Heidelberg, 2013.
Further information
Abstract
Multi-label classification is increasingly required in modern categorization systems, and it is especially essential in the task of newspaper article topic identification. This paper presents a method, based on general topic model normalisation, for finding a threshold that defines the boundary between the "correct" and the "incorrect" topics of a newspaper article. The proposed method is used to improve the topic identification algorithm, which is part of a complex system for acquiring and storing large volumes of text data. The topic identification module uses a Naive Bayes classifier for the multiclass, multi-label classification problem and assigns each article topics from a fairly extensive predefined topic hierarchy containing about 450 topics and topic categories. The paper presents the results of experiments with the improved topic identification algorithm.
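The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the paper's actual method. It assumes add-one smoothed unigram Naive Bayes models per topic, a "general" model trained on all articles and used to normalise each topic score, and a per-article threshold set at a fixed margin below the best normalised score. All function names, the toy data, and the `margin` parameter are hypothetical.

```python
import math
from collections import Counter


def train_unigram(docs, vocab, alpha=1.0):
    """Return add-alpha smoothed log-probabilities for every vocabulary word."""
    counts = Counter(w for doc in docs for w in doc if w in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}


def log_likelihood(doc, model):
    """Sum of word log-probabilities; out-of-vocabulary words are ignored."""
    return sum(model[w] for w in doc if w in model)


def select_topics(doc, topic_models, general_model, margin=1.0):
    """Keep every topic whose normalised score lies within `margin` of the best.

    The threshold is "dynamic" only in the sense that it is recomputed for
    each article from that article's own best score (an assumption made here).
    """
    general = log_likelihood(doc, general_model)
    # Normalise each topic score by the general (background) model score.
    scores = {t: log_likelihood(doc, m) - general for t, m in topic_models.items()}
    threshold = max(scores.values()) - margin
    return sorted(t for t, s in scores.items() if s >= threshold)


# Hypothetical toy training data: tokenised articles grouped by topic.
training = {
    "sport": [["goal", "match", "team"], ["team", "win", "match"]],
    "politics": [["vote", "election", "party"], ["party", "minister", "vote"]],
    "economy": [["bank", "market", "price"], ["price", "market", "tax"]],
}
vocab = {w for docs in training.values() for doc in docs for w in doc}
topic_models = {t: train_unigram(docs, vocab) for t, docs in training.items()}
general_model = train_unigram(
    [doc for docs in training.values() for doc in docs], vocab
)

single = select_topics(["match", "team", "goal"], topic_models, general_model)
mixed = select_topics(["match", "team", "vote", "party"], topic_models, general_model)
print(single, mixed)
```

In this toy setting an article using only sport vocabulary receives one topic, while an article mixing sport and politics vocabulary receives both, because both topics score within the margin of the best one.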
Publication details
| Title: | Dynamic Threshold Selection Method for Multi-label Newspaper Topic Identification |
|---|---|
| Author: | Skorkovská, L. |
| Publication language: | English |
| Publication date: | 1.9.2013 |
| Publication year: | 2013 |
| Publication type: | Journal article |
| Journal / book title: | Text, Speech and Dialogue |
| Volume: | Lecture Notes in Computer Science |
| Issue number: | 8082 |
| Pages: | 209 - 216 |
| DOI: | 10.1007/978-3-642-40585-3_27 |
| ISBN: | 978-3-642-40584-6 |
| ISSN: | 0302-9743 |
| Publisher: | Springer |
| Place of publication: | Heidelberg |
| Date: | 1.9.2013 - 5.9.2013 |
Keywords
topic identification, multi-label text classification, language modeling, Naive Bayes classification
BibTeX
@ARTICLE{SkorkovskaL_2013_DynamicThreshold,
author = {Skorkovsk\'{a}, L.},
title = {Dynamic Threshold Selection Method for Multi-label Newspaper Topic Identification},
year = {2013},
publisher = {Springer},
journal = {Text, Speech and Dialogue},
address = {Heidelberg},
volume = {8082},
pages = {209-216},
series = {Lecture Notes in Computer Science},
ISBN = {978-3-642-40584-6},
ISSN = {0302-9743},
doi = {10.1007/978-3-642-40585-3_27},
url = {http://www.kky.zcu.cz/en/publications/SkorkovskaL_2013_DynamicThreshold},
}


