Detail of publication

Citation

Skorkovska, L.: JMZW: Topic Identification in Czech Newspaper Articles. SVK 2011 - magisterské a doktorské studijní programy, sborník rozšířených abstraktů, p. 95-96, Západočeská univerzita v Plzni, Plzeň, 2011.

Abstract

The topic identification module is part of JMZW (Jazykové modelování z webu, "language modeling from the Web"), a complex system for acquiring and storing large volumes of text data from the Web. The module processes each acquired text item, usually a newspaper article, and automatically assigns it keywords from a predefined topic hierarchy. The main purpose of the JMZW system is to acquire and process data for training the extensive language models used in automatic speech recognition systems. Because a smaller topic-specific language model has been shown to be able to outperform a much larger general one, it is important to filter the gathered data by topic.

Detail of publication

Title: JMZW: Topic Identification in Czech Newspaper Articles
Author: Skorkovska, L.
Language: English
Date of publication: 26 May 2011
Year: 2011
Type of publication: Papers in proceedings of reviewed conferences
Title of journal or book: SVK 2011 - magisterské a doktorské studijní programy, sborník rozšířených abstraktů
Page: 95 - 96
ISBN: 978-80-261-0000-3
Publisher: Západočeská univerzita v Plzni
Address: Plzeň
Date: 26 May 2011

Keywords

topic identification, newspaper, language models

BibTeX

@INPROCEEDINGS{SkorkovskaL_2011_JMZWTopic,
 author = {Skorkovska, L.},
 title = {JMZW: Topic Identification in Czech Newspaper Articles},
 year = {2011},
 publisher = {Z\'{a}pado\v{c}esk\'{a} univerzita v Plzni},
 booktitle = {SVK 2011 - magistersk\'{e} a doktorsk\'{e} studijn\'{i} programy, sborn\'{i}k roz\v{s}\'{i}\v{r}en\'{y}ch abstrakt\r{u}},
 address = {Plze\v{n}},
 pages = {95-96},
 ISBN = {978-80-261-0000-3},
 url = {http://www.kky.zcu.cz/en/publications/SkorkovskaL_2011_JMZWTopic},
}