Detail of publication

Citation

Jakub Kanis and Lucie Skorkovská: Comparison of Different Lemmatization Approaches through the Means of Information Retrieval Performance. Lecture Notes in Artificial Intelligence, LNAI, vol. 2010, p. 93-100, Springer, Heidelberg, 2010.

Abstract

This paper presents a quantitative performance analysis of two different approaches to the lemmatization of Czech text data. The first is based on a manually prepared dictionary of lemmas and a set of derivation rules, while the second is based on automatic inference of the dictionary and the rules from training data. The comparison is done by evaluating the mean Generalized Average Precision (mGAP) measure of the lemmatized documents and search queries in a set of information retrieval (IR) experiments. Such a method is suitable for an efficient and rather reliable comparison of lemmatization performance, since correct lemmatization has proven to be crucial for IR effectiveness in highly inflected languages. Moreover, the proposed indirect comparison of the lemmatizers circumvents the need for manually lemmatized test data, which are hard to obtain and also face the problem of incompatible sets of lemmas across different systems.

Detail of publication

Title: Comparison of Different Lemmatization Approaches through the Means of Information Retrieval Performance
Author: Jakub Kanis ; Lucie Skorkovská
Language: English
Date of publication: 1 Sep 2010
Year: 2010
Type of publication: Papers in journals
Title of journal or book: Lecture Notes in Artificial Intelligence
Series: LNAI
Issue number: 2010
Page: 93 - 100
ISSN: 0302-9743
Publisher: Springer
Address: Heidelberg

Keywords

lemmatization, information retrieval

BibTeX

@ARTICLE{JakubKanis_2010_Comparisonof,
 author = {Jakub Kanis and Lucie Skorkovsk\'{a}},
 title = {Comparison of Different Lemmatization Approaches through the Means of Information Retrieval Performance},
 year = {2010},
 publisher = {Springer},
 journal = {Lecture Notes in Artificial Intelligence},
 address = {Heidelberg},
 volume = {2010},
 pages = {93-100},
 series = {LNAI},
 ISSN = {0302-9743},
 url = {http://www.kky.zcu.cz/en/publications/JakubKanis_2010_Comparisonof},
}