Detail of publication
Citation
Kanis, J.; Müller, L.: Automatic lemmatizer construction with focus on OOV words lemmatization. Lecture Notes in Artificial Intelligence, no. 3658, p. 132-139, Springer, Berlin, 2005.
Abstract
This paper deals with the automatic construction of a lemmatizer from a Full Form - Lemma (FFL) training dictionary and with the lemmatization of out-of-vocabulary (OOV) words, i.e. words not seen in the FFL dictionary. Three methods are introduced for lemmatizing three kinds of OOV words (missing full forms, unknown words, and compound words). The methods were tested on Czech test data. The best result (recall: 99.3 % and precision: 75.1 %) was achieved by combining these methods. A lexicon-free lemmatizer based on the method for lemmatizing unknown words (the lemmatization patterns method) is also introduced.
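The abstract does not say what form the lemmatization patterns take, so the following is only an illustrative sketch under one assumption: that a pattern is a suffix rewrite rule learned from FFL pairs by stripping the longest common prefix of a full form and its lemma. The function names `learn_patterns` and `lemmatize_oov`, the toy Czech training pairs, and the "longest matching suffix, then most frequent rewrite" selection rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def learn_patterns(ffl_pairs):
    """Count suffix rewrite rules (form suffix -> lemma suffix) in FFL pairs."""
    patterns = defaultdict(Counter)
    for form, lemma in ffl_pairs:
        # Length of the longest common prefix of the full form and its lemma.
        k = 0
        while k < min(len(form), len(lemma)) and form[k] == lemma[k]:
            k += 1
        patterns[form[k:]][lemma[k:]] += 1
    return patterns


def lemmatize_oov(word, patterns):
    """Apply the longest matching suffix rule; return the word unchanged if none fits."""
    for start in range(len(word)):  # longest candidate suffix is tried first
        suffix = word[start:]
        if suffix in patterns:
            lemma_suffix, _count = patterns[suffix].most_common(1)[0]
            return word[:start] + lemma_suffix
    return word


# Toy FFL training dictionary (assumed example data, not from the paper).
ffl = [("ženy", "žena"), ("ženou", "žena"), ("školy", "škola"), ("školou", "škola")]
patterns = learn_patterns(ffl)

print(lemmatize_oov("knihou", patterns))  # form unseen in training: prints "kniha"
```

A word with no matching suffix rule is returned unchanged, which is a design choice of this sketch rather than behavior described in the abstract.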
Detail of publication
Title: | Automatic lemmatizer construction with focus on OOV words lemmatization |
---|---|
Author: | Kanis, J. ; Müller, L. |
Language: | English |
Date of publication: | 12 Sep 2005 |
Year: | 2005 |
Type of publication: | Papers in journals |
Title of journal or book: | Lecture Notes in Artificial Intelligence |
Edition: | Lecture notes in artificial intelligence, no. 3658 |
Series: | 3658 |
Page: | 132 - 139 |
ISBN: | 0302-9743 |
ISSN: | 0302-9743 |
Publisher: | Springer |
Address: | Berlin |
Date: | 12 Sep 2005 - 16 Sep 2005 |
Keywords
lemmatization, OOV words
BibTeX
@ARTICLE{KanisJ_2005_Automaticlemmatizer,
  author = {Kanis, J. and M\"{u}ller, L.},
  title = {Automatic lemmatizer construction with focus on OOV words lemmatization},
  year = {2005},
  publisher = {Springer},
  journal = {Lecture Notes in Artificial Intelligence},
  address = {Berlin},
  pages = {132-139},
  series = {3658},
  ISBN = {0302-9743},
  ISSN = {0302-9743},
  url = {http://www.kky.zcu.cz/en/publications/KanisJ_2005_Automaticlemmatizer},
}