Detail of publication

Citation

Lukáš Bureš, Petr Neduchal, Miroslav Hlaváč and Marek Hrúz: Generation of Synthetic Images of Full-Text Documents. 20th International Conference on Speech and Computer, SPECOM 2018, Lecture Notes in Artificial Intelligence, LNAI 11096, pp. 68-75, Springer Nature Switzerland AG, 2018.

Additional information


Springer

Abstract

In this paper, we present an algorithm for generating images of full-text documents. Such images can be used to train and evaluate optical character recognition (OCR) models. The algorithm is modular: individual parts can be changed and tweaked to generate the desired images. We describe a method for obtaining background images of paper from already digitized documents. We use a Variational Autoencoder to train a generative model of these backgrounds, which enables on-the-fly generation of background images similar to the training ones. The module for printing the text uses large text corpora, a font, and suitable positional and brightness noise to obtain believable results. We use Tesseract OCR to compare real-world and generated images and observe that the recognition rate is very similar, which indicates that the synthetic images have a realistic appearance. Furthermore, the mistakes made by the OCR system in both cases are alike. Finally, the system generates a detailed, structured annotation of the synthesized image.
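The abstract does not give the noise model or the annotation schema, so the following is only a minimal sketch of the text-printing idea under stated assumptions: glyphs sit on a fixed monospace grid, positional and brightness noise are Gaussian, and the annotation is nested as page, lines, words, characters. The function `place_text` and all of its parameters and defaults are hypothetical and are not taken from the paper.

```python
import random


def place_text(lines, char_w=12, line_h=24, margin=20,
               pos_sigma=0.8, base_ink=40, ink_sigma=10, seed=0):
    """Compute noisy glyph placements and a nested annotation for one page.

    Every name and default value here is an illustrative assumption,
    not the method described in the paper.
    """
    rng = random.Random(seed)  # seeded so the same page can be regenerated
    page = {"lines": []}
    for row, text in enumerate(lines):
        y0 = margin + row * line_h
        line = {"text": text, "words": []}
        word = None
        for col, ch in enumerate(text):
            if ch == " ":
                word = None  # a space closes the current word
                continue
            # Positional noise: jitter each glyph around its grid position.
            x = margin + col * char_w + rng.gauss(0, pos_sigma)
            y = y0 + rng.gauss(0, pos_sigma)
            # Brightness noise: perturb the ink level, clamped to 0..255.
            ink = min(255, max(0, round(base_ink + rng.gauss(0, ink_sigma))))
            glyph = {
                "char": ch,
                "bbox": [round(x, 2), round(y, 2), char_w, line_h],
                "ink": ink,
            }
            if word is None:
                word = {"text": "", "chars": []}
                line["words"].append(word)
            word["text"] += ch
            word["chars"].append(glyph)
        page["lines"].append(line)
    return page
```

Calling `place_text(["Synthetic page", "of full text"])` returns a dictionary that a renderer could use to draw each glyph onto a background image and that doubles as ground truth for OCR evaluation, since every character keeps its own bounding box and ink level.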

Detail of publication

Title: Generation of Synthetic Images of Full-Text Documents
Author: Lukáš Bureš ; Petr Neduchal ; Miroslav Hlaváč ; Marek Hrúz
Language: English
Year: 2018
Type of publication: Papers in proceedings of reviewed conferences
Title of journal or book: 20th International Conference on Speech and Computer, SPECOM 2018
Series: Lecture Notes in Artificial Intelligence, LNAI 11096
Pages: 68-75
DOI: 10.1007/978-3-319-99579-3_8
ISBN: 978-3-319-99578-6
ISSN: 0302-9743
Publisher: Springer Nature Switzerland AG
Date: 18 Sep 2018 - 22 Sep 2018

Keywords

Generating images, Character recognition, Computer vision, Machine learning

BibTeX

@INPROCEEDINGS{LukasBures_2018_Generationof,
 author = {Luk\'{a}\v{s} Bure\v{s} and Petr Neduchal and Miroslav Hlav\'{a}\v{c} and Marek Hr\'{u}z},
 title = {Generation of Synthetic Images of Full-Text Documents},
 year = {2018},
 publisher = {Springer Nature Switzerland AG},
 booktitle = {20th International Conference on Speech and Computer, SPECOM 2018},
 pages = {68--75},
 series = {Lecture Notes in Artificial Intelligence, LNAI 11096},
 ISBN = {978-3-319-99578-6},
 ISSN = {0302-9743},
 doi = {10.1007/978-3-319-99579-3_8},
 url = {http://www.kky.zcu.cz/en/publications/LukasBures_2018_Generationof},
}