Publications
Detail of publication
Citation
Lukáš Bureš, Petr Neduchal, Miroslav Hlaváč, Marek Hrúz: Generation of Synthetic Images of Full-Text Documents. 20th International Conference on Speech and Computer, SPECOM 2018, Lecture Notes in Artificial Intelligence, LNAI 11096, p. 68-75, Springer Nature Switzerland AG, 2018.
Additional information
Abstract
In this paper, we present an algorithm for generating images of full-text documents. Such images can be used to train and evaluate optical character recognition (OCR) models. The algorithm is modular: individual parts can be changed and tweaked to generate the desired images. We describe a method for obtaining background images of paper from already digitized documents. We use a Variational Autoencoder to train a generative model of these backgrounds, which enables generating background images similar to the training ones on the fly. The module for printing the text uses large text corpora, a font, and suitable positional and brightness noise to obtain believable results. We use Tesseract OCR to compare real-world and generated images and observe that the recognition rates are very similar, which indicates that the synthetic images have a realistic appearance. Furthermore, the mistakes made by the OCR system in both cases are alike. Finally, the system generates a detailed, structured annotation of the synthesized image.
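The text-printing step with positional and brightness noise, together with the structured annotation, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `GlyphAnnotation` and `layout_line`, the fixed character advance, the Gaussian noise model, and all default parameter values are assumptions made for the example. It only computes where each glyph would be placed and how dark it would be; actual rendering onto a background image is left out.

```python
import random
from dataclasses import dataclass


@dataclass
class GlyphAnnotation:
    """Hypothetical per-character annotation record (not from the paper)."""
    char: str
    x: float          # horizontal position in pixels
    y: float          # vertical position in pixels
    brightness: int   # 0 = black ink, 255 = white


def layout_line(text, x0=0.0, y0=0.0, advance=12.0,
                pos_sigma=0.4, base_brightness=40,
                brightness_sigma=10.0, seed=None):
    """Place each character of `text` on a line with small random jitter.

    Assumed noise model: independent Gaussian offsets on x and y, and
    Gaussian noise on the ink brightness, clipped to the 0..255 range.
    """
    rng = random.Random(seed)
    glyphs = []
    for i, ch in enumerate(text):
        # Nominal position on a fixed-advance grid plus positional noise.
        x = x0 + i * advance + rng.gauss(0.0, pos_sigma)
        y = y0 + rng.gauss(0.0, pos_sigma)
        # Noisy ink brightness, clipped to a valid 8-bit value.
        b = round(base_brightness + rng.gauss(0.0, brightness_sigma))
        b = max(0, min(255, b))
        glyphs.append(GlyphAnnotation(ch, x, y, b))
    return glyphs


if __name__ == "__main__":
    for g in layout_line("OCR test", seed=0):
        print(g)
```

The returned list doubles as a ground-truth annotation: each record pairs a character with its position, which is the kind of information an OCR training or evaluation pipeline would need.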
Detail of publication
Title: | Generation of Synthetic Images of Full-Text Documents |
---|---|
Author: | Lukáš Bureš ; Petr Neduchal ; Miroslav Hlaváč ; Marek Hrúz |
Language: | English |
Year: | 2018 |
Type of publication: | Papers in proceedings of reviewed conferences |
Title of journal or book: | 20th International Conference on Speech and Computer, SPECOM 2018 |
Series: | Lecture Notes in Artificial Intelligence, LNAI 11096 |
Page: | 68 - 75 |
DOI: | 10.1007/978-3-319-99579-3_8 |
ISBN: | 978-3-319-99578-6 |
ISSN: | 0302-9743 |
Publisher: | Springer Nature Switzerland AG |
Date: | 18 Sep 2018 - 22 Sep 2018 |
Keywords
Generating images, Character recognition, Computer vision, Machine learning
BibTeX
@INPROCEEDINGS{LukasBures_2018_Generationof,
  author = {Luk\'{a}\v{s} Bure\v{s} and Petr Neduchal and Miroslav Hlav\'{a}\v{c} and Marek Hr\'{u}z},
  title = {Generation of Synthetic Images of Full-Text Documents},
  year = {2018},
  publisher = {Springer Nature Switzerland AG},
  booktitle = {20th International Conference on Speech and Computer, SPECOM 2018},
  pages = {68-75},
  series = {Lecture Notes in Artificial Intelligence, LNAI 11096},
  ISBN = {978-3-319-99578-6},
  ISSN = {0302-9743},
  doi = {10.1007/978-3-319-99579-3_8},
  url = {http://www.kky.zcu.cz/en/publications/LukasBures_2018_Generationof},
}