Localization of Text in Photorealistic Images

Research output

Abstract

Detection and localization of text in photorealistic images is a difficult problem that is not yet completely solved. We propose an approach to this problem based on semantic image segmentation, in which text characters are treated as the objects to be segmented. This paper proposes a network architecture for text localization, describes the procedure for forming the training set, and considers an image pre-processing algorithm that reduces the amount of data to be processed and simplifies segmentation of the “background” object. The network architecture is a modification of the well-known DeepLabv3 network that takes into account the specifics of images of text characters. The proposed method is able to determine the location of text characters in images with acceptable accuracy. Experimental assessment of text localization quality by the Intersection over Union (IoU) criterion showed that the accuracy obtained is sufficient for subsequent text recognition.
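The IoU criterion named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the authors compute it, so this example assumes pixel-level binary masks (1 = text, 0 = background). The function name `mask_iou`, the `Mask` alias, and the sample masks are hypothetical and are not taken from the paper.

```python
from typing import Sequence

# Assumed representation: a binary mask as rows of 0/1 values (1 = text pixel).
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, truth: Mask) -> float:
    """Return Intersection over Union of two binary masks of equal shape."""
    same_rows = len(pred) == len(truth)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels marked as text in both masks
    union = 0         # pixels marked as text in at least one mask
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p, t = bool(p), bool(t)
            intersection += p and t
            union += p or t

    # Convention chosen for this sketch: two empty masks agree perfectly.
    return intersection / union if union else 1.0


# Hypothetical 3x4 masks: the predicted region is shifted one column left.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
ground_truth = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

iou = mask_iou(predicted, ground_truth)  # 2 shared pixels / 6 pixels in the union
```

A value of 1.0 means the predicted text region coincides exactly with the ground truth, and 0.0 means the two regions do not overlap. The score at which localization becomes "sufficient for subsequent text recognition" is not stated in the abstract.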

Original language: English
Title of host publication: Computational Science and Its Applications – ICCSA 2019
Subtitle of host publication: Conference proceedings
Editors: Beniamino Murgante, Osvaldo Gervasi, Elena Stankova, Vladimir Korkhov, Sanjay Misra, Carmelo Torre, Eufemia Tarantino, David Taniar, Ana Maria A.C. Rocha, Bernady O. Apduhan
Publisher: Springer
Pages: 825-834
ISBN (Print): 9783030243043
DOIs: https://doi.org/10.1007/978-3-030-24305-0_63
Publication status: Published - 2019
Event: 19th International Conference on Computational Science and Its Applications, ICCSA 2019 - Saint Petersburg
Duration: 1 Jul 2019 → 4 Jul 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11622 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Computational Science and Its Applications, ICCSA 2019
Country: Russian Federation
City: Saint Petersburg
Period: 1/07/19 → 4/07/19

Fingerprint

Network architecture
Image segmentation
Image processing
Semantics
Text
Preprocessing
Union
Segmentation
Intersection
Sufficient
Experimental Results
Character

Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Grishkin, V., Ebral, A., & Iakushkin, O. (2019). Localization of Text in Photorealistic Images. In B. Murgante, O. Gervasi, E. Stankova, V. Korkhov, S. Misra, C. Torre, E. Tarantino, D. Taniar, A. M. A. C. Rocha, ... B. O. Apduhan (Eds.), Computational Science and Its Applications – ICCSA 2019: Conference proceedings (pp. 825-834). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11622 LNCS). Springer. https://doi.org/10.1007/978-3-030-24305-0_63
Grishkin, Valery; Ebral, Alexander; Iakushkin, Oleg. / Localization of Text in Photorealistic Images. Computational Science and Its Applications – ICCSA 2019: Conference proceedings. editor / Beniamino Murgante; Osvaldo Gervasi; Elena Stankova; Vladimir Korkhov; Sanjay Misra; Carmelo Torre; Eufemia Tarantino; David Taniar; Ana Maria A.C. Rocha; Bernady O. Apduhan. Springer, 2019. pp. 825-834 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{5bf984879eac4fcc97fb1011547c3340,
title = "Localization of Text in Photorealistic Images",
abstract = "Detection and localization of text in photorealistic images is a difficult problem that is not yet completely solved. We propose an approach to this problem based on semantic image segmentation, in which text characters are treated as the objects to be segmented. This paper proposes a network architecture for text localization, describes the procedure for forming the training set, and considers an image pre-processing algorithm that reduces the amount of data to be processed and simplifies segmentation of the “background” object. The network architecture is a modification of the well-known DeepLabv3 network that takes into account the specifics of images of text characters. The proposed method is able to determine the location of text characters in images with acceptable accuracy. Experimental assessment of text localization quality by the Intersection over Union (IoU) criterion showed that the accuracy obtained is sufficient for subsequent text recognition.",
keywords = "Convolution neural network, Semantic segmentation, Text localization",
author = "Valery Grishkin and Alexander Ebral and Oleg Iakushkin",
year = "2019",
doi = "10.1007/978-3-030-24305-0_63",
language = "English",
isbn = "9783030243043",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
pages = "825--834",
editor = "Beniamino Murgante and Osvaldo Gervasi and Elena Stankova and Vladimir Korkhov and Sanjay Misra and Carmelo Torre and Eufemia Tarantino and David Taniar and Rocha, {Ana Maria A.C.} and Apduhan, {Bernady O.}",
booktitle = "Computational Science and Its Applications – ICCSA 2019",
address = "Germany",
}

Grishkin, V, Ebral, A & Iakushkin, O 2019, Localization of Text in Photorealistic Images. in B Murgante, O Gervasi, E Stankova, V Korkhov, S Misra, C Torre, E Tarantino, D Taniar, AMAC Rocha & BO Apduhan (eds), Computational Science and Its Applications – ICCSA 2019: Conference proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11622 LNCS, Springer, pp. 825-834, Saint Petersburg, 1/07/19. https://doi.org/10.1007/978-3-030-24305-0_63

Localization of Text in Photorealistic Images. / Grishkin, Valery; Ebral, Alexander; Iakushkin, Oleg.

Computational Science and Its Applications – ICCSA 2019: Conference proceedings. ed. / Beniamino Murgante; Osvaldo Gervasi; Elena Stankova; Vladimir Korkhov; Sanjay Misra; Carmelo Torre; Eufemia Tarantino; David Taniar; Ana Maria A.C. Rocha; Bernady O. Apduhan. Springer, 2019. p. 825-834 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11622 LNCS).

TY - GEN

T1 - Localization of Text in Photorealistic Images

AU - Grishkin, Valery

AU - Ebral, Alexander

AU - Iakushkin, Oleg

PY - 2019

Y1 - 2019

N2 - Detection and localization of text in photorealistic images is a difficult problem that is not yet completely solved. We propose an approach to this problem based on semantic image segmentation, in which text characters are treated as the objects to be segmented. This paper proposes a network architecture for text localization, describes the procedure for forming the training set, and considers an image pre-processing algorithm that reduces the amount of data to be processed and simplifies segmentation of the “background” object. The network architecture is a modification of the well-known DeepLabv3 network that takes into account the specifics of images of text characters. The proposed method is able to determine the location of text characters in images with acceptable accuracy. Experimental assessment of text localization quality by the Intersection over Union (IoU) criterion showed that the accuracy obtained is sufficient for subsequent text recognition.

AB - Detection and localization of text in photorealistic images is a difficult problem that is not yet completely solved. We propose an approach to this problem based on semantic image segmentation, in which text characters are treated as the objects to be segmented. This paper proposes a network architecture for text localization, describes the procedure for forming the training set, and considers an image pre-processing algorithm that reduces the amount of data to be processed and simplifies segmentation of the “background” object. The network architecture is a modification of the well-known DeepLabv3 network that takes into account the specifics of images of text characters. The proposed method is able to determine the location of text characters in images with acceptable accuracy. Experimental assessment of text localization quality by the Intersection over Union (IoU) criterion showed that the accuracy obtained is sufficient for subsequent text recognition.

KW - Convolution neural network

KW - Semantic segmentation

KW - Text localization

UR - http://www.scopus.com/inward/record.url?scp=85068593365&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-24305-0_63

DO - 10.1007/978-3-030-24305-0_63

M3 - Conference contribution

AN - SCOPUS:85068593365

SN - 9783030243043

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 825

EP - 834

BT - Computational Science and Its Applications – ICCSA 2019

A2 - Murgante, Beniamino

A2 - Gervasi, Osvaldo

A2 - Stankova, Elena

A2 - Korkhov, Vladimir

A2 - Misra, Sanjay

A2 - Torre, Carmelo

A2 - Tarantino, Eufemia

A2 - Taniar, David

A2 - Rocha, Ana Maria A.C.

A2 - Apduhan, Bernady O.

PB - Springer

ER -

Grishkin V, Ebral A, Iakushkin O. Localization of Text in Photorealistic Images. In Murgante B, Gervasi O, Stankova E, Korkhov V, Misra S, Torre C, Tarantino E, Taniar D, Rocha AMAC, Apduhan BO, editors, Computational Science and Its Applications – ICCSA 2019: Conference proceedings. Springer. 2019. p. 825-834. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-24305-0_63