A Comparative Approach for Overlay Text Detection and Extraction from Complex Video Scene
Abstract
Overlay text carries important semantic clues for
video content analysis tasks such as video information retrieval and
summarization, since the content of a scene or the editor's
intention can be well represented by inserted text. Most
previous approaches to extracting overlay text from videos
rely on low-level features such as edge, color, and texture
information. However, existing methods have difficulty
handling text with varying contrast or text inserted into a complex
background. In this paper, we propose a novel framework to
detect and extract overlay text from video scenes. Based
on our observation that transient colors exist between
inserted text and its adjacent background, a transition map is first
generated. Candidate regions are then extracted by a reshaping
method, and the overlay text regions are determined based on
the occurrence of overlay text in each candidate.
This paper presents a comparative study of edge-based and
intensity-change-based models that are computationally fast and
invariant to basic transformations such as horizontal and vertical
translation and scaling. We demonstrate that intensity change
context can be used to detect overlay text regions; the intensity
context describes all boundary points of the shape with respect to
any single boundary point.
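The pipeline summarized above (transition map, then candidate-region extraction) can be sketched in a few lines of plain Python. This is a minimal illustration only: the threshold values, the vertical intensity-difference test, and the row-band grouping are assumptions made for the sketch, not the paper's exact transition-map or reshaping method.

```python
def transition_map(gray, thresh=40):
    """Mark pixels whose vertical intensity change exceeds `thresh`,
    approximating the transient colors between overlay text and its
    adjacent background. `gray` is a list of rows of values in 0-255;
    `thresh` is a hypothetical tuning parameter."""
    h, w = len(gray), len(gray[0])
    tmap = [[False] * w for _ in range(h)]
    for y in range(1, h):
        for x in range(w):
            if abs(gray[y][x] - gray[y - 1][x]) >= thresh:
                tmap[y][x] = True
    return tmap

def candidate_bands(tmap, min_pixels=20):
    """Rough stand-in for the reshaping step: return (top, bottom)
    row bands whose rows each contain at least `min_pixels`
    transition pixels, i.e. horizontal strips likely to hold text."""
    bands, start = [], None
    for y, row in enumerate(tmap):
        dense = sum(row) >= min_pixels
        if dense and start is None:
            start = y                      # band opens
        elif not dense and start is not None:
            bands.append((start, y - 1))   # band closes
            start = None
    if start is not None:
        bands.append((start, len(tmap) - 1))
    return bands

# Usage on a synthetic frame: a bright strip on a dark background
# produces transition pixels along its top and bottom edges.
img = [[0] * 80 for _ in range(60)]
for y in range(20, 30):
    for x in range(10, 70):
        img[y][x] = 255
bands = candidate_bands(transition_map(img))
```

A real implementation would also use horizontal differences and a verification pass over each candidate, as the framework's occurrence-based determination step implies.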
Sandip Foundation's International Journal on Emerging Trends in Technology (IJETT)
IJETT | ISSN: 2455-0124 (Online) | 2350-0808 (Print) | GIF: 0.456 | April 2017 | Volume 4 | Issue 1 | 7093