Title: | EXTRACTION OF CAPTION AND ANIMATED TEXT FROM VIDEOS/IMAGES |
Authors: | Tehsin, Samabia |
Keywords: | Technology |
Issue Date: | 2014 |
Publisher: | National University of Sciences and Technology, Islamabad |
Abstract: | Textual information embedded in multimedia can provide a vital tool for indexing and retrieval. The text extraction process has many inherent problems due to variation in font size, color, background and resolution. Text detection, localization and tracking are the most challenging phases of the text extraction process, and extraction results are highly dependent upon them. This dissertation focuses on text detection, localization and tracking because of their fundamental importance. A bio-inspired text detection, localization and tracking method is developed and presented. The anthropocentric approach to text detection is studied and mathematically modeled to design a text extraction process. A novel text segmentation method is proposed that covers a wide range of text scales, colors and font styles. The segmentation procedure consists of adopted K-means clustering and a fuzzy-based perceptual merging process. Two effective feature vectors are introduced for classifying text and non-text objects. The first feature vector is based upon the human text detection system and is mathematically represented by the Radon transform of the text candidate objects. The second feature vector is derived from a detailed geometrical analysis of the text contents. The union of the two feature vectors is used to classify text and non-text objects with a support vector machine (SVM). A fuzzy-based text tracking mechanism is also introduced that can handle static as well as dynamic text appearing in videos. Dynamic text includes simple animations such as vertical and horizontal scrolling, as well as complex ones such as random movement, scale change and zoom in/out. Text detection and localization results are evaluated on three publicly available datasets, namely ICDAR 2011, ICDAR 2013 and IPC-Artificial text, and are compared with state-of-the-art techniques. The comparison demonstrates the superiority of the presented research. A text tracking dataset is also developed, and the proposed tracking algorithm is tested on it, demonstrating the applicability of the proposed tracking technique. |
URI: | http://142.54.178.187:9060/xmlui/handle/123456789/4755 |
Appears in Collections: | Thesis |
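The abstract names K-means clustering as the first stage of the segmentation procedure. As a rough illustration only, the sketch below runs plain Lloyd's K-means over RGB pixel values to separate candidate text colors from the background. It is not the dissertation's variant, whose details the abstract does not give; the function names (`sq_dist`, `kmeans_colors`), the evenly spaced seeding and the iteration cap are all assumptions made for this example, and the fuzzy perceptual merging step is not shown.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two color vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_colors(
    pixels: List[Pixel], k: int, iters: int = 20
) -> Tuple[List[Pixel], List[int]]:
    """Plain Lloyd's K-means over RGB pixels (hypothetical helper).

    Returns the final centroids and one cluster label per pixel.
    """
    # Assumption: deterministic seeding from k evenly spaced pixels.
    step = max(1, len(pixels) // k)
    centroids = [pixels[min(i * step, len(pixels) - 1)] for i in range(k)]
    labels = [0] * len(pixels)

    for _ in range(iters):
        # Assignment step: attach each pixel to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in pixels
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(channel) / len(members) for channel in zip(*members)
                )
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels

    return centroids, labels


# Toy input: three near-white "background" pixels and three near-black "text" pixels.
sample = [
    (250, 250, 250), (245, 248, 252), (10, 12, 8),
    (5, 5, 5), (255, 255, 255), (0, 0, 0),
]
centroids, labels = kmeans_colors(sample, k=2)
```

In a fuller pipeline of the kind the abstract outlines, each resulting cluster would yield connected candidate regions that are then merged and passed on to feature extraction and SVM classification.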