Figure - available from: Journal of Ambient Intelligence and Humanized Computing
Basic framework of text search engine

Source publication
Article
Full-text available
Traditional automatic positioning of text in multimedia video depends too heavily on manual work; its main drawbacks are strong subjectivity, a heavy workload, and slow processing speed. By establishing a basic framework for multimedia video text search, the feature vectors of multimedi...

Citations

Article
Full-text available
This article presents a new technique for detecting text within a complex graphical background, extracting it, and enhancing it so that it can be recognized easily by optical character recognition (OCR). The technique uses a deep neural network for feature extraction and for classifying frames as text-containing or not. An error handling and correction (EHC) technique resolves classification errors, and a multiple frame integration (MFI) algorithm extracts the graphical text from its background. Text enhancement is performed by adjusting contrast, minimizing noise, and increasing pixel resolution. A standalone Component-Off-The-Shelf (COTS) software package is used to recognize the text characters and to qualify the system's performance. The proposed solution generalizes to multilingual text; a newly created dataset of videos in different languages was collected for this purpose to serve as a benchmark. A new HMVGG16 convolutional neural network (CNN) classifies frames as text-containing or non-text-containing with an accuracy of 98%. The system's weighted-average caption extraction accuracy is 96.15%, and the correctly detected characters (CDC) average recognition accuracy using the Abbyy SDK OCR engine is 97.75%.
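The multiple frame integration step exploits the fact that overlay captions stay static across consecutive frames while the background changes. One common way to realize this idea, sketched below in plain Python as a pixel-wise temporal average over grayscale frames, is an assumption for illustration only; the paper's actual MFI algorithm may differ:

```python
def integrate_frames(frames):
    """Pixel-wise temporal average over equal-sized grayscale frames.

    Static caption pixels keep their value across frames, while
    moving background pixels are smoothed toward their mean, which
    raises the relative contrast of the overlay text.

    frames: list of 2-D lists of intensities in [0, 255].
    """
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(w)]
            for y in range(h)]

# Toy example: the first pixel (static "text") keeps full brightness,
# the second (flickering background) averages toward mid-gray.
frames = [
    [[255, 0]],
    [[255, 128]],
    [[255, 255]],
]
avg = integrate_frames(frames)
```

In practice this would run over a window of frames in which the caption is known to persist; a temporal median instead of the mean is more robust to brief occlusions.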