Figure - available from: Journal of Ambient Intelligence and Humanized Computing
Basic framework of text search engine

Source publication
Article
Full-text available
Traditional automatic positioning of text in multimedia video depends too heavily on manual work; its main drawbacks are strong subjectivity, a heavy workload, and slow processing speed. By establishing a basic framework for multimedia video text search, the feature vectors of multimedi...

Citations

Article
Full-text available
This article presents a new technique for detecting text within a complex graphical background, extracting it, and enhancing it so that it can be recognized easily by optical character recognition (OCR). The technique uses a deep neural network for feature extraction and for classifying frames as text-containing or not. An error handling and correction (EHC) technique resolves classification errors, and a multiple frame integration (MFI) algorithm extracts the graphical text from its background. Text enhancement is performed by adjusting contrast, minimizing noise, and increasing pixel resolution. A standalone Component-Off-The-Shelf (COTS) software package is used to recognize the text characters and to qualify the system's performance. The proposed solution generalizes to multilingual text; a newly created dataset of videos in different languages was collected for this purpose to serve as a benchmark. A new HMVGG16 convolutional neural network (CNN) classifies frames as text-containing or non-text-containing with an accuracy of 98%. The system's weighted-average caption extraction accuracy is 96.15%, and the correctly detected characters (CDC) average recognition accuracy using the Abbyy SDK OCR engine is 97.75%.
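The multiple frame integration step exploits the fact that overlay captions stay static across consecutive frames while the background changes. One common way to realize this idea, sketched below in plain Python as a pixel-wise temporal average over grayscale frames, is an assumption for illustration only; the paper's actual MFI algorithm may differ:

```python
def integrate_frames(frames):
    """Pixel-wise temporal average over equal-sized grayscale frames.

    Static caption pixels keep their value across frames, while
    moving background pixels are smoothed toward their mean, which
    raises the relative contrast of the overlay text.

    frames: list of 2-D lists of intensities in [0, 255].
    """
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(w)]
            for y in range(h)]

# Toy example: the first pixel (static "text") keeps full brightness,
# the second (flickering background) averages toward mid-gray.
frames = [
    [[255, 0]],
    [[255, 128]],
    [[255, 255]],
]
avg = integrate_frames(frames)
```

In practice this would run over a window of frames in which the caption is known to persist; a temporal median instead of the mean is more robust to brief occlusions.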