Video character end-to-end detection and recognition method based on deep learning

A deep-learning-based text detection technology, applied to neural learning methods, character and pattern recognition, instruments, etc. It addresses the problem that inaccurate detection results easily interfere with recognition results.

Active Publication Date: 2021-09-07
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a deep-learning-based method for end-to-end detection and recognition of video text. By sharing a feature extraction network, the method addresses a weakness of existing two-stage video text detection and recognition methods, in which an inaccurate detection result easily interferes with the recognition result, while also improving the efficiency of network inference.



Examples


Embodiment Construction

[0026] To make the object, technical solution, and advantages of the present invention clearer, the implementation of the invention is described in further detail below with reference to the accompanying drawings.

[0027] Existing two-stage methods, in which text detection and text recognition are separate, are prone to recognition errors caused by inaccurate detection. In the embodiment of the present invention, text detection and text recognition are therefore integrated, and some computation is saved by sharing the feature extraction network. Compared with feeding the original image to recognition, the feature map covers a wider range of information. This reduces the cases in which part of a text line is missed because of an inaccurate detection result, which would in turn make the recognition result wrong.
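The sharing described in [0027] can be sketched structurally as follows. This is a minimal illustration, not the patented network: the class name `EndToEndTextSpotter`, the `crop` helper, the box format, and the three callables are all hypothetical stand-ins, and cropping the feature map by detected box is an assumption about how the recognition branch receives its input. The point shown is only that the feature extractor runs once per frame and that recognition reads the shared feature map rather than the original image.

```python
from typing import Callable, List, Tuple

FeatureMap = List[List[float]]
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in feature-map coordinates (assumed format)


def crop(features: FeatureMap, box: Box) -> FeatureMap:
    """Cut the region of the shared feature map covered by one detected box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in features[y0:y1]]


class EndToEndTextSpotter:
    """Hypothetical wrapper: one shared backbone feeding two branches."""

    def __init__(
        self,
        backbone: Callable[[List[List[float]]], FeatureMap],
        detect_head: Callable[[FeatureMap], List[Box]],
        recognize_head: Callable[[FeatureMap], str],
    ) -> None:
        self.backbone = backbone
        self.detect_head = detect_head
        self.recognize_head = recognize_head

    def __call__(self, image: List[List[float]]) -> List[Tuple[Box, str]]:
        # Feature extraction happens exactly once per frame.
        features = self.backbone(image)
        # The detection branch proposes text regions on the shared features.
        boxes = self.detect_head(features)
        # The recognition branch reads the same features, not the raw image.
        return [(box, self.recognize_head(crop(features, box))) for box in boxes]
```

In a two-stage pipeline the recognizer would instead run its own feature extractor on image crops, so the backbone cost would be paid twice and a tight or shifted crop would remove context that the recognizer cannot recover.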

[0028] Referring to Figure 1, in a possible implementation, the method for end-to-end detection and recognition of video text based on d...



Abstract

The invention discloses a deep-learning-based method for end-to-end detection and recognition of video text, and belongs to the technical field of video text processing. The method comprises the following steps: performing image size normalization on each video frame of the video sequence segment to be recognized, so that the preprocessed image size matches the input of an end-to-end text detection and recognition network; and sequentially inputting the preprocessed images into the end-to-end text detection and recognition network to obtain the text recognition result of the video sequence segment. The invention realizes end-to-end detection and recognition of video text, avoids inherent defects such as the error accumulation caused by inconsistent objectives across multiple modules, and reduces engineering complexity. In addition, the network structure is optimized through a shared feature extraction network, and a feature map with a relatively large receptive field is input into the recognition branch of the network. Compared with using the original image as input, the feature map can contain information from a larger range, which improves recognition accuracy.
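The size normalization step in the abstract can be sketched as below. This is an illustration under stated assumptions: frames are modeled as nested lists of grayscale values, nearest-neighbor sampling is chosen only for simplicity, and the function names `resize_nearest` and `preprocess_clip` are hypothetical. The source does not state the interpolation method or the network's input size.

```python
from typing import List, Sequence

Frame = List[List[int]]  # one grayscale frame as rows of pixel values (assumed layout)


def resize_nearest(frame: Frame, out_h: int, out_w: int) -> Frame:
    """Resize one frame to (out_h, out_w) by nearest-neighbor sampling."""
    in_h, in_w = len(frame), len(frame[0])
    return [
        # Map each output pixel back to its source pixel with integer arithmetic.
        [frame[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def preprocess_clip(frames: Sequence[Frame], out_h: int, out_w: int) -> List[Frame]:
    """Normalize every frame of a video segment to the same size, in order."""
    return [resize_nearest(f, out_h, out_w) for f in frames]
```

After this step every frame has the same shape regardless of the source resolution, so the frames can be passed to the network one after another in their original order.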

Description

Technical field

[0001] The invention relates to the technical field of video text processing, and in particular to a deep-learning-based method for end-to-end detection and recognition of video text.

Background

[0002] In recent years, with the digitalization of society and the widespread dissemination of multimedia information, extracting information from massive volumes of video and images has become an urgent problem. Video text detection and recognition technology is well suited to large-scale video content extraction and review. Compared with manual work, using it for video content extraction and review can greatly improve efficiency and reduce labor costs.

[0003] Text detection refers to applying a text detection algorithm to an input image to determine whether the image contains text and, if it does, to further locate the position whe...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/32, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/084, G06N3/044, G06N3/045, G06F18/241, G06F18/214
Inventor: 邓建华秦琪怡常为弘俞泉泉何佳霓杨杰李龙代铮郑凯文赵建恒陶泊昊苟晓攀肖正欣余坤陈翔蔡竟业
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA