
A method and system for end-to-end indefinite-length character recognition

A text recognition technology, applied in the field of image recognition, that addresses the high hardware cost, slow running speed, and below-practical performance of earlier approaches, with the effect of improving efficiency and accuracy.

Inactive Publication Date: 2019-02-15
FOCUS TECH
4 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

China's research on OCR technology started relatively late. Research on recognizing digits, English letters, and symbols began only in the 1970s, and research on Chinese character recognition began in the late 1970s. In the early stage, many research institutions launched Chinese OCR products one after another, but early OCR software failed to meet practical requirements because of factors such as recognition rate and productization.
At the same time, high hardware costs and slow running speed kept the technology from reaching a practical level.



Examples


Embodiment Construction

[0033] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific embodiments.

[0034] Referring to Figure 1, the method flow of this embodiment of the present invention comprises the following steps:

[0035] Step 11: Prepare the data sets, which mainly comprise general images for text detection and fixed-length, fixed-size images for text recognition, and annotate the data with text positions and text content. In this embodiment, the text detection data mainly uses the VOC2007 data set, 6,000 pictures in total, in which the text is annotated by segmenting it into small text boxes with a fixed width of 16 pixels and a variable height. The text annotations are stored in XML file format. The data set used for text recognition has two main parts: one part is randomly generated from a Chinese corpus by varying font, size, grayscale, blur, perspective, stretching, etc., a ...
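
The annotation scheme in Step 11 (text split into boxes of fixed 16-pixel width and variable height, stored as XML) can be illustrated with a minimal sketch. This is not the patent's own tooling: the function names `split_into_slices` and `to_voc_xml`, the `text` label, the choice to start slicing at the left edge of the line box, and the VOC-style element names are all assumptions made for illustration.

```python
import math
import xml.etree.ElementTree as ET

SLICE_WIDTH = 16  # fixed box width in pixels, as stated in Step 11


def split_into_slices(xmin, ymin, xmax, ymax, width=SLICE_WIDTH):
    """Split one text-line box into fixed-width slices.

    Assumption: slicing starts at xmin and every slice keeps the full
    fixed width, so the last slice may extend past xmax.
    The height (ymin..ymax) is carried over unchanged, i.e. variable.
    """
    count = math.ceil((xmax - xmin) / width)
    return [(xmin + i * width, ymin, xmin + (i + 1) * width, ymax)
            for i in range(count)]


def to_voc_xml(filename, img_width, img_height, boxes, label="text"):
    """Serialize boxes into a VOC-style annotation string (hypothetical layout)."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(img_width)
    ET.SubElement(size, "height").text = str(img_height)
    for box in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            ET.SubElement(bndbox, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

For example, `to_voc_xml("sample.jpg", 640, 480, split_into_slices(10, 5, 50, 30))` would describe a 40-pixel-wide text line as three 16-pixel slices; the file name and image size here are placeholders.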


Abstract

The invention discloses an end-to-end indefinite-length character recognition method and system. A text detection model and a text recognition model are trained with deep neural networks on the pictures and text character labels in a data set. The text detection model locates the position of the text in a picture, and the text recognition model identifies the specific content of that text. Combining the two models both recognizes the characters in a picture and locates their positions. The approach can be applied to traditional character recognition and certificate recognition, and can greatly improve the efficiency and accuracy of character input.
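
The way the two models are combined can be sketched as a simple two-stage pipeline. The sketch below is only an illustration under assumptions: `detect` and `recognize` are hypothetical callables standing in for the trained detection and recognition models, the image is represented as nested lists of pixels, and `TextResult`, `crop`, and `recognize_image` are names introduced here, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (xmin, ymin, xmax, ymax)
Image = Sequence[Sequence[int]]   # rows of pixel values; stand-in image type


@dataclass
class TextResult:
    box: Box    # where the text was found
    text: str   # what the text says


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region described by box out of the image."""
    xmin, ymin, xmax, ymax = box
    return [list(row[xmin:xmax]) for row in image[ymin:ymax]]


def recognize_image(
    image: Image,
    detect: Callable[[Image], List[Box]],    # hypothetical detection model
    recognize: Callable[[Image], str],       # hypothetical recognition model
) -> List[TextResult]:
    """Stage 1 locates text regions; stage 2 reads each cropped region."""
    results = []
    for box in detect(image):
        results.append(TextResult(box=box, text=recognize(crop(image, box))))
    return results
```

Keeping the two models behind plain callables means either stage could be swapped or retrained independently, which matches the abstract's description of two separately trained models used together.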

Description

Technical field

[0001] The invention relates to the field of image recognition, in particular to an end-to-end variable-length text recognition method and system.

Background technique

[0002] The concept of optical text recognition was first proposed by German scientist Tauschek in 1929, and later American scientist Handel also proposed the idea of using technology to recognize text. Casey and Nagy of IBM Corporation were the first to study the recognition of printed Chinese characters. In 1966, they published the first article on Chinese character recognition, using the template matching method to recognize 1000 printed Chinese characters.

[0003] As early as the 1960s and 1970s, countries around the world began to conduct research on OCR (Optical Character Recognition). In the early stages, most of the research was on text recognition methods, and the recognized text was limited to the digits 0 to 9. Take Japan, which also has square...


Application Information

IPC(8): G06K9/62, G06K9/32
CPC: G06V20/62, G06V30/10, G06F18/217, G06F18/214
Inventors: 吴苛, 房鹏展
Owner: FOCUS TECH