Method, memory and device for identifying and positioning underline in text image

Applied in the field of OCR recognition, this technique for recognizing and positioning underlines in text images addresses three problems of existing approaches: reduced accuracy, complex convolutional network construction, and unsuitability for small-scale scenarios.

Status: Inactive
Publication date: 2022-06-07
维正知识产权科技有限公司

AI Technical Summary

Problems solved by technology

[0003] Some existing techniques detect straight lines in the data picture with the Hough transform, but some characters contain horizontal strokes that resemble underlines, which significantly reduces accuracy. See Figure 1: for example, because the word "check" has a horizontal line at the bottom, it causes errors in the detection results.
[0004] In addition, as shown in Figure 2, existing techniques include a method of extracting underlines through a complex convolutional network. Although extraction through a convolutional network is highly accurate, the network is complicated to construct and is not suitable for some small-scale scenarios.


Examples

Embodiment

[0050] Embodiment: a method for detecting and positioning underlines in text, as shown in Figure 3, comprising the following steps:

[0051] S201. Obtain a data picture;

[0052] The acquired data picture is binarized so that it contains only black pixels and white pixels, where a black pixel represents the presence of an input value and a white pixel represents its absence.

[0053] S202. Use an OCR engine to recognize the characters in the data picture and obtain the pixel width of each recognized character;

[0054] The OCR engine can be the TesserOCR engine. The data picture is input into the TesserOCR engine, which constructs a two-dimensional coordinate system on the data picture and outputs the position coordinate values of each character in the data picture.

[0055] The position coordinate value of a character is the coordinate value of the rectangular outer frame that encloses the character, and the outer frame coordina...
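
Steps S201 and S202 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the threshold of 128, the `Box` type, the (left, top, right, bottom) corner layout and the sample boxes are all assumptions made for illustration, since the source text is truncated before it specifies the outer frame format. The OCR call itself is omitted; the boxes stand in for whatever the OCR engine returns.

```python
from typing import NamedTuple

import numpy as np


class Box(NamedTuple):
    """Hypothetical outer frame of one character (corner layout is an assumption)."""
    left: int
    top: int
    right: int
    bottom: int


def binarize(gray, threshold=128):
    """S201: return a boolean array, True = black (input present), False = white."""
    # The threshold value is an assumption; the source does not give one.
    return np.asarray(gray) < threshold


def pixel_width(box):
    """S202: pixel width of a character, taken from its outer frame."""
    return box.right - box.left


# A tiny grayscale sample standing in for the data picture.
sample = np.array([[0, 200, 255],
                   [90, 130, 10]], dtype=np.uint8)
ink = binarize(sample)

# Illustrative boxes standing in for the OCR engine's output.
boxes = [Box(10, 5, 34, 40), Box(36, 5, 61, 40), Box(70, 8, 82, 40)]
widths = [pixel_width(b) for b in boxes]
```

Keeping the binarized picture as a boolean array makes the later per-pixel traversal a simple scan over the `True` entries.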


Abstract

The invention discloses a method, a memory and a device for identifying and positioning underlines in a text image, and relates to the technical field of OCR (Optical Character Recognition). The method comprises the following steps: recognizing the characters in a data picture and obtaining the pixel width of each recognized character; grouping characters with adjacent pixel widths into one class and calculating the mathematical expectation of each class to obtain a character pixel width array; establishing a slope-intercept accumulator array based on a variable intercept b and a slope k; traversing each pixel of the data picture to accumulate pixel coordinates into the slope-intercept accumulator array; generating unit line segment pixels based on the character pixel width array and, within each component of the slope-intercept accumulator array, combining pixels with continuous coordinates whose quantity equals that of the unit line segment pixels into a line segment representation equation; and combining the line segment representation equations within each accumulator array component to obtain the line segment representation equations of all underlines. The method has the advantages of high accuracy and suitability for small-scale scenarios.
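
The steps in the abstract can be sketched roughly as follows. This is an illustrative reading, not the patented procedure: the function names, the `gap` tolerance used to decide which widths are "adjacent", the single candidate slope, the use of the smallest class width as the unit segment length, and the "at least as long as the unit" acceptance rule are all assumptions.

```python
from collections import defaultdict

import numpy as np


def cluster_widths(widths, gap=2):
    """Group adjacent pixel widths into classes and return each class mean."""
    groups = []
    for w in sorted(widths):
        if groups and w - groups[-1][-1] <= gap:
            groups[-1].append(w)
        else:
            groups.append([w])
    return [sum(g) / len(g) for g in groups]


def accumulate(ink, slopes=(0.0,)):
    """Collect x coordinates of black pixels per (k, b), with y = k*x + b."""
    acc = defaultdict(list)
    ys, xs = np.nonzero(ink)
    for x, y in zip(xs.tolist(), ys.tolist()):
        for k in slopes:
            b = round(y - k * x)
            acc[(k, b)].append(x)
    return acc


def runs(xs, min_len):
    """Return (start, end) ranges of consecutive x values at least min_len long."""
    out = []
    start = prev = None
    for x in sorted(set(xs)):
        if start is None:
            start = prev = x
        elif x == prev + 1:
            prev = x
        else:
            if prev - start + 1 >= min_len:
                out.append((start, prev))
            start = prev = x
    if start is not None and prev - start + 1 >= min_len:
        out.append((start, prev))
    return out


# Binarized picture: a 25-pixel line on row 4 and a 4-pixel stroke on row 1.
ink = np.zeros((6, 40), dtype=bool)
ink[4, 5:30] = True
ink[1, 2:6] = True

width_array = cluster_widths([12, 12, 24, 26])
unit = min(width_array)  # assumed unit segment length
acc = accumulate(ink)
segments = {key: runs(xs, unit) for key, xs in acc.items()}
```

In this toy picture the long line on row 4 survives as one segment on the component (k=0.0, b=4), while the short stroke on row 1 is discarded because it is shorter than a character width. That is the intuition behind tying the minimum segment length to the recognized character widths instead of accepting every detected line.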

Description

Technical field

[0001] The invention relates to the technical field of OCR recognition, and more particularly to a method for identifying and positioning underlines in text images.

Background technique

[0002] In image text processing, it is often necessary to identify the underlines in the data and then fill in the required content on them.

[0003] In some existing techniques, straight lines in the data are often detected by the Hough transform, but some characters have horizontal lines similar to underlines, which has a relatively large impact on accuracy. See Figure 1: for example, because the word "check" has a horizontal line at the bottom, it causes errors in the detection results.

[0004] In addition, as shown in Figure 2, existing techniques include a method of extracting underlines through a complex convolutional network. Although extraction through a convolutional network is highly accurate, the con...


Application Information

IPC(8): G06V30/414, G06K9/62, G06V10/764
CPC: G06F18/241
Inventor: 刘落根
Owner: 维正知识产权科技有限公司