Text line extraction method and device
A text line extraction method and device, applied in the field of image processing, that address the low extraction efficiency, degraded extraction quality, and poor adaptability of text line extraction.
Examples
Example 1
[0110] Referring to Figure 1, a schematic flowchart of the text line extraction method provided in this embodiment, the method includes the following steps:
[0111] S101: Detect the characters in the document image and form candidate text boxes, each containing characters.
[0112] Note that this embodiment does not limit how the document image is obtained. For example, the document image may be produced by a user scanning or photographing a paper document. Nor does this embodiment limit the language of the characters in the document image; they may be Chinese, English, or characters of another language.
[0113] After the document image to be detected is obtained, the characters in it can first be detected using an existing or future character detection algorithm, so as to extract each candidate text box containing characters in the document image, where a candidate text box refers to the approxim...
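The text leaves the character detection algorithm open, so the following is only a minimal sketch of one possible stand-in: labelling 4-connected ink regions in a binarized image and taking the bounding box of each region as a candidate text box. The function name `candidate_boxes`, the 0/1 image representation, and the choice of connected-component labelling are all assumptions made for illustration, not the method claimed in the source.

```python
from collections import deque


def candidate_boxes(image):
    """Return bounding boxes (left, top, right, bottom) of 4-connected ink blobs.

    `image` is a list of rows; 1 marks ink and 0 marks background.
    This is an assumed stand-in for "a character detection algorithm".
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill one connected component and track its extent.
            left, top, right, bottom = c, r, c, r
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


# Tiny hypothetical image with three separate ink blobs.
image = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
boxes = candidate_boxes(image)
```

A real document image would normally need binarization and noise filtering first, and a learned detector could replace this step entirely without changing the later graph construction.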
Example 2
[0140] This embodiment describes a specific implementation of step S1021 of the first embodiment.
[0141] In this embodiment, after the candidate text boxes of the document image are formed in step S101 of the first embodiment, each candidate text box can be connected to one or more adjacent candidate text boxes by undirected connecting lines, forming an undirected graph. Each undirected connecting line between two candidate text boxes in the undirected graph corresponds to a weight, and that weight is represented by a distance metric value. In what follows, this embodiment takes one candidate text box in the document image as a reference to describe how it is connected to its adjacent candidate text boxes by undirected connecting lines; the other candidate text boxes are connected in a similar way and wil...
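As a rough illustration of the graph construction described above, the sketch below connects each candidate text box to its nearest neighbours and stores a distance value as the edge weight. The visible text does not say which distance metric or adjacency rule is used, so the Euclidean distance between box centers, the `k` nearest-neighbour rule, and the names `Box`, `center_distance`, and `build_graph` are all assumptions.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned candidate text box (hypothetical representation)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def center_distance(a, b):
    """Assumed distance metric: Euclidean distance between box centers."""
    (ax, ay), (bx, by) = a.center, b.center
    return hypot(ax - bx, ay - by)


def build_graph(boxes, k=2):
    """Connect each box to its k nearest boxes with weighted undirected edges.

    Returns a dict mapping (i, j), with i < j, to the edge weight.
    Storing the sorted index pair makes each edge undirected.
    """
    edges = {}
    for i, a in enumerate(boxes):
        others = [(center_distance(a, b), j) for j, b in enumerate(boxes) if j != i]
        for dist, j in sorted(others)[:k]:
            edges[(min(i, j), max(i, j))] = dist
    return edges


# Three boxes on one line and one box on a lower line (made-up coordinates).
boxes = [Box(0, 0, 10, 10), Box(12, 0, 10, 10), Box(24, 0, 10, 10), Box(0, 40, 10, 10)]
graph = build_graph(boxes, k=1)
```

With these made-up coordinates, the edges within the top row get a small weight and the edge down to the lower box gets a large one, which is the kind of contrast a later line-breaking step can exploit.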
Example 3
[0206] This embodiment describes two specific implementations of step S1022 of the first embodiment.
[0207] In a first optional implementation, referring to Figure 6, which shows one of the schematic flowcharts provided by this embodiment for forming one or more target text regions by breaking at least one of the connecting lines between the candidate text boxes, the process includes the following steps:
[0208] S601: Find the N leftmost candidate text boxes in the document image, where N≥1.
[0209] In this embodiment, after step S102, each candidate text box is connected to at least one adjacent candidate text box by an undirected connecting line, constructing an undirected graph such as the one shown in Figure 5. When the minimum spanning tree algorithm is used to generate the undirected graph, the entire undirected graph corresponds to a complete tree. From Figure 5 it can be seen that most of the adj...
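The two ideas in this passage, reducing the weighted graph to a minimum spanning tree and picking the N leftmost candidate text boxes, can be sketched as follows. The source does not say which minimum spanning tree algorithm is used or how "leftmost" is measured, so Kruskal's algorithm, the smallest-left-edge rule, and the names `minimum_spanning_tree` and `leftmost_boxes` are assumptions for illustration only.

```python
def minimum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm. `edges` is a list of (weight, u, v) tuples.

    Returns the edges kept in the tree, in increasing weight order.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent links to the set representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


def leftmost_boxes(boxes, n=1):
    """Indices of the n boxes with the smallest left edge (assumed rule)."""
    return sorted(range(len(boxes)), key=lambda i: boxes[i][0])[:n]


# Boxes as (left, top, width, height) with made-up coordinates, and
# made-up edge weights between them as (weight, box_index, box_index).
boxes = [(0, 0, 10, 10), (12, 0, 10, 10), (24, 0, 10, 10), (0, 40, 10, 10)]
edges = [(12.0, 0, 1), (12.0, 1, 2), (24.0, 0, 2), (40.0, 0, 3), (41.8, 1, 3)]

tree = minimum_spanning_tree(len(boxes), edges)
starts = leftmost_boxes(boxes, n=2)
```

In this toy input the tree keeps the two short edges along the top row plus a single long edge down to the lower box; a long edge of that kind would be a natural candidate for a connecting line to break when forming target text regions.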