A method and system for automatically structuring key information of document images
A technology for automatically structuring key information, applied in the field of character recognition. It addresses the limitation on input file types and the inability to produce fully automatic structured output, with the stated aims of reducing interference, improving user experience, and simplifying the operation process.
Examples
Embodiment 1
[0190] The automatic structuring method for document image key information of the present invention is described in detail below according to a specific embodiment and with reference to the accompanying drawings.
[0191] The invention provides an automatic structuring method for document image key information, comprising the following steps:
[0192] S100: Obtain sample image data of the document;
[0193] S300: Perform orientation correction and inclination correction preprocessing on the sample image;
[0194] S400: Use optical character recognition to recognize the text in the sample image, and organize the recognized text line by line;
[0195] S500: Preprocess the text to obtain text data in units of text blocks;
[0196] S600: Combine the text data in units of text blocks with the model dictionary of the text segmentation model, convert each text block into a number sequence, and obtain the mask sequence, segment sequence and label sequence correspondin...
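As an illustration of the conversion described in S600, the following is a minimal sketch, not the patented implementation. It assumes a character-level model dictionary and BERT-style special tokens; the names `build_vocab`, `encode_block`, `[PAD]`, `[UNK]`, `[CLS]` and `[SEP]` are hypothetical and do not appear in the source. The label sequence is left out because its definition is cut off in the excerpt above.

```python
from typing import Dict, List, Tuple

# Hypothetical special tokens; the source does not say which ones the model uses.
PAD, UNK, CLS, SEP = "[PAD]", "[UNK]", "[CLS]", "[SEP]"


def build_vocab(chars: str) -> Dict[str, int]:
    """Build a toy character-level model dictionary (an assumption for this sketch)."""
    vocab = {tok: i for i, tok in enumerate([PAD, UNK, CLS, SEP])}
    for ch in chars:
        # Assign the next free index to characters not seen before.
        vocab.setdefault(ch, len(vocab))
    return vocab


def encode_block(
    text: str, vocab: Dict[str, int], max_len: int
) -> Tuple[List[int], List[int], List[int]]:
    """Convert one text block into (number sequence, mask sequence, segment sequence)."""
    if max_len < 2:
        raise ValueError("max_len must leave room for the two special tokens")
    # Truncate so that the block plus [CLS] and [SEP] fits into max_len.
    tokens = [CLS] + list(text)[: max_len - 2] + [SEP]
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens]
    mask = [1] * len(ids)          # 1 marks real tokens
    padding = max_len - len(ids)
    ids += [vocab[PAD]] * padding  # pad the number sequence to a fixed length
    mask += [0] * padding          # 0 marks padding positions
    segments = [0] * max_len       # a single text block maps to one segment
    return ids, mask, segments


vocab = build_vocab("abc")
ids, mask, segments = encode_block("abz", vocab, max_len=8)
print(ids, mask, segments)
```

Characters missing from the dictionary map to the unknown-token index, and the mask sequence lets a downstream model ignore the padded positions.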
Embodiment 2
[0201] The automatic structuring method for document image key information of the present invention is described in detail below according to a specific embodiment and with reference to the accompanying drawings.
[0202] The invention provides an automatic structuring method for document image key information, comprising the following steps:
[0203] S100: Obtain sample image data of the document; Step S100 includes the following steps:
[0204] S101: Read file data of files in multiple file formats;
[0205] S102: Assign an ID to each page of file data in the file, divide the file into single pages according to these IDs, and then convert each single page into image data.
[0206] S200: Load a general text recognition model, a text segmentation model, a text classification model, a text structure extraction model and their configuration files, which are respectively used for text recognition, text segmentation, text classification and text structure extraction;...
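Steps S101 and S102 can be pictured with the following minimal sketch, given under stated assumptions rather than as the method of the invention. It assumes the file has already been read into a sequence of raw per-page payloads and that a caller-supplied `render` function turns one page into image bytes. `PageImage`, `split_into_page_images`, `render` and the page ID format are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class PageImage:
    page_id: str  # ID assigned to the page (the naming scheme is an assumption)
    image: bytes  # image data produced for that page


def split_into_page_images(
    file_name: str,
    pages: Sequence[bytes],
    render: Callable[[bytes], bytes],
) -> List[PageImage]:
    """Assign an ID to every page, then convert each single page into image data."""
    # Drop the extension so the ID does not depend on the input file format.
    stem = file_name.rsplit(".", 1)[0]
    return [
        PageImage(page_id=f"{stem}-p{index:04d}", image=render(raw))
        for index, raw in enumerate(pages, start=1)
    ]


# Usage with a stand-in renderer; a real one would rasterize the page.
result = split_into_page_images("contract.pdf", [b"a", b"b"], lambda raw: raw.upper())
print([page.page_id for page in result])
```

Passing the renderer in as a parameter keeps the page-splitting logic independent of the input format, which matches the stated goal of accepting files in multiple file formats.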
Embodiment 3
[0215] The automatic structuring method for document image key information of the present invention is described in detail below according to a specific embodiment and with reference to the accompanying drawings.
[0216] The invention provides an automatic structuring method for document image key information, comprising the following steps:
[0217] S100: Obtain sample image data of the document; Step S100 includes the following steps:
[0218] S101: Read file data of files in multiple file formats;
[0219] S102: Assign an ID to each page of file data in the file, divide the file into single pages according to these IDs, and then convert each single page into image data.
[0220] S200: Load a general text recognition model, a text segmentation model, a text classification model, a text structure extraction model and their configuration files, which are respectively used for text recognition, text segmentation, text classification and text structure extraction;...