
A text extraction method, device and electronic equipment for an image document

A document-image technology applied in the field of image processing. It addresses problems such as confusion of structural information, avoiding extraction confusion and improving recognition ability.

Active Publication Date: 2021-03-02
Beijing Academy of Artificial Intelligence (北京智源人工智能研究院)
Cites: 5 | Cited by: 0

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a text extraction method, device, and electronic equipment for image documents that can effectively solve the confusion of structural information caused by existing document extraction methods.


Embodiment Construction

[0032] To aid understanding, the above technical solution is described in detail below with reference to the accompanying drawings and specific embodiments.

[0033] Referring to Figure 1, in some embodiments, a text extraction method for an image document is provided, comprising:

[0034] Step S101, recognizing the image document using an optical character recognition model;

[0035] Step S102, generating a combined vector according to the recognized information;

[0036] Step S103, inputting the combined vector into the text extraction model to perform text extraction and obtain structured information;

[0037] Step S104, training and optimizing the optical character recognition model and the text extraction model according to a joint loss function, where the joint loss function includes a loss for image document recognition and a loss for text extraction.
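Steps S102 and S104 can be illustrated with a minimal sketch. The text available here does not say how the combined vector is built or how the two losses are weighted, so everything specific below is an assumption made for illustration: the combined vector is taken to be a text embedding concatenated with page-normalized bounding-box coordinates, both losses are taken to be cross-entropy, and the hypothetical `weight` parameter balances them. The function names are likewise hypothetical.

```python
import math
from typing import List, Sequence


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target])


def combined_vector(text_emb: Sequence[float], box: Sequence[float],
                    page_w: float, page_h: float) -> List[float]:
    """Assumed form of step S102: concatenate a text embedding with
    bounding-box coordinates (x0, y0, x1, y1) normalized by the page size."""
    x0, y0, x1, y1 = box
    return list(text_emb) + [x0 / page_w, y0 / page_h, x1 / page_w, y1 / page_h]


def joint_loss(ocr_probs: Sequence[Sequence[float]], ocr_targets: Sequence[int],
               ext_probs: Sequence[Sequence[float]], ext_targets: Sequence[int],
               weight: float = 1.0) -> float:
    """Assumed form of step S104: mean recognition loss plus a weighted
    mean extraction loss. The weighting scheme is not taken from the source."""
    l_ocr = sum(cross_entropy(p, t) for p, t in zip(ocr_probs, ocr_targets)) / len(ocr_targets)
    l_ext = sum(cross_entropy(p, t) for p, t in zip(ext_probs, ext_targets)) / len(ext_targets)
    return l_ocr + weight * l_ext


# Usage: one recognized token with a 2-dimensional embedding on a 200 x 400 page.
vec = combined_vector([0.1, 0.2], (10, 20, 110, 40), page_w=200, page_h=400)
loss = joint_loss([[0.5, 0.5]], [0], [[0.25, 0.75]], [1], weight=2.0)
```

Because both terms feed a single scalar, gradients from the extraction loss can also reach the recognition model when the two are trained together, which is the point of optimizing them jointly rather than in separate stages.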

[0038] Specifically, in step S101, the recognition of the ...


Abstract

The invention discloses a text extraction method, device, and electronic equipment for an image document. The method includes: recognizing the image document with an optical character recognition model; generating a combined vector according to the recognized information; and inputting the combined vector into a text extraction model to perform text extraction and obtain structured information. The optical character recognition model and the text extraction model are trained and optimized according to a joint loss function, which includes a loss for image document recognition and a loss for text extraction. This method can effectively solve the confusion of structural information caused by existing document extraction methods.

Description

Technical Field

[0001] The present invention relates to the technical field of image processing, in particular to a text extraction method, device, and electronic equipment for image documents.

Background

[0002] Document extraction can be divided into two parts: information extraction and document structure understanding. Information extraction based on language models has reached a relatively mature level. Commonly used frameworks include word2vec+BiLSTM+CRF and pre-trained models such as BERT, GPT, and ERNIE. Large-scale pre-trained language models can effectively capture the semantic information contained in text through self-supervised tasks in the pre-training stage, and can effectively improve model performance after fine-tuning on downstream tasks. However, existing pre-trained language models mainly target the single modality of text, while ignoring the visual structure information of the natural alignment between the document itself ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/32, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06V20/635, G06V30/10, G06N3/045, G06F18/241
Inventors: 黄园园, 钱泓锦, 刘占亮, 窦志成
Owner: Beijing Academy of Artificial Intelligence (北京智源人工智能研究院)