Text extraction method and device for image document and electronic equipment

A document image technology, applied in the field of image processing, that addresses structural information confusion, avoiding confusion during extraction and improving recognition ability.

Active Publication Date: 2020-12-04

北京智源人工智能研究院

5 Cites, 2 Cited by

Problems solved by technology

[0003] The present invention provides a text extraction method, device, and electronic equipment for image documents, which can effectively solve the structural information confusion caused by existing document extraction methods.

Embodiment Construction

[0032] For a better understanding of the above technical solution, it is described in detail below with reference to the accompanying drawings and specific embodiments.

[0033] Referring to Figure 1, in some embodiments, a text extraction method for an image document is provided, comprising:

[0034] Step S101, recognizing the image document using an optical character recognition model;

[0035] Step S102, generating a combined vector from the recognized information;

[0036] Step S103, inputting the combined vector into a text extraction model to perform text extraction and obtain structured information;

[0037] Step S104, training and optimizing the optical character recognition model and the text extraction model according to a joint loss function, wherein the joint loss function includes a loss for image document recognition and a loss for text extraction.
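Steps S102 and S104 can be illustrated with a minimal sketch. The available text does not specify how the combined vector is built or how the two losses are merged, so everything concrete here is an assumption made for illustration: the `OcrToken` structure, concatenating a text embedding with normalized bounding-box coordinates, the use of cross-entropy for both terms, and the `weight` parameter that balances them.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class OcrToken:
    """One recognized text span (hypothetical structure, not from the patent)."""
    text_embedding: List[float]  # embedding of the recognized text
    box: Sequence[float]         # (x0, y0, x1, y1), normalized to [0, 1]


def combined_vector(token: OcrToken) -> List[float]:
    """Step S102, assumed form: concatenate text features and layout features."""
    return list(token.text_embedding) + list(token.box)


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Negative log-likelihood of the target class, clamped to avoid log(0)."""
    return -math.log(max(probs[target], 1e-12))


def joint_loss(
    ocr_probs: Sequence[float],
    ocr_target: int,
    extract_probs: Sequence[float],
    extract_target: int,
    weight: float = 1.0,
) -> float:
    """Step S104, assumed form: recognition loss plus weighted extraction loss."""
    recognition = cross_entropy(ocr_probs, ocr_target)
    extraction = cross_entropy(extract_probs, extract_target)
    return recognition + weight * extraction


if __name__ == "__main__":
    token = OcrToken(text_embedding=[0.1, 0.2], box=(0.0, 0.1, 0.5, 0.2))
    print(combined_vector(token))
    print(joint_loss([0.5, 0.5], 0, [0.25, 0.75], 1, weight=2.0))
```

Because a single scalar combines both terms, gradients from the extraction loss can also reach the recognition model when the two are trained together, which is the practical point of optimizing them jointly rather than in separate stages.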

[0038] Specifically, in step S101, the recognition of the ...

Abstract

The invention discloses a text extraction method and device for an image document, and electronic equipment. The method comprises: recognizing the image document with an optical character recognition model; generating a combined vector from the recognized information; and inputting the combined vector into a text extraction model to obtain structured information, wherein the optical character recognition model and the text extraction model are trained and optimized according to a joint loss function that comprises a loss for image document recognition and a loss for text extraction. The method can effectively solve the structural information confusion caused by existing document extraction methods.

Description

Technical field

[0001] The present invention relates to the technical field of image processing, and in particular to a text extraction method, device, and electronic equipment for image documents.

Background

[0002] Document extraction can be divided into two parts: information extraction and document structure understanding. Information extraction based on language models is relatively mature; commonly used frameworks include word2vec+BiLSTM+CRF and pre-trained models such as BERT, GPT, and ERNIE. Large-scale pre-trained language models can effectively capture the semantic information contained in text through self-supervised tasks in the pre-training stage, and can improve model performance after fine-tuning on downstream tasks. However, existing pre-trained language models mainly target the single modality of text, while ignoring the visual structure information of the natural alignment between the document itself ...

Application Information

IPC(8): G06K9/32, G06K9/62, G06N3/04, G06N3/08

CPC: G06N3/08, G06V20/635, G06V30/10, G06N3/045, G06F18/241

Inventor: 黄园园, 钱泓锦, 刘占亮, 窦志成

Owner 北京智源人工智能研究院
