Information extraction method and device

An information extraction technology, applied in the field of data processing, that addresses the incomplete and inaccurate key information produced by existing extraction methods.

Pending Publication Date: 2022-01-21
BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD

AI Technical Summary

Problems solved by technology

Most existing information extraction methods operate on plain text. Documents such as PDFs, however, often contain complex and varied layouts such as columns, blocks, and nested tables. Relying on plain-text extraction alone therefore yields incomplete and inaccurate key information.



Embodiment Construction

[0030] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the present application can be implemented in many ways other than those described here, and those skilled in the art can make similar extensions without departing from the spirit of the present application. Therefore, the present application is not limited by the specific implementations disclosed below.

[0031] Terms used in one or more embodiments of the present application are for the purpose of describing specific embodiments only, and are not intended to limit the one or more embodiments of the present application. As used in one or more embodiments of this application and the appended claims, the singular forms "a", "said", and "the" are also intended to include the plural forms unless the context clearly dictates otherwise. It should also be understood that the term "and/or" used in one or more embodiments of th...


Abstract

The invention provides an information extraction method and device. The information extraction method comprises the following steps: obtaining a to-be-processed document; identifying the text and the tables in the to-be-processed document; performing key information identification on the text based on a keyword identification algorithm to determine the document text key information; parsing the characters in the cells of each table to obtain the document table key information; and fusing the document text key information with the document table key information to determine the information extraction result of the to-be-processed document. With this scheme, key information can be extracted from both the text and the tables of a document. Within the same to-be-processed document, the text key information and the table key information may be related, and either one taken alone may be incomplete. Fusing the two to obtain the information extraction result therefore improves the efficiency of information extraction for the to-be-processed document, as well as the completeness and accuracy of the extracted information.
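The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the block-based document representation, the `KEYWORDS` list, the "Keyword: value" regular expression standing in for the keyword identification algorithm, the two-column table layout, and the rule that text values win on conflict are all assumptions made for illustration.

```python
import re

# Hypothetical field keywords; the source does not list any.
KEYWORDS = ["Party A", "Party B", "Amount", "Date"]


def split_blocks(document):
    """Separate text blocks from table blocks (assumed pre-parsed input)."""
    texts = [b["content"] for b in document if b["type"] == "text"]
    tables = [b["rows"] for b in document if b["type"] == "table"]
    return texts, tables


def extract_text_info(texts):
    """Stand-in for keyword identification: capture 'Keyword: value' pairs."""
    info = {}
    for text in texts:
        for kw in KEYWORDS:
            # Value runs until a semicolon or a line break.
            m = re.search(re.escape(kw) + r"\s*[:：]\s*([^;\n]+)", text)
            if m and kw not in info:
                info[kw] = m.group(1).strip()
    return info


def extract_table_info(tables):
    """Read cells row by row; assume column 0 is the key, column 1 the value."""
    info = {}
    for rows in tables:
        for row in rows:
            if len(row) >= 2 and row[0].strip() in KEYWORDS:
                info[row[0].strip()] = row[1].strip()
    return info


def fuse(text_info, table_info):
    """Merge both sources; table values only fill fields the text lacks."""
    result = dict(text_info)
    for key, value in table_info.items():
        result.setdefault(key, value)
    return result


def extract(document):
    """Run the whole pipeline on one to-be-processed document."""
    texts, tables = split_blocks(document)
    return fuse(extract_text_info(texts), extract_table_info(tables))


# Hypothetical sample document: one text block and one key-value table.
SAMPLE = [
    {"type": "text",
     "content": "Party A: Acme Ltd; Party B: Beta Inc\nDate: 2021-10-01"},
    {"type": "table",
     "rows": [["Item", "Value"], ["Amount", "5000"], ["Date", "2021-10-01"]]},
]

if __name__ == "__main__":
    print(extract(SAMPLE))
```

In this sample the amount appears only in the table and the parties appear only in the text, so neither source alone gives the full record. That is the situation the abstract cites as the reason for fusing the two.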

Description

Technical field

[0001] The present application relates to data processing technology in the technical field of artificial intelligence, and in particular to an information extraction method. The present application also relates to an information extraction device, a computing device, and a computer-readable storage medium.

Background

[0002] Artificial intelligence (AI) refers to the ability of an engineered (that is, designed and manufactured) system to perceive the environment, as well as the ability to acquire, process, apply, and represent knowledge. Natural language processing refers to the use of computers to process the form, sound, and meaning of natural language, that is, the input, output, recognition, analysis, understanding, and generation of characters, words, sentences, and texts. Realizing information exchange between human and computer is an important issue of common concern in artificial intelligence, compu...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/332, G06F40/279
CPC: G06F16/3329, G06F40/279
Inventor: 弓源, 李长亮
Owner: BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD