
Table recognition method and system fusing multiple text features and geometrical information

A table recognition technology using geometric information, applied in the field of image recognition. It addresses the failure of existing methods to combine all available features, the resulting loss of information, and the increased difficulty of table recognition.

Active Publication Date: 2020-10-30
SHANGHAI JIAO TONG UNIV

AI Technical Summary

Problems solved by technology

[0005] (1) Many tables omit the frame lines on both sides for aesthetic reasons, and the popular three-line table contains only two frame lines at the header and one frame line at the bottom of the table. This poses a major challenge for methods that rely on frame lines to identify table structure.
[0006] (2) The headers of some tables contain multiple merged cells to label data of different categories or time periods, which increases the difficulty of table recognition.
[0007] Most existing methods use only image information or only location information and do not combine all available features, so part of the original information is lost.



Examples


Embodiment Construction

[0077] The present invention will be described in detail below in conjunction with specific embodiments. The following examples will help those skilled in the art to further understand the present invention, but do not limit it in any form. Those skilled in the art can make changes and improvements without departing from the concept of the present invention, and all of these fall within its scope of protection.

[0078] According to the present invention, a table recognition method that integrates multiple text features and geometric information includes:

[0079] Data processing step: obtain a picture of the table area, then perform OCR and straight-line recognition on it separately to obtain key feature information;
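As a rough illustration of the data processing step, the sketch below turns OCR text boxes and detected line segments into per-box and pairwise features. The source does not list which features are used, so the `TextBox` and `Line` types, the six features in `box_features`, and the `separators_between` helper are all hypothetical choices. OCR and line detection are assumed to have run already and are not shown.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """One OCR result: recognized text plus its bounding box in pixels."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class Line:
    """One detected straight line segment, endpoints in pixels."""
    x0: float
    y0: float
    x1: float
    y1: float


def box_features(box, img_w, img_h):
    """Geometric and text features for one box (illustrative feature set)."""
    text = box.text.strip()
    digits = sum(ch.isdigit() for ch in text)
    return [
        (box.x0 + box.x1) / 2 / img_w,        # normalized center x
        (box.y0 + box.y1) / 2 / img_h,        # normalized center y
        (box.x1 - box.x0) / img_w,            # normalized width
        (box.y1 - box.y0) / img_h,            # normalized height
        float(len(text)),                     # text length
        digits / len(text) if text else 0.0,  # share of digit characters
    ]


def separators_between(a, b, lines, tol=2.0):
    """Count lines lying between the centers of boxes a and b.

    Returns (horizontal_count, vertical_count). A line counts as horizontal
    or vertical when its endpoints differ by at most `tol` pixels on the
    other axis.
    """
    ax, ay = (a.x0 + a.x1) / 2, (a.y0 + a.y1) / 2
    bx, by = (b.x0 + b.x1) / 2, (b.y0 + b.y1) / 2
    horizontal = vertical = 0
    for ln in lines:
        if abs(ln.y1 - ln.y0) <= tol:
            y = (ln.y0 + ln.y1) / 2
            if min(ay, by) < y < max(ay, by):
                horizontal += 1
        elif abs(ln.x1 - ln.x0) <= tol:
            x = (ln.x0 + ln.x1) / 2
            if min(ax, bx) < x < max(ax, bx):
                vertical += 1
    return horizontal, vertical


# Toy three-line-table fragment: a header rule at y = 40 in a 200 x 100 image.
year = TextBox("Year", 10, 10, 50, 30)
y2019 = TextBox("2019", 10, 50, 50, 70)
sales = TextBox("Sales", 100, 10, 160, 30)
rules = [Line(0, 40, 200, 40)]

print(separators_between(year, y2019, rules))  # (1, 0)
print(separators_between(year, sales, rules))  # (0, 0)
```

In this toy case the header rule separates "Year" from "2019" but not from "Sales", which is the kind of geometric cue that remains available even when the side frame lines are omitted.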

[0080] Graph convolutional neural network training step: according to the obtained key feature information, perform graph convolutional neural network training to build a tabl...
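The excerpt is cut off before it describes the network itself. As a minimal sketch of what a single graph convolution computes, the code below assumes the widely used propagation rule ReLU(D^(-1/2) (A + I) D^(-1/2) H W), which the source does not confirm. Nodes stand for text boxes, `adj` marks which boxes are treated as neighbours, `feats` holds one feature row per box, and `weight` is a hypothetical weight matrix. Only the forward pass is shown; training is omitted.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalized_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    a_norm = normalized_adjacency(adj)
    out = matmul(matmul(a_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Three boxes: 0 and 1 are neighbours, 2 is isolated.
adj = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]
feats = [[1.0, 0.0],
         [3.0, 2.0],
         [5.0, -4.0]]
weight = [[1.0, -1.0],
          [1.0, -1.0]]

print(gcn_layer(adj, feats, weight))  # [[3.0, 0.0], [3.0, 0.0], [1.0, 0.0]]
```

Boxes 0 and 1 end up with identical outputs because each averages its own features with its neighbour's, while the isolated box 2 only sees itself.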


Abstract

The invention provides a table recognition method fusing multiple text features and geometric information. The method comprises: a data processing step of obtaining a picture of a table area and carrying out OCR and straight-line recognition on it separately to obtain key feature information; a graph convolutional neural network training step of training a graph convolutional neural network on the obtained key feature information to construct a table structure recognition model; and a table recognition step of recognizing the structure of a table in picture format with the constructed model. The method improves on the diversity of the data used and on how features are extracted from that data. This effectively improves the accuracy of table recognition and yields a more accurate table structure reconstruction. Compared with existing table recognition mechanisms based on traditional rules and with traditional picture-based deep learning methods, the effect is greatly improved.
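The abstract does not say what form the model output takes. Assuming, purely for illustration, that the recognition step yields pairwise "same row" (or "same column") decisions between text boxes, one simple way to turn them into a table structure is to merge connected boxes with a union-find. The function name `group_by_relation` and the pair format are hypothetical.

```python
def group_by_relation(n_cells, same_pairs):
    """Group cell indices that are linked by predicted 'same row' pairs.

    n_cells: number of text boxes, indexed 0..n_cells-1.
    same_pairs: iterable of (i, j) index pairs predicted to share a row.
    Returns a sorted list of groups, each a sorted list of indices.
    """
    parent = list(range(n_cells))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in same_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for i in range(n_cells):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Five boxes; boxes 0-2 are predicted to share one row, boxes 3-4 another.
print(group_by_relation(5, [(0, 1), (1, 2), (3, 4)]))  # [[0, 1, 2], [3, 4]]
```

Running the same grouping on "same column" pairs would give the columns, and the two groupings together assign each box a row and column index.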

Description

Technical field

[0001] The invention relates to the technical field of image recognition, in particular to a table recognition method and system that integrates multiple text features and geometric information.

Background technique

[0002] In the information age, how to quickly obtain information and extract key knowledge from massive and complex information is an important issue. As a form of structured data, tables are simple and standardized. For users, this standardization makes it relatively easy to look up and compare information; for computers, once the digital table structure is available, the required data can be extracted quickly. However, many tables are published as images and thus lose their structural information. How to recover the table structure from a table in image format therefore becomes an important issue.

[0003] Existing table recognition technologies include traditional rule-based method...


Application Information

IPC(8): G06K9/00, G06K9/32, G06N3/04
CPC: G06V30/412, G06V30/413, G06V20/62, G06V30/10, G06N3/045
Inventor: 李一仁, 黄征, 周异, 陈凯
Owner: SHANGHAI JIAO TONG UNIV