Bill image layout analysis method and device

A bill image layout analysis method and related technology, applied to image data processing, graphics and image conversion, and image stitching. It addresses the high labor cost, proliferating rules, and complex upgrade and maintenance of rule-based approaches, reducing workload while improving accuracy and operating efficiency.

Active Publication Date: 2021-11-16
北京玖安天下科技有限公司

AI Technical Summary

Problems solved by technology

Because bill images obtained in real business scenarios usually exhibit problems such as tilt, occlusion, blur, and reflections, a rule-based method requires its rules to be adjusted and maintained constantly. This consumes substantial labor, and the rule set keeps growing: conflicts between classes of rules arise easily, and upgrade and maintenance become increasingly complicated. It also makes it difficult to keep improving the efficiency of bill image layout recognition.


Examples

Embodiment 1

[0023] This embodiment provides specific implementation steps of a bill image layout analysis method. As an optional configuration, the hardware and software platform used in this embodiment includes a server with a 3.0 GHz central processing unit, an NVIDIA 1080 GPU, and 16 GB of memory. An end-to-end OCR layout analysis program, written in advance in Python, recognizes the text boxes in a bill image together with the position and content of the text.

[0024] As shown in Figure 1, the software used in this embodiment implements a bill image layout analysis method that includes the following steps:

[0025] s1: Prepare the training layout samples for model training and label them manually. The training layout samples can first be analyzed by the end-to-end OCR layout analysis program and then labeled manually, or they can be labeled manually from the start. Preferably, step s1 further includes: adopting a data augme...

Embodiment 2

[0034] This embodiment provides a specific implementation of a bill image layout analysis device, based on the bill image layout analysis method described in Embodiment 1.

[0035] As shown in Figure 2, the bill image layout analysis device includes:

[0036] A training layout sample labeling module, used to label the training samples. Preferably, this module is also used to apply a data augmentation strategy to the training layout samples, where the strategy includes one or more of the following: 1) randomly perturbing the coordinate points of the detection boxes in a training layout sample; 2) randomly discarding one or more detection boxes in a training layout sample; 3) randomly splitting a detection box and randomly splitting the text inside it; 4) randomly replacing the text content in a detection box;

[0037] The text box feature encoding module i...
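
The four augmentation strategies listed for the training layout sample labeling module can be sketched in Python as follows. This is only an illustrative sketch, not the patent's implementation: the `Box` type, the function names, the shift range, and the probabilities are all assumptions chosen for the example, and splitting a box in proportion to the character count is one possible reading of strategy 3.

```python
import random
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Box:
    """Hypothetical labeled detection box: corner coordinates, text, label."""
    x1: float
    y1: float
    x2: float
    y2: float
    text: str
    label: str


def perturb_box(box: Box, rng: random.Random, max_shift: float = 2.0) -> Box:
    """Strategy 1: randomly shift each coordinate by at most max_shift."""
    def jitter() -> float:
        return rng.uniform(-max_shift, max_shift)
    return replace(box, x1=box.x1 + jitter(), y1=box.y1 + jitter(),
                   x2=box.x2 + jitter(), y2=box.y2 + jitter())


def drop_boxes(boxes: List[Box], rng: random.Random, p: float = 0.1) -> List[Box]:
    """Strategy 2: discard each box with probability p, keeping at least one."""
    kept = [b for b in boxes if rng.random() >= p]
    return kept or boxes[:1]


def split_box(box: Box, rng: random.Random) -> List[Box]:
    """Strategy 3: cut a box in two and split its text at the same point."""
    if len(box.text) < 2:
        return [box]
    k = rng.randint(1, len(box.text) - 1)  # split index inside the text
    cut_x = box.x1 + (box.x2 - box.x1) * k / len(box.text)
    return [replace(box, x2=cut_x, text=box.text[:k]),
            replace(box, x1=cut_x, text=box.text[k:])]


def replace_text(box: Box, rng: random.Random, vocab: List[str]) -> Box:
    """Strategy 4: swap the text for a random entry from an assumed vocabulary."""
    return replace(box, text=rng.choice(vocab))


def augment(boxes: List[Box], rng: random.Random, vocab: List[str]) -> List[Box]:
    """Apply the four strategies with assumed probabilities; labels are kept."""
    out: List[Box] = []
    for box in drop_boxes(boxes, rng):
        box = perturb_box(box, rng)
        if rng.random() < 0.2:
            box = replace_text(box, rng, vocab)
        if rng.random() < 0.2:
            out.extend(split_box(box, rng))
        else:
            out.append(box)
    return out
```

Passing a seeded `random.Random` instance makes an augmentation run reproducible, which may help when comparing models trained on differently augmented samples.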

Embodiment 3

[0042] This embodiment provides a specific implementation of an electronic device, based on the bill image layout analysis method described in Embodiment 1.

[0043] As shown in Figure 4, the electronic device includes a processor 401, a communications interface 402, a memory 403, and a communication bus 404, where the processor 401, the communications interface 402, and the memory 403 communicate with one another over the communication bus 404. The processor 401 can call a computer program stored in the memory 403 and runnable on the processor 401 to execute the methods provided by the above embodiments, for example: preparing training layout samples for model training and assisting with manual labeling; performing feature encoding on the text boxes in the training layout samples; splicing the coordinate features and text features of each text box to form...

Abstract

A bill image layout analysis method comprises the following steps: preparing training layout samples for model training and annotating them manually; performing feature encoding on the text boxes in a training layout sample; splicing the coordinate features and text features of each text box to form the spliced features of that text box; splicing the spliced features of a plurality of candidate boxes in a training layout sample to form the feature sequence vector of that sample; training the model to obtain a layout analysis model; and inputting the feature sequence vector of a layout to be analyzed into the layout analysis model to obtain its analysis result. The method uses machine learning to achieve end-to-end training and processing. Compared with traditional feature engineering, it greatly reduces manual workload, improves the operating efficiency of the model framework, and significantly improves the accuracy of bill image layout analysis.
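
The feature splicing steps in the abstract can be illustrated with a small Python sketch. It is a sketch under stated assumptions rather than the patented encoder: `text_features` is a hypothetical stand-in (a hashed character histogram) for whatever text encoder the method actually uses, and the coordinate normalization, the feature dimension `dim`, and the zero padding to `max_boxes` are choices made only for this example.

```python
from typing import List, Sequence, Tuple

BoxCoords = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def coord_features(box: BoxCoords, width: float, height: float) -> List[float]:
    """Coordinate features: box corners scaled by the image size (assumed)."""
    x1, y1, x2, y2 = box
    return [x1 / width, y1 / height, x2 / width, y2 / height]


def text_features(text: str, dim: int = 8) -> List[float]:
    """Hypothetical text encoder: normalized hashed character histogram."""
    vec = [0.0] * dim
    for ch in text:
        vec[ord(ch) % dim] += 1.0
    n = len(text) or 1
    return [v / n for v in vec]


def splice_box(box: BoxCoords, text: str, width: float, height: float,
               dim: int = 8) -> List[float]:
    """Splice (concatenate) coordinate and text features of one text box."""
    return coord_features(box, width, height) + text_features(text, dim)


def layout_sequence(boxes: Sequence[Tuple[BoxCoords, str]], width: float,
                    height: float, dim: int = 8,
                    max_boxes: int = 4) -> List[List[float]]:
    """Stack per-box spliced features into one fixed-length sequence.

    Truncation and zero padding to max_boxes are assumptions of this sketch.
    """
    seq = [splice_box(b, t, width, height, dim) for b, t in boxes[:max_boxes]]
    pad = [0.0] * (4 + dim)
    seq += [pad[:] for _ in range(max_boxes - len(seq))]
    return seq


# Example: two text boxes on a 200 x 100 pixel bill image.
sequence = layout_sequence(
    [((10, 20, 110, 40), "Total"), ((10, 50, 60, 70), "42.00")],
    width=200, height=100,
)
```

Each row of `sequence` holds four coordinate features followed by `dim` text features, so a sequence model can see both where a text box sits and what it says.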

Description

Technical field

[0001] The invention belongs to the technical field of computer applications and relates to an image recognition processing method and device, in particular to a bill image layout analysis method and device.

Background

[0002] In daily life, a large number of bill images need structured storage, such as ID cards, VAT invoices, train tickets, and air tickets. When these bills are digitized, factors such as personnel, equipment, and scene introduce various problems into the electronic images, such as tilt, occlusion, blur, and reflections. This poses two challenges for the subsequent data structuring: OCR recognition and layout analysis. At present, OCR recognition technology is relatively mature, but layout analysis still lacks an effective solution.

[0003] Existing layout analysis methods usually use feature engin...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/62, G06T3/40
CPC: G06T3/4038, G06T2200/32, G06F18/214, Y02P90/30
Inventors: 丁大强, 李蒙阳, 石海涛, 胡安裕
Owner: 北京玖安天下科技有限公司