Degraded document image binarization method and system based on deep learning

A deep learning technology for document images, applied in the field of computer vision, that addresses problems such as poor model generalization ability and achieves strong model generalization, good handling of local noise, and a good binarization effect.

Inactive Publication Date: 2019-09-20
CHONGQING UNIV +1
4 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

Compared with traditional methods, learning-based methods rely heavily on training data, and once the features have been selected, the model generalizes poorly.

Embodiment Construction

[0035] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention.

[0036] In the description of the present invention, unless otherwise specified and limited, the terms "installation" and "connection" should be understood in a broad sense: for example, a connection may be a mechanical connection, an electrical connection, or an internal communication between two elements, and it may be direct or indirect through an intermediary. Those skilled in the art can understand the specific meanings of the above terms according to the specific situation.

[0037] Before the bi...

Abstract

The invention discloses a degraded document image binarization method and system based on deep learning. The system network comprises a first operation module, a second operation module, and a binary classifier. The first operation module uses a shallow network to reduce the degraded document image to feature maps at different scales with progressively lower resolution, so that changes in text pixels can be predicted at different feature levels; deconvolution then combines the detail information of the previous layer in a coarse-to-fine manner to reconstruct the foreground image step by step. The second operation module cascades a deep network after the shallow network structure for a second round of training, and a binary classifier at the end of the network structure distinguishes background noise from foreground characters. This optimizes the final binarization result and greatly improves the precision and accuracy of degraded document binarization.
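
The data flow described in the abstract (multi-scale reduction, coarse-to-fine reconstruction, then a per-pixel binary decision) can be illustrated with a minimal sketch. This is not the patented network: it assumes fixed operators in place of learned ones, with 2x2 average pooling standing in for the shallow network's downsampling layers and nearest-neighbour upsampling standing in for deconvolution, much like a Laplacian-style pyramid. It also reconstructs its own input rather than a learned foreground. All function names (`downsample`, `upsample`, `decompose`, `reconstruct`, `classify`) and the 0.5 cutoff are hypothetical.

```python
def downsample(img):
    """2x2 average pooling; halves each dimension (assumes even sizes)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def upsample(img):
    """Nearest-neighbour 2x upsampling (stand-in for a learned deconvolution)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in (0, 1)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                   # repeat each row twice
    return out


def decompose(img, levels):
    """Return (coarsest map, detail maps ordered from coarse to fine)."""
    details = []
    current = img
    for _ in range(levels):
        coarse = downsample(current)
        up = upsample(coarse)
        # Detail lost by going one level coarser.
        details.append([[c - u for c, u in zip(c_row, u_row)]
                        for c_row, u_row in zip(current, up)])
        current = coarse
    return current, details[::-1]


def reconstruct(coarsest, details):
    """Coarse-to-fine: upsample, then add back the detail of the finer level."""
    current = coarsest
    for detail in details:
        up = upsample(current)
        current = [[u + d for u, d in zip(u_row, d_row)]
                   for u_row, d_row in zip(up, detail)]
    return current


def classify(score_map, cutoff=0.5):
    """Per-pixel binary decision: 1 = foreground text, 0 = background."""
    return [[1 if v >= cutoff else 0 for v in row] for row in score_map]
```

Calling `decompose(img, 2)` on a 4x4 score map yields a 1x1 coarsest map plus two detail maps, and `classify(reconstruct(...))` turns the rebuilt map into a binary image. In the network the abstract describes, each of these fixed steps would instead be a trained layer.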

Description

Technical field

[0001] The invention relates to the technical field of computer vision, in particular to a method and system for binarizing degraded document images based on deep learning.

Background

[0002] As paper documents (such as books and receipts) age, they develop problems such as blurred characters, background color leakage, ink smearing, and creases. Binarizing such degraded document images is very difficult. Because of aging and inadequate storage and maintenance conditions, historical records have undergone severe degradation, including uneven intensity, complex backgrounds, and bleed-through. Figure 1 shows some difficult examples of binarization of degraded document images, in which text and non-text regions are hard to tell apart.

[0003] In order to solve the problem of document degradation, there is currently an image threshold calculation method, which includes a histogram-based global thr...
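
Paragraph [0003] is cut off, so the specific threshold method it goes on to describe is not known. As an illustration of the general category only, the sketch below implements Otsu's method, a well-known histogram-based global threshold, using only the Python standard library. The function names and the convention that dark pixels are text (output 1) are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    pixels: iterable of integer gray levels in [0, levels).
    Pixels <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of gray levels in the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break             # bright class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, threshold):
    """Assumed convention: dark pixels (<= threshold) are text and map to 1."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]
```

A typical call flattens the image first, for example `t = otsu_threshold([p for row in image for p in row])` followed by `binarize(image, t)`. Because a single threshold is applied to the whole page, a method of this kind struggles with the uneven intensity and complex backgrounds described in [0002].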

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/46, G06K9/62
CPC: G06V30/40, G06V10/454, G06F18/24
Inventor: 文静, 唐倩, 王翊, 刘学军, 向秩仪
Owner: CHONGQING UNIV