File image compressing method based on file image content analyzing and characteristic extracting

A document image compression technology based on content analysis, applied in image coding, image data processing, instruments, etc. It addresses the problems that existing methods do not consider document image content and structure, are complicated, and have an operation process that is not intuitive enough. The algorithm is simple and practical, avoids gray scale divergence, and saves storage space.

Inactive Publication Date: 2005-10-26
BEIHANG UNIV
Cites: 0; Cited by: 19

AI Technical Summary

Problems solved by technology

Zhu Qingsheng, Lin Jie, and Zhang Min proposed a document image compression method based on layer segmentation in "Layer Segmentation-Based Document Image Compression", Computer Engineering and Design, 2004, Vol. 25, No. 8. That method also does not consider the content and structural features of the document image. Instead, it uses multi-scale two-color clustering to segment the document image into a foreground image layer, a background image layer and a mark image layer, and compresses each layer separately. It must compute the gray value of every pixel to decide which layer the pixel belongs to, which requires a large amount of computation. During segmentation the image is divided into blocks of different sizes several times, and multiple iterations are also required, so the method is complicated and its operation process is not intuitive enough.


Examples


Embodiment Construction

[0014] As shown in Figure 1, the invention consists of four steps: document image preprocessing, document image segmentation, text part compression and image part compression. The preprocessing step performs content analysis on the original document image and extracts its feature information, including the position information and pixel gray value information of the text, images and marks attached to the document, etc. The segmentation step divides the original document image into a text part and an image part according to the feature information extracted during preprocessing. The text part and the image part are then compressed separately by the text compression and image compression steps, and the result serves as the compression result of the original document image.
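The segmentation step can be illustrated with a minimal sketch. It assumes that preprocessing has already produced a row range for the picture region; the function name `split_by_rows`, the `picture_rows` parameter and the row-only split are hypothetical simplifications and are not taken from the patent text.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of gray values, 0 (black) to 255 (white)


def split_by_rows(image: Image, picture_rows: Tuple[int, int]) -> Tuple[Image, Image]:
    """Split a page into (text_part, picture_part) using a row range.

    picture_rows is a half-open (top, bottom) range that is assumed to come
    from the preprocessing step. Rows inside the range go to the picture
    part; all other rows go to the text part. Pixel values are copied
    unchanged, so the gray information is not altered by the split.
    """
    top, bottom = picture_rows
    picture_part = [row[:] for row in image[top:bottom]]
    text_part = [row[:] for row in image[:top]] + [row[:] for row in image[bottom:]]
    return text_part, picture_part


page = [
    [255, 0, 255, 0],      # text-like row (high contrast)
    [120, 130, 140, 150],  # picture-like rows (smooth gray levels)
    [125, 135, 145, 155],
    [255, 255, 0, 0],      # text-like row
]
text_part, picture_part = split_by_rows(page, (1, 3))
```

Each part could then be handed to its own compressor, which is the point of separating them.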

[0015] Figure 2 shows, taking one document image as an example, the grayscale projection curve on the image boundary. In Figure 2, the abscissa is ...
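A grayscale projection curve of this kind might be computed as in the following sketch. It assumes the projection is the mean gray value of each row, projected onto the vertical image boundary, and that content boundaries show up as large jumps between neighbouring values; the names `row_projection`, `find_edges` and the `min_jump` threshold are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of gray values, 0 (black) to 255 (white)


def row_projection(image: Image) -> List[float]:
    """Mean gray value of each row, i.e. a projection onto the vertical boundary."""
    return [sum(row) / len(row) for row in image]


def find_edges(profile: List[float], min_jump: float) -> List[int]:
    """Indices i where the profile changes by at least min_jump from i-1 to i."""
    return [
        i
        for i in range(1, len(profile))
        if abs(profile[i] - profile[i - 1]) >= min_jump
    ]


page = [
    [255, 255, 255, 255],  # blank margin
    [0, 255, 0, 255],      # text line
    [255, 255, 255, 255],  # blank gap
    [90, 100, 110, 100],   # picture rows
    [95, 105, 115, 105],
]
profile = row_projection(page)
edges = find_edges(profile, min_jump=50)
```

The detected edge positions are the kind of feature information that a later segmentation step could use to separate text lines from picture regions.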


Abstract

The invention is a document image compression method based on analysis and feature extraction of document image content, comprising four steps: document image preprocessing, document image segmentation, text compression and picture compression. The first step computes statistics of the gray values of the document image and projects them onto the image boundary; from the edge variation of the projection curve and the gray value histogram, it analyzes the content of the document image and automatically detects and extracts its feature information, including character height, picture boundary, attached mark position, pixel gray value, etc. The second step segments the document image into a text part and a picture part according to the preprocessing result, without affecting the color information of the document image. The third step compresses the text part by gray transform and run-length coding. The last step applies lossy compression based on the discrete cosine transform (DCT) to the picture part.
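The two compression steps can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the gray transform is assumed to be a simple threshold, the DCT is shown in one dimension rather than on two-dimensional blocks, and the loss is assumed to come from uniform quantization of the coefficients. All function names and the `threshold` and `step` parameters are hypothetical.

```python
import math
from typing import List, Tuple


def binarize(row: List[int], threshold: int = 128) -> List[int]:
    """Gray transform, assumed here to be a threshold: 1 = ink, 0 = paper."""
    return [1 if g < threshold else 0 for g in row]


def rle_encode(bits: List[int]) -> List[Tuple[int, int]]:
    """Run-length coding: a list of (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Inverse of rle_encode."""
    return [value for value, length in runs for _ in range(length)]


def dct_1d(x: List[float]) -> List[float]:
    """Orthonormal DCT-II of a 1-D signal."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)))
    return out


def idct_1d(c: List[float]) -> List[float]:
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(
            c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def lossy_roundtrip(x: List[float], step: float) -> List[float]:
    """Quantize DCT coefficients with a uniform step (the lossy part), then invert."""
    quantized = [round(c / step) for c in dct_1d(x)]
    return idct_1d([q * step for q in quantized])


text_runs = rle_encode(binarize([255, 255, 0, 0, 0, 255]))
picture_approx = lossy_roundtrip([100.0] * 8, step=10.0)
```

Run-length coding is lossless, so text strokes survive exactly once binarized, while the quantized DCT trades a small reconstruction error in the picture part for fewer distinct coefficient values to store.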

Description

Technical field

[0001] The invention relates to a document image compression method, in particular to a document image compression method based on document image content analysis and feature extraction.

Background technique

[0002] With the development of the Internet and digital storage technology, document images have been widely used as a substitute for paper documents in digital libraries, e-banking, e-government and other industries. At present, some websites already provide digital books, but these are generally paper documents scanned into images, and only a few of them compress the scanned images with standard algorithms such as JPEG and JPEG2000. These algorithms apply the same compression technique to all parts of the image, which does not achieve good results for document image compression. At the same time, given that text strokes are oriented to human vision, compared with image distortion, human eyes are more likely to detect blur an...


Application Information

IPC(8): G06T9/00
Inventor: 常青, 佟雨兵, 张其善, 吴鑫山, 吴今培, 王立军, 杨东凯, 冦艳红
Owner: BEIHANG UNIV