
Method and device for compressing, decompressing and querying documents

A document compression method, applicable to file systems, instruments, computing and related fields, that addresses problems such as the inability to compress XML documents that have a corresponding Schema.

Active Publication Date: 2016-03-30
NEW FOUNDER HLDG DEV LLC +2
Cites: 3; Cited by: 0

AI Technical Summary

Problems solved by technology

[0006] The embodiment of the present invention provides a method for compressing XML documents with corresponding Schemas, which is used to solve the problem that XML documents with corresponding Schemas cannot be compressed.


Examples

Embodiment 1

[0076] Embodiment 1 provides a method for compressing an XML document that has a corresponding Schema. The method first separates the structural content from the data content of the XML document; it then determines the path code of each node and the path code of the data content; finally, it compresses the path codes of the nodes, the path codes of the data content, and the data content separately. The specific steps are as follows:

[0077] Step A: separate the structural content and the data content of the XML document according to a preset separation method. The data content comprises the attribute values in the tags of the XML document and the content between tags; the structural content is everything else in the tags.
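The separation in Step A can be illustrated with a minimal Python sketch. The patent text does not specify the preset separation method, so the `separate` function, the shape of the `structure` list, and the sample document are assumptions made purely for illustration; only the split itself (tag and attribute names on one side, attribute values and text between tags on the other) comes from the description above.

```python
import xml.etree.ElementTree as ET


def separate(xml_text):
    """Split an XML document into structural content and data content.

    Hypothetical helper: the record format below is an illustration,
    not the separation method defined by the patent.
    """
    root = ET.fromstring(xml_text)
    structure = []  # tag names and attribute names, in document order
    data = []       # attribute values and text found between tags

    def walk(elem):
        structure.append(("open", elem.tag, list(elem.attrib.keys())))
        # Attribute values belong to the data content.
        data.extend(elem.attrib.values())
        # Text between tags also belongs to the data content.
        if elem.text and elem.text.strip():
            data.append(elem.text.strip())
        for child in elem:
            walk(child)
            if child.tail and child.tail.strip():
                data.append(child.tail.strip())
        structure.append(("close", elem.tag))

    walk(root)
    return structure, data


# Sample document built from node names mentioned in the text (assumed layout).
doc = (
    '<site><regions><item id="item0">'
    "<location>United States</location>"
    "</item></regions></site>"
)
structure, data = separate(doc)
```

Here `data` holds only the attribute value and the element text, while `structure` records the tag and attribute names, so the two streams can be processed independently.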

[0078] Step B: assign preset node numbers to the site, regions, namerica, item, id, location, categories, and category nodes of the Schema; the node numbers c...
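As a rough illustration of how preset node numbers could yield a path code, the sketch below maps each name on a path to its number. The numbering values, the `NODE_NUMBERS` table and the `path_code` function are hypothetical: the passage above is truncated before it states how the numbers are chosen or combined, so this shows only one plausible reading, using a subset of the node names listed.

```python
# Hypothetical node numbers for a subset of the Schema nodes named in the text.
# The actual preset numbering is not given in the available passage.
NODE_NUMBERS = {
    "site": 0,
    "regions": 1,
    "item": 2,
    "id": 3,
    "location": 4,
    "categories": 5,
    "category": 6,
}


def path_code(path, numbers=NODE_NUMBERS):
    """Convert a slash-separated node path into a tuple of node numbers."""
    return tuple(numbers[name] for name in path.strip("/").split("/"))


code = path_code("/site/regions/item/location")
```

With this assumed numbering, a path is replaced by a short sequence of small integers, which is typically more compact than the repeated tag names it stands for.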

Embodiment 2

[0100] Embodiment 2 of the present invention provides a method for decompressing an XML document compressed with the above compression method. After the compressed node path codes, the compressed data content and the compressed document structure information have been obtained and decompressed, the nodes corresponding to the decompressed path codes are output; the compressed data content corresponding to each decompressed node is determined from the decompressed document structure information, and the determined data content is decompressed and output. The method specifically includes the following process:

[0101] Step 1: determine whether the XML document has a binary Schema graph (bsg) file; if so, go to Step 2. The bsg file includes the node name of each Schema node, the number of child nodes of the node, and the type of the node, where the type is either a data content type or a node type; the type of the indicator of the node, the number of occu...
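To make the bsg fields listed above more concrete, here is a small Python sketch of one record holding a node name, a child count, a node type and an indicator type. The binary layout, the class names and the enum values are all assumptions; the text names the fields but the passage is truncated before any on-disk format is described. The indicator values follow the all, choice and sequence indicators of XML Schema.

```python
import struct
from dataclasses import dataclass
from enum import IntEnum


class NodeType(IntEnum):
    NODE = 0  # ordinary node type
    DATA = 1  # data content type


class Indicator(IntEnum):
    ALL = 0
    CHOICE = 1
    SEQUENCE = 2


@dataclass
class BsgRecord:
    """One Schema node entry; the field layout is hypothetical."""
    name: str
    child_count: int
    node_type: NodeType
    indicator: Indicator

    def pack(self) -> bytes:
        raw = self.name.encode("utf-8")
        # Assumed layout: name length, name bytes, child count, type, indicator.
        return struct.pack(
            f"<H{len(raw)}sHBB",
            len(raw), raw, self.child_count, self.node_type, self.indicator,
        )

    @classmethod
    def unpack(cls, buf: bytes) -> "BsgRecord":
        (name_len,) = struct.unpack_from("<H", buf, 0)
        raw, child_count, node_type, indicator = struct.unpack_from(
            f"<{name_len}sHBB", buf, 2
        )
        return cls(raw.decode("utf-8"), child_count,
                   NodeType(node_type), Indicator(indicator))


record = BsgRecord("item", 2, NodeType.NODE, Indicator.SEQUENCE)
restored = BsgRecord.unpack(record.pack())
```

A fixed, compact record like this lets a decompressor rebuild the Schema graph without parsing the textual Schema again, which is the role the bsg file appears to play in the steps above.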

Embodiment 3

[0107] Embodiment 3 provides a method for querying an XML document compressed with the above compression method. Because the storage paths of the nodes and/or data content in an XML document with a corresponding Schema are path codes after compression by the above method, a query path can be converted into a path code, the nodes and/or data content can then be searched by that code, and the nodes and/or data content found can be output as the query result. In this step, the XML document comes with a bsg file. The bsg file includes the node name of each Schema node, the number of child nodes of the Schema node, and the type of the Schema node, where the type is either a data content type or a node type; the type of the indicator of the Schema node, the number of occurrences of the indicator, the structural content and the data content; the indicator type is one of the all indicator, the choice indicator and the sequence indicator; the bsg file also includes a start ...
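The query idea described above, converting the query path into a path code and then searching by that code, can be sketched as follows. The node numbering, the in-memory `store` dictionary and the function names are hypothetical stand-ins; the patent's actual storage format and lookup procedure are not given in the available text.

```python
# Hypothetical node numbers; the real preset numbering is not given in the text.
NODE_NUMBERS = {"site": 0, "regions": 1, "item": 2, "id": 3, "location": 4}

# Hypothetical store mapping a path code to the data content kept under it.
store = {
    (0, 1, 2, 3): ["item0", "item1"],
    (0, 1, 2, 4): ["United States", "Germany"],
}


def to_path_code(query_path, numbers=NODE_NUMBERS):
    """Convert a query path into a path code, or None if a name is unknown."""
    names = query_path.strip("/").split("/")
    if any(name not in numbers for name in names):
        return None
    return tuple(numbers[name] for name in names)


def query(query_path, store):
    """Look up data content by path code without touching unrelated paths."""
    code = to_path_code(query_path)
    if code is None:
        return []
    return store.get(code, [])


locations = query("/site/regions/item/location", store)
```

Because the lookup key is the path code itself, only the entries stored under the matching code need to be read, which is what allows a query to avoid decompressing the whole document.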

Abstract

The invention relates to the technical field of computer applications, in particular to a method and device for compressing, decompressing and querying documents, which address how to improve the efficiency of compressing XML documents through a Schema. The method includes: separating the structural content and the data content of an XML document, where the structural content is the content in the tags of the XML document other than the attribute values and the content between tags; determining the path code of the nodes in the structural content and of the data content, where the path code of a node identifies the storage location of the node in the structural content through the node and the other nodes in the structural content; and performing encoding processing, then compressing the processed nodes, the path codes of the nodes and the data content separately. This method can improve the efficiency of compressing XML documents through a Schema.

Description

Technical field

[0001] The invention relates to the field of computer application technology, in particular to a method and device for compressing, decompressing and querying documents.

Background

[0002] Extensible Markup Language (XML) is widely used as a common data storage language. Because XML documents contain a large amount of redundant data, dedicated XML compression methods are usually applied to them. Commonly used XML compression methods fall into two main types:

[0003] The first is a compression method that does not support queries. To query and retrieve part of the XML data from a document compressed this way, the entire XML document must first be decompressed, and only then can the query be run and its result obtained.

[0004] The second is an XML compression method that supports queries; this method supports querying and obtaining part of XML data directly from...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
CPC: H03M7/707, G06F16/3331, G06F16/10
Inventor: 仇睿恒, 胡薇
Owner: NEW FOUNDER HLDG DEV LLC