Hierarchical tree index-based correlation data compression method

A linked data and tree index technology, applied in the field of data processing, that addresses the problem of wasted storage space and improves the compression rate.

Inactive Publication Date: 2018-02-16
WUHAN UNIV OF SCI & TECH
3 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

However, the structure borrowed by K²-triples was originally designed to compress and store network graphs, which have a two-dimensional structure. RDF triples are represented by a three-dimensional relation matrix, so when predicates are extracted from them to construct sparse two-dimensional matrices, some storage space is wasted.
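
The waste described above can be made concrete with a small sketch. It assumes one two-dimensional subject-by-object matrix is built per predicate; that layout is an assumption for illustration, as are the function name and the sample ID triples, and none of them come from the patent text. The sketch reports how small a fraction of each matrix is actually occupied.

```python
from collections import defaultdict

# Hypothetical ID triples (subject, predicate, object); the values are made up.
triples = [(1, 1, 2), (1, 2, 3), (2, 1, 3), (4, 3, 1)]


def per_predicate_density(triples):
    """Return {predicate: fraction of filled cells in its subject x object matrix}.

    Assumption: every predicate gets a matrix spanning all subject and
    object IDs, so each matrix has the same number of cells.
    """
    n_subjects = max(s for s, _, _ in triples)
    n_objects = max(o for _, _, o in triples)
    cells = defaultdict(set)
    for s, p, o in triples:
        cells[p].add((s, o))
    total = n_subjects * n_objects
    return {p: len(filled) / total for p, filled in cells.items()}


density = per_predicate_density(triples)
for predicate, fraction in sorted(density.items()):
    print(f"predicate {predicate}: {fraction:.1%} of cells used")
```

Even in this tiny example most cells of every matrix are empty, and the empty share grows with the number of distinct subjects and objects.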

Examples

Embodiment Construction

[0021] The present invention is described in detail below with reference to the accompanying drawings. As part of this description, the principle of the invention is explained through embodiments. Other aspects, features and advantages of the invention will become clear from the detailed description. In the referenced drawings, the same reference numerals denote the same or similar components in different drawings.

[0022] As shown in Figure 1 and Figure 2, a method for compressing linked data based on a hierarchical tree index, provided by an embodiment of the present invention, includes the following steps:

[0023] Build a dictionary file. The RDF triple set is processed by constructing a dictionary mapping: a dictionary is built from the content of the RDF triples, and each distinct identifier in the data set is assigned a unique integer ID through grouping and deduplication, which converts the lon...
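
The dictionary step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes that "grouping" means one ID space per role (subject, predicate, object), that IDs start at 1, and that terms are ordered lexicographically. The function name `build_dictionaries` and the example.org URIs are hypothetical.

```python
def build_dictionaries(triples):
    """Map RDF terms to integer IDs, one dictionary per triple position.

    Assumptions (not from the source): IDs are 1-based and are assigned
    in sorted order within each role after deduplication.
    """
    columns = list(zip(*triples))  # group terms by role: S, P, O
    dicts = [
        {term: i for i, term in enumerate(sorted(set(column)), start=1)}
        for column in columns
    ]
    id_triples = [
        tuple(d[term] for d, term in zip(dicts, triple)) for triple in triples
    ]
    return dicts, id_triples


# Made-up sample data for illustration only.
rdf = [
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
    ("http://example.org/alice", "http://example.org/name", '"Alice"'),
    ("http://example.org/bob", "http://example.org/knows", "http://example.org/alice"),
]
dicts, id_triples = build_dictionaries(rdf)
print(id_triples)  # [(1, 1, 3), (1, 2, 1), (2, 1, 2)]
```

Each long URI is stored once in the dictionary, and every triple shrinks to three small integers that can serve as coordinates in a three-dimensional relation matrix.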

Abstract

The invention discloses a method for compressing linked data (correlation data) based on a hierarchical tree index. The method comprises the following steps: constructing a dictionary file; establishing a two-dimensional matrix tree index for indexing data blocks; and establishing three-dimensional matrix indexes of the block data together with the corresponding serialized ID triples. The method segments the relational three-dimensional matrix into blocks, taking into account both the matrix itself and the distribution of the predicate vector; by mapping the three-dimensional matrix upward, it establishes indexes in two-dimensional matrix form for the block data; and it compresses and stores each block, so that the compression rate is markedly increased without destroying the original structure.
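
One possible reading of the block segmentation and two-dimensional index can be sketched as below. Everything here is an assumption for illustration: the blocks are fixed-size squares on the subject-object plane, the index is simply the set of non-empty block coordinates, and the tree structure, the predicate-vector-driven segmentation and the per-block compression described in the abstract are not modeled. The names `build_block_index` and `lookup` are hypothetical.

```python
from collections import defaultdict


def build_block_index(id_triples, block_size):
    """Split the (subject, object) plane into square blocks.

    Returns (index, blocks). `index` is the set of (row, col) block
    coordinates that hold data, i.e. the non-zero cells of a 2D index
    matrix. `blocks` maps each coordinate to its triples, stored with
    block-local subject and object offsets.
    """
    blocks = defaultdict(list)
    for s, p, o in id_triples:
        row, col = (s - 1) // block_size, (o - 1) // block_size
        local = ((s - 1) % block_size, p, (o - 1) % block_size)
        blocks[(row, col)].append(local)
    return set(blocks), dict(blocks)


def lookup(blocks, block_size, s, o):
    """Return the predicates linking subject s to object o, reading one block."""
    key = ((s - 1) // block_size, (o - 1) // block_size)
    local = ((s - 1) % block_size, (o - 1) % block_size)
    return sorted(p for ls, p, lo in blocks.get(key, []) if (ls, lo) == local)


# Made-up ID triples for illustration only.
id_triples = [(1, 1, 3), (1, 2, 1), (2, 1, 2), (7, 1, 8)]
index, blocks = build_block_index(id_triples, block_size=4)
print(sorted(index))              # [(0, 0), (1, 1)]
print(lookup(blocks, 4, 7, 8))    # [1]
```

Empty regions of the matrix never get a block, so only the populated blocks need to be serialized and compressed, and a query touches a single block rather than the whole matrix.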

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a method for compressing linked data based on a hierarchical tree index.

Background

[0002] At present, there are two main approaches to compressing linked data at the serialization level: one based on character length and one based on syntactic structure. The character-length approach reduces the average representation length of URI identifiers in the RDF data model, and thereby compresses the entire data set. The relatively mature solutions of this kind use a dictionary to map the URI strings or constants that make up the subject, predicate and object of each RDF triple, replacing repeated URI strings with unique integer IDs. The result is a data set of ID triples in one-to-one correspondence with the RDF triples, thus significantly reducing redundant information ca...

Application Information

IPC(8): G06F17/30
CPC: G06F16/2246, G06F16/1744
Inventor: 黄莉
Owner: WUHAN UNIV OF SCI & TECH