Cuckoo filter-based duplicate removal method and system for large-data-volume key

The cuckoo filter technique, applied to key distribution and encryption devices, addresses the lack of efficient deduplication methods for large-data-volume keys. It achieves efficient and accurate deduplication, improves deduplication query efficiency, and improves key quality and usability.

Active Publication Date: 2022-08-02
ZHEJIANG QUANTUM TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to provide a method and system for deduplicating large-data-volume keys based on a cuckoo filter, so as to address a technical defect in the field of quantum information security: existing deduplication methods are not suited to efficient and accurate deduplication of large-data-volume keys.



Examples


Embodiment 2

[0072] Embodiment 2: As shown in Figures 2, 3, 4, 5, and 6, the present invention also provides a cuckoo-filter-based deduplication system for large-data-volume keys, comprising the following components:

[0073] Deduplication system initialization module 201: creates the storage units and cuckoo filters according to the input parameters. As shown in Figure 3, the deduplication system initialization module 201 includes the following sub-modules:

[0074] (1) Storage unit creation sub-module 2011: given the total number of target keys S and the number of storage units N preset by the system, obtains the storage weight wt of each storage unit according to its hardware parameters, where the weights of all storage units sum to 1. The expected storage capacity of a single persistent storage unit is n = S * wt. The sub-module then creates N database tables or N files;

[0075] (2) Create a cuckoo filter sub-module 2012: acco...
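
The capacity rule of sub-module 2011, n = S * wt with the weights summing to 1, can be sketched as follows. This is only an illustration: the function name `expected_capacities`, the example weights, and the rounding to whole keys are assumptions, not details given in the patent text.

```python
def expected_capacities(total_keys, weights):
    """Return the expected capacity n_i = S * wt_i for each storage unit.

    total_keys -- S, the total number of target keys
    weights    -- one storage weight wt per storage unit (hypothetical values)
    """
    # The text requires the weights of all storage units to sum to 1.
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("storage weights must sum to 1")
    # Rounding to a whole number of keys is an assumption of this sketch.
    return [round(total_keys * wt) for wt in weights]


# Three storage units whose hardware justifies unequal shares.
capacities = expected_capacities(1_000_000, [0.5, 0.3, 0.2])
```

Each resulting capacity would then size one of the N database tables or files.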

Embodiment 3

[0087] Embodiment 3: As shown in Figure 7, and on the basis of Embodiment 1, the process of step S6 (positive data traversal statistics) is detailed below. It includes sub-steps S601, S602, S603, S604, S605, and S606, as follows:

[0088] S601: Traverse the specified storage unit and retrieve a key X;

[0089] S602: Determine whether the key X already exists in the positive-data HashSet output in step S4 or in the overflow-data HashSet. If it does not exist, the key X is unique and needs no processing; jump to step S601 to start the next round of traversal statistics. If it exists, proceed to S603;

[0090] S603: The key X exists in the positive-data HashSet or the overflow-data HashSet, indicating that the key X may be a duplicate. Obtain the actual storage location information of the key X, that is, the file offset or database of the key X in the storage unit A prim...
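
The visible part of this traversal (S601 to S603) can be sketched as below. The sketch assumes a storage unit is an in-memory list whose index stands in for the file offset or database location, and that candidate locations are gathered in a dictionary; what S603 does after obtaining the location is truncated in the source, so nothing beyond collecting locations is shown. The name `collect_candidate_locations` is hypothetical.

```python
def collect_candidate_locations(storage_unit, positive_set, overflow_set):
    """Map each possibly duplicated key to the offsets where it is stored."""
    candidates = {}
    # S601: traverse the storage unit, one key X at a time.
    for offset, key in enumerate(storage_unit):
        # S602: a key in neither set is unique, so skip it.
        if key not in positive_set and key not in overflow_set:
            continue
        # S603: the key may be repeated; record where it is stored.
        candidates.setdefault(key, []).append(offset)
    return candidates


unit = [b"k1", b"k2", b"k1", b"k3"]
locations = collect_candidate_locations(unit, {b"k1"}, {b"k3"})
```

A key that ends up with more than one recorded offset is a confirmed duplicate within the unit; a key with exactly one offset was flagged by the filter stage but is in fact unique.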

Embodiment 4

[0095] Embodiment 4: As shown in Figure 8, and based on the structure of the key deduplication system shown in Figure 2, the present invention also provides a parallel processing framework for the key deduplication system, as follows:

[0096] Deduplication system instance Inst: The deduplication system instance Inst includes N deduplication process instances, such as the instances Inst1, InstX, and InstN described below, where N is the number of storage units, that is, the number of cuckoo filters required by the deduplication system.

[0097] Deduplication process instance Inst1: the first process instance in the parallel processing of the key deduplication system, comprising a storage unit 601, a cuckoo filter 602, a HashSet collection 603, and a HashMap collection 604.

[0098] Deduplication process instance InstX: the key deduplication system...
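
One way to picture this framework is a set of independent workers, one per storage unit, each owning its own four components. The sketch below is illustrative only and rests on several assumptions: threads stand in for the process instances, a plain Python set stands in for the cuckoo filter 602, and keys are assumed to have been partitioned so that equal keys land in the same storage unit. The names `DedupInstance` and `run_all` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class DedupInstance:
    storage_unit: list                               # storage unit (601)
    seen_filter: set = field(default_factory=set)    # stand-in for cuckoo filter (602)
    positive: set = field(default_factory=set)       # HashSet collection (603)
    locations: dict = field(default_factory=dict)    # HashMap collection (604)

    def run(self):
        # First pass: keys seen more than once become positive data.
        for key in self.storage_unit:
            if key in self.seen_filter:
                self.positive.add(key)
            else:
                self.seen_filter.add(key)
        # Second pass: record every location of each positive key.
        for offset, key in enumerate(self.storage_unit):
            if key in self.positive:
                self.locations.setdefault(key, []).append(offset)
        return self.locations


def run_all(units):
    """Run one instance per storage unit in parallel; results keep unit order."""
    instances = [DedupInstance(unit) for unit in units]
    with ThreadPoolExecutor(max_workers=len(instances)) as pool:
        return list(pool.map(DedupInstance.run, instances))


results = run_all([[b"a", b"b", b"a"], [b"c"]])
```

Because no state is shared between instances, the workers need no locking, which is the property that makes a one-filter-per-storage-unit layout attractive for parallel execution.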


Abstract

The invention discloses a cuckoo-filter-based deduplication method for large-data-volume keys. The method comprises the steps of: initializing the deduplication system; acquiring the key data to be deduplicated; storing the key data in partitions; cuckoo filtering and deduplicating the key data; deleting key data; traversal statistics over the positive data and overflow data sets; and accurate deduplication of the key data, thereby completing accurate deduplication of large-data-volume key data. The invention further provides a cuckoo-filter-based deduplication system for large-data-volume keys. Compared with the prior art, the invention facilitates dynamic adjustment of subsequent storage units and reduces the migration of large amounts of key data, and it improves the deduplication efficiency of the whole system on large-data-volume key data. Compared with a Bloom filter, the cuckoo filter supports dynamically adding and deleting elements, provides higher lookup performance than a traditional Bloom filter, and occupies less space at the same low expected false-positive rate. The quality and availability of the keys are also improved.
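
For readers unfamiliar with the data structure, the following is a minimal sketch of a generic cuckoo filter showing the insert, lookup, and delete operations the abstract relies on. It is not the patent's implementation: the class name, the parameters, the 16-bit fingerprint, and the use of SHA-256 are all assumptions chosen for illustration.

```python
import hashlib
import random


class CuckooFilter:
    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        # A power-of-two bucket count makes the XOR index mapping reversible.
        if num_buckets <= 0 or num_buckets & (num_buckets - 1):
            raise ValueError("num_buckets must be a power of two")
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self._rng = random.Random(seed)

    @staticmethod
    def _hash(data):
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _alt(self, index, fp):
        # Partial-key cuckoo hashing: the other bucket depends only on
        # the current index and the fingerprint.
        return (index ^ self._hash(fp)) % self.num_buckets

    def _locate(self, key):
        fp = hashlib.sha256(b"fp" + key).digest()[:2]  # 16-bit fingerprint
        i1 = self._hash(key) % self.num_buckets
        return fp, i1, self._alt(i1, fp)

    def insert(self, key):
        fp, i1, i2 = self._locate(key)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self._rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self._rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Filter is effectively full; the last evicted fingerprint is dropped.
        return False

    def contains(self, key):
        fp, i1, i2 = self._locate(key)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, key):
        fp, i1, i2 = self._locate(key)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


f = CuckooFilter(num_buckets=64)
f.insert(b"key-1")
found_before = f.contains(b"key-1")   # True: inserted keys are always found
f.delete(b"key-1")
found_after = f.contains(b"key-1")    # False: the filter is empty again
```

Deletion works because each stored fingerprint can be found again from the key alone, which a Bloom filter's shared bit array does not allow. Lookups can still return false positives when two keys share a fingerprint and a bucket, which is why an exact second pass over the flagged keys is needed for accurate deduplication.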

Description

Technical field

[0001] The invention relates to the technical field of electrical digital data processing, in particular to a method and system for deduplicating large-data-volume keys based on a cuckoo filter.

Background

[0002] With the development of quantum key distribution technology and quantum key relay technology, and the continued growth of quantum key applications, applications place high demands on the storage and use of large-data-volume quantum keys. Deduplicating keys during storage has become an essential requirement: efficient key deduplication more effectively ensures key security and improves key quality. Some methods currently exist for handling data at this scale. The Bloom filter algorithm is often used, but a Bloom filter cannot dynamically delete data; the cuckoo filter algorithm is also used, and although its efficiency in time and space is relatively high and data can be deleted...


Application Information

IPC(8): H04L9/08, H04L9/06, G06N3/00
CPC: H04L9/0855, H04L9/0894, H04L9/0643, G06N3/006
Inventor: 於建江, 郑韶辉, 董智超
Owner: ZHEJIANG QUANTUM TECH CO LTD