Scalable duplicate data detection method
A duplicate data detection method, applied in the field of scalable duplicate data detection, that addresses the problem that storage capacity cannot be expanded efficiently.
Embodiment Construction
[0096] For ease of understanding, the unit conversions used in the calculations are stated first: $1\,\mathrm{T} = 10^{3}\,\mathrm{G} = 10^{6}\,\mathrm{M} = 10^{9}\,\mathrm{k} = 10^{12}$.
[0097] Suppose duplicate data must be detected on a server with a capacity of 32T bytes, and the false positive rate is to be kept below 0.005, that is, ε'=0.005. The block size is 8K bytes; the Bloom filter group base is g=64 (assuming the server word length is 64); the maximum number of Bloom filters is r=128; the Bloom filter expansion factor is t=4; and the number of fingerprint bytes is Y=20.
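To give a sense of scale for these parameters, the arithmetic can be sketched as follows. The block count and raw fingerprint volume follow directly from the figures above and the decimal unit convention of paragraph [0096]. The bits-per-element and hash-count figures use the textbook sizing formula for a single classic Bloom filter; that formula is an assumption added here for illustration and is not necessarily how this method sizes its group of Bloom filters. All variable names are hypothetical.

```python
import math

# Decimal unit convention from paragraph [0096]: 1T = 10^12, 1K = 10^3.
T = 10**12
K = 10**3

capacity_bytes = 32 * T      # server capacity
block_bytes = 8 * K          # block size
fingerprint_bytes = 20       # Y, bytes per fingerprint
target_fp_rate = 0.005       # epsilon'

# Number of blocks on a full server, and the space needed to store
# one fingerprint per block without any deduplication.
num_blocks = capacity_bytes // block_bytes
raw_fingerprint_bytes = num_blocks * fingerprint_bytes

# ASSUMPTION: textbook sizing of one classic Bloom filter,
# m/n = -ln(eps) / (ln 2)^2 and k = (m/n) * ln 2.
# This is not taken from the patent text.
bits_per_element = -math.log(target_fp_rate) / (math.log(2) ** 2)
optimal_hashes = bits_per_element * math.log(2)

print("blocks:", num_blocks)
print("raw fingerprint bytes:", raw_fingerprint_bytes)
print("bits per element (classic filter):", round(bits_per_element, 2))
print("hash functions (classic filter):", round(optimal_hashes, 2))
```

Under these assumptions a full server holds 4 × 10^9 blocks, whose fingerprints alone occupy 80G bytes, which suggests why an in-memory filter is consulted before any fingerprint table lookup.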
[0098] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.
[0099] As shown in Figure 3, the embodiment of the present invention includes a block processing step, a fingerprint extraction step, a Bloom filter retrieval step, a fingerprint subset table retrieval step, a not-full Bloom filter judgment step, a new fingerprint marking step, and a Bloom filter The step of ju...
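The first two steps named above, block processing and fingerprint extraction, can be illustrated with a minimal sketch. It assumes fixed-size 8K-byte blocks and uses SHA-1 only because it yields a 20-byte digest matching Y=20; the source does not name the hash function. A plain Python set stands in for the Bloom filter and fingerprint subset table retrieval steps, which are not reproduced here. The function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 8 * 10**3  # 8K bytes, decimal convention of paragraph [0096]


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Block processing: cut the data into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def fingerprint(block: bytes) -> bytes:
    """Fingerprint extraction. ASSUMPTION: SHA-1, chosen for its 20-byte digest."""
    return hashlib.sha1(block).digest()


def find_duplicate_blocks(data: bytes) -> list[int]:
    """Return indices of blocks whose fingerprint was already seen.

    ASSUMPTION: an exact set replaces the Bloom filter group and the
    fingerprint subset tables of the actual method.
    """
    seen: set[bytes] = set()
    duplicates: list[int] = []
    for index, block in enumerate(split_blocks(data)):
        fp = fingerprint(block)
        if fp in seen:
            duplicates.append(index)
        else:
            seen.add(fp)
    return duplicates


sample = b"a" * BLOCK_SIZE + b"b" * BLOCK_SIZE + b"a" * BLOCK_SIZE
print(find_duplicate_blocks(sample))
```

An exact set has no false positives but grows with every new fingerprint, which is the memory cost a Bloom filter front end is meant to reduce.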