Data deduplication statistics method and device based on dynamic time window

A dynamic-time-window statistics technique, applied in the field of data processing. It addresses the problem that existing deduplication schemes suit only scenarios with low accuracy requirements and cannot be applied to highly flexible, high-precision scenarios, and it improves the accuracy of deduplication statistics.

Active Publication Date: 2021-09-07
ZHEJIANG KOUBEI NETWORK TECH CO LTD

AI Technical Summary

Problems solved by technology

[0009] Therefore, the above solutions are suitable only for scenarios where accuracy requirements are low and where a fixed time window (that is, the start time and end time are fixed and cannot be changed) or inexact deduplication statistics can be tolerated. They cannot be applied to highly flexible, high-precision scenarios.


Examples


Detailed Description of Embodiments

[0052] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for a more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0053] Figure 1 shows a schematic flowchart of a data deduplication statistics method based on a dynamic time window according to an embodiment of the present invention. As shown in Figure 1, the method includes the following steps:

[0054] Step S100: according to the data generation time of the real-time data with a specific field, modify the statistical values corresponding to multiple time granularities associated ...
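The update described in Step S100 can be sketched as follows. This is a minimal illustration, not the patented implementation: the granularities (minute, hour, day), the use of an exact in-memory set per bucket as the "statistical value", and the names `GranularityStore`, `record`, and `bucket_start` are all assumptions that do not appear in the source.

```python
from collections import defaultdict

# Hypothetical granularities (seconds per bucket); the source does not name them.
GRANULARITIES = {"minute": 60, "hour": 3600, "day": 86400}


def bucket_start(ts: int, size: int) -> int:
    """Align a Unix timestamp (seconds) down to the start of its bucket."""
    return ts - ts % size


class GranularityStore:
    """Keeps one set of distinct field values per (granularity, bucket)."""

    def __init__(self) -> None:
        # (granularity name, bucket start) -> distinct values seen in that bucket
        self.buckets = defaultdict(set)

    def record(self, generated_at: int, field_value: str) -> None:
        """Update every granularity whose bucket contains the generation time."""
        for name, size in GRANULARITIES.items():
            key = (name, bucket_start(generated_at, size))
            self.buckets[key].add(field_value)


store = GranularityStore()
store.record(18125, "user-a")
store.record(18130, "user-a")  # duplicate inside the same minute bucket
store.record(18190, "user-b")  # falls in the next minute bucket
```

Each incoming record touches one bucket per granularity, so the write cost is constant in the number of granularities. A production system would likely replace the exact sets with a mergeable structure held in an external store; that choice is outside what the source states.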


Abstract

The invention discloses a data deduplication statistics method and device based on a dynamic time window. The method includes: modifying in real time, according to the data generation time of real-time data with a specific field, the statistical values corresponding to multiple time granularities associated with that data generation time; receiving a deduplication statistics request with a dynamic time window and querying the statistical values corresponding to the multiple time granularities that cover the dynamic time window, where the start time of the dynamic time window is any specified time and the end time is the current time; and calculating, from those statistical values, the deduplication statistical value corresponding to the dynamic time window. This realizes real-time deduplication statistics and meets the needs of statistical scenarios with high real-time requirements. It also improves the accuracy of deduplication statistics, overcoming the low accuracy of results from existing deduplication statistics methods, and allows the length of the dynamic time window to be set flexibly, making the statistics more flexible.
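One way to read the query side of the abstract is sketched below: the window from an arbitrary start time to the current time is covered with the coarsest aligned buckets that fit, and the per-bucket values are merged. This is a sketch under stated assumptions, not the patent's algorithm. The granularity sizes, the exact per-bucket sets, the rounding of the window to the finest granularity, and the names `record`, `cover`, and `distinct_count` are all hypothetical.

```python
from collections import defaultdict

# Assumed bucket sizes in seconds (day, hour, minute), coarsest first.
SIZES = [86400, 3600, 60]


def record(buckets, ts: int, value: str) -> None:
    """Add a value to the bucket of every granularity that contains ts."""
    for size in SIZES:
        buckets[(size, ts - ts % size)].add(value)


def cover(start: int, end: int) -> list[tuple[int, int]]:
    """Split [start, end) into aligned (size, bucket_start) pieces.

    Both bounds must already be aligned to the finest granularity.
    """
    pieces = []
    t = start
    while t < end:
        for size in SIZES:
            if t % size == 0 and t + size <= end:
                pieces.append((size, t))
                t += size
                break
        else:
            raise ValueError("window is not aligned to the finest granularity")
    return pieces


def distinct_count(buckets, start: int, now: int) -> int:
    """Count distinct values in the window from start to the current time."""
    finest = SIZES[-1]
    lo = start - start % finest   # assumption: start is rounded down
    hi = now + (-now) % finest    # current bucket holds data only up to now
    seen = set()
    for key in cover(lo, hi):
        seen |= buckets.get(key, set())
    return len(seen)


buckets = defaultdict(set)
for ts, user in [(100, "a"), (3700, "b"), (3800, "a"), (7300, "c")]:
    record(buckets, ts, user)

print(distinct_count(buckets, start=3600, now=7325))
print(distinct_count(buckets, start=3720, now=7325))
```

Merging sets keeps the result exact because a value seen in several buckets is counted once. The window start is only as precise as the finest granularity, which is a limitation of this sketch rather than a statement about the patent.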

Description

Technical field

[0001] The present invention relates to the technical field of data processing, in particular to a data deduplication statistics method and device based on a dynamic time window.

Background

[0002] At present, many business scenarios need to compute a deduplicated count (count distinct) within a certain period of time. For example, a security system may count how many users have logged in on a computer in the last day for security prevention and control; as another example, an advertising system may count how many users have visited a certain web page in the last 3 minutes for billing.

[0003] At present, the following deduplication schemes are mainly adopted in the prior art:

[0004] Solution 1: In scenarios where the amount of data is not large, the detailed data can be stored by recording the details of each piece of data. When it is necessary to perform deduplication statistics on a certain field in a certain per...
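The count-distinct idea in [0002] and the detail-storage approach of Solution 1 can be illustrated with the login example. This is a minimal sketch that assumes every record is kept in memory and scanned at query time; the `LoginEvent` type, its fields, and the `distinct_users` function are hypothetical names, and the truncated remainder of Solution 1 is not reconstructed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginEvent:
    timestamp: int  # Unix seconds at which the login happened
    computer: str
    user: str


def distinct_users(events, computer: str, start: int, end: int) -> int:
    """Count distinct users who logged in on a computer within [start, end)."""
    return len({
        e.user
        for e in events
        if e.computer == computer and start <= e.timestamp < end
    })


events = [
    LoginEvent(10, "pc-1", "alice"),
    LoginEvent(20, "pc-1", "bob"),
    LoginEvent(30, "pc-1", "alice"),  # repeat login, counted once
    LoginEvent(40, "pc-2", "carol"),
    LoginEvent(95, "pc-1", "dave"),
]

print(distinct_users(events, "pc-1", start=0, end=90))
```

The result is exact for any window, but every query scans all stored records, which is why the source restricts this approach to scenarios where the amount of data is not large.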


Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L12/24, G06F16/958, G06F16/215
CPC: H04L41/142
Inventor: 窦方钰
Owner: ZHEJIANG KOUBEI NETWORK TECH CO LTD