
Duplicate removal counting method and device

A deduplication counting technique and device, applied in computing, electrical digital data processing, special data processing applications, etc., which can solve problems such as inaccessibility

Active Publication Date: 2016-10-05
TAOBAO CHINA SOFTWARE
Cites: 4 | Cited by: 9

AI Technical Summary

Problems solved by technology

[0025] Example 1: If an approximate streaming deduplication counting scheme such as Linear Counting, HyperLogLog, or HyperLogLog++ is used within each time period, then in scenarios that do not require high accuracy the cache can be simplified: a Bloom filter (BloomFilter) or similar structure approximately determines whether the history cache is hit, and an approximate cache plus a one-dimensional array performs approximate streaming deduplication counting. Combined deduplication counting of several adjacent time periods, e.g. DAY2~DAY3, DAY1~DAY3
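The approximate cache mentioned in Example 1 can be illustrated with a minimal Bloom filter sketch. This is not the patent's implementation: the class name `BloomFilter`, the method `add_if_absent`, the helper `approx_distinct`, the bit-array size, and the SHA-256-based hashing are all assumptions chosen for illustration. The sketch keeps one filter per time period and increments the count only when the filter reports that a key has not been seen before.

```python
import hashlib


class BloomFilter:
    """Hypothetical minimal Bloom filter used as an approximate 'seen' cache."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add_if_absent(self, key):
        """Set the key's bits; return True if any bit was previously unset."""
        absent = False
        for pos in self._positions(key):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                absent = True
                self.bits[byte] |= mask
        return absent


def approx_distinct(keys):
    """Approximate deduplicated count of keys within one time period."""
    seen = BloomFilter()
    count = 0
    for key in keys:
        if seen.add_if_absent(key):  # cache miss: treat as a new key
            count += 1
    return count
```

Because a Bloom filter can report a false hit but never a false miss, this sketch can undercount but cannot overcount, which fits the stated scenario where high accuracy is not required.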



Examples


Detailed Description of Embodiments

[0088] In a typical configuration of the present application, the terminal, the network service device, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

[0089] Memory may include non-permanent storage in computer-readable media, in the form of random access memory (RAM) and/or nonvolatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.

[0090] Computer-readable media, including permanent and non-permanent, removable and non-removable media, can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory...


Abstract

The invention provides a deduplication counting method and device. Based on a record of the time period in which each data item previously occurred and on the data in the current time period, the method and device can quickly obtain the number of data items occurring in a time period and the increment of that number relative to a previous time period, so that various statistics can subsequently be computed quickly from the number and/or its increment. In addition, the combined data of several adjacent time periods can be deduplicated and counted exactly. Furthermore, the number for a time period and its increment relative to the previous time period can be obtained and recorded quickly and accurately in a streaming manner: only the time period in which each data item previously occurred needs to be recorded and updated, and the detailed history of each item's occurrences in every time period does not need to be stored, which reduces the amount of data stored.
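One plausible reading of the abstract can be sketched as follows. This is an interpretation under stated assumptions, not the patented implementation: the class `PeriodDedupCounter`, its method names, the use of 1-based integer period indices (with 0 meaning "never seen"), and the definition of "increment" as the number of items in a period that did not occur in the immediately preceding period are all hypothetical. The sketch stores only the most recent period per key, plus, for each period, a small array of counts indexed by the period of the key's previous occurrence.

```python
from collections import defaultdict


class PeriodDedupCounter:
    """Hypothetical streaming deduplication counter over consecutive periods."""

    def __init__(self):
        # key -> most recent period in which it occurred (assumed 1-based)
        self.last_seen = {}
        # period -> {previous period (0 = never seen): number of keys}
        self.first_hits = defaultdict(lambda: defaultdict(int))

    def add(self, period, key):
        prev = self.last_seen.get(key, 0)
        if prev == period:
            return  # duplicate within the current period
        if prev > period:
            raise ValueError("periods must arrive in non-decreasing order")
        self.first_hits[period][prev] += 1
        self.last_seen[key] = period  # only the latest period is kept

    def count(self, period):
        """Deduplicated count for a single period."""
        return sum(self.first_hits.get(period, {}).values())

    def increment(self, period):
        """Keys in `period` that did not occur in the period before it."""
        hits = self.first_hits.get(period, {})
        return sum(n for prev, n in hits.items() if prev < period - 1)

    def count_range(self, start, end):
        """Exact deduplicated count over adjacent periods start..end."""
        total = 0
        for p in range(start, end + 1):
            for prev, n in self.first_hits.get(p, {}).items():
                if prev < start:  # first occurrence inside the window
                    total += n
        return total
```

In this sketch a key contributes to a merged window exactly once, at its first occurrence inside the window, because only at that occurrence does its previously recorded period fall before the window start. For example, feeding DAY1 = {a, b}, DAY2 = {b, c}, DAY3 = {a, c, d} and calling `count_range(2, 3)` counts b and c in DAY2 and a and d in DAY3.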

Description

Technical field

[0001] The present application relates to the fields of communications and computers, and in particular to a deduplication counting method and device.

Background

[0002] Internet applications and websites often need to compute statistics over data, such as counting, deduplication counting, summing, and averaging. PV (PageView), i.e., page views or clicks, is counted each time a user refreshes a page. UV (UniqueVisitor) counts unique visitors: a computer client that visits a website is one visitor, and the same client is counted only once within 00:00-24:00. Website PV and UV statistics are instances of counting and deduplication counting, respectively. Other application scenarios of deduplication counting include the deduplicated number of buyers on an e-commerce website, the number of sellers who completed transactions on Double Eleven, etc.

[0003] Take the de-duplication counting of unique visitors in each continuous time period of a website as an example (hereinafte...
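The difference between PV (plain counting) and UV (deduplication counting) described in paragraph [0002] can be shown with a small sketch. The function name `pv_uv_by_day` and the `(day, client_id)` event shape are assumptions made for illustration; this naive version keeps every client identifier per day and is only meant to make the two definitions concrete.

```python
from collections import Counter, defaultdict


def pv_uv_by_day(events):
    """Return (pv, uv) dicts keyed by day for an iterable of (day, client_id)."""
    pv = Counter()               # every page view is counted
    visitors = defaultdict(set)  # each client is counted once per day
    for day, client_id in events:
        pv[day] += 1
        visitors[day].add(client_id)
    uv = {day: len(ids) for day, ids in visitors.items()}
    return dict(pv), uv
```

Two refreshes by the same client on the same day raise that day's PV by two but its UV by only one.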


Application Information

IPC(8): G06F17/30
Inventor: 胡四海
Owner: TAOBAO CHINA SOFTWARE