Deduplication method and device for longitudinal federation data statistics, terminal equipment and medium

A deduplication technology for vertical federated data statistics, applied in computer security devices, relational databases, database models, etc. It addresses the difficulty of guaranteeing the overall efficiency of deduplicating statistical data, with the effects of meeting data consistency requirements, strong scalability, and improved overall efficiency.

Pending Publication Date: 2021-02-26
WEBANK (CHINA)

AI Technical Summary

Problems solved by technology

[0004] The main purpose of the present invention is to provide a deduplication method, device, te...

Embodiment Construction

[0050] It should be understood that the specific embodiments described herein are only used to explain the present invention, not to limit it.

[0051] As shown in Figure 1, Figure 1 is a schematic structural diagram of the hardware operating environment of the terminal device involved in the embodiments of the present invention.

[0052] It should be noted that Figure 1 may serve as a structural diagram of the hardware operating environment of the terminal device. The terminal device in the embodiments of the present invention may be a device such as a PC or a portable computer.

[0053] As shown in Figure 1, the terminal device may include: a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 implements connection and communication between these components. The user interface 1003 may include a display screen (Display) and a...

Abstract

The invention discloses a deduplication method and device for longitudinal (vertical) federation data statistics, a terminal device, and a storage medium. In the method, any participant in the longitudinal federation receives a result matrix transmitted by the other participants. Each other participant obtains this result matrix by locally constructing a first feature matrix from the first to-be-deduplicated data in its own data and then multiplying that first feature matrix by a preset random matrix. The receiving participant locally constructs a second feature matrix from the second to-be-deduplicated data in its own data and longitudinally splices the second feature matrix with the result matrix to obtain a spliced matrix. It then detects the target rows of the spliced matrix whose entries are the same at corresponding positions and deduplicates the first and second to-be-deduplicated data that those target rows point to. The method deduplicates data while ensuring data privacy and security, without performing any encryption operation on the to-be-deduplicated data, which improves deduplication efficiency.
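The flow in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation, and several points are assumptions: the function names (`mask_features`, `find_duplicate_groups`, `rows_to_keep`) are hypothetical, the feature matrices hold integers so that equal rows stay exactly equal after multiplication, row i of each matrix refers to the same sample at both participants, and "longitudinal splicing" is interpreted as placing the two matrices side by side so each spliced row combines both parties' features for one sample. The small fixed matrix stands in for the preset random matrix, and the sketch does not evaluate the privacy properties of the scheme.

```python
import numpy as np


def mask_features(first_features, random_matrix):
    """Sender side: multiply the local first feature matrix by the preset matrix."""
    a = np.asarray(first_features, dtype=np.int64)
    r = np.asarray(random_matrix, dtype=np.int64)
    return a @ r


def find_duplicate_groups(second_features, result_matrix):
    """Receiver side: splice the local second feature matrix with the received
    result matrix (assumed here: side by side) and group identical rows."""
    b = np.asarray(second_features, dtype=np.int64)
    spliced = np.hstack([b, np.asarray(result_matrix, dtype=np.int64)])
    groups = {}
    for idx, row in enumerate(spliced):
        groups.setdefault(tuple(row.tolist()), []).append(idx)
    # Only rows that occur more than once are deduplication targets.
    return [idxs for idxs in groups.values() if len(idxs) > 1]


def rows_to_keep(n_rows, duplicate_groups):
    """Keep the first row of every duplicate group and drop the rest."""
    drop = {i for group in duplicate_groups for i in group[1:]}
    return [i for i in range(n_rows) if i not in drop]


# Hypothetical toy data: four samples, two features at the sender, one at the receiver.
first_features = [[1, 2], [3, 4], [1, 2], [1, 2]]
second_features = [[7], [8], [7], [9]]
preset_matrix = [[2, 1], [1, 1]]  # stand-in for the preset random matrix

result_matrix = mask_features(first_features, preset_matrix)
duplicate_groups = find_duplicate_groups(second_features, result_matrix)
print(duplicate_groups)                                     # [[0, 2]]
print(rows_to_keep(len(second_features), duplicate_groups)) # [0, 1, 3]
```

In this toy run, sample 3 matches samples 0 and 2 on the sender's features but differs on the receiver's feature, so only samples 0 and 2 are treated as duplicates. The stand-in matrix is invertible, so distinct sender rows cannot collide after multiplication; whether the patent requires such a property of the random matrix is not stated in the excerpt.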

Description

Technical field

[0001] The present invention relates to the technical field of federated data deduplication, and in particular to a deduplication method, device, terminal device and storage medium for vertical federated data statistics.

Background

[0002] Science and technology have now entered the era of data and information, and statistical applications of data are increasingly widespread. In data statistics scenarios, deduplicating repeated data is a very common operation. For example, a user selects a specific statistical feature for the data it owns locally; then, when multiple pieces of data are detected under that statistical feature and those pieces of data are all identical, the user deduplicates them so that only one piece of data is kept under the statistical feature.

[0003] However, after multiple users with their own data are combined to form a vertical feder...

Application Information

IPC(8): G06F16/215, G06F16/28, G06F17/16, G06F21/60, G06N20/00
CPC: G06F16/215, G06F16/285, G06F17/16, G06F21/602, G06N20/00
Inventor: 谭明超, 马国强, 范涛, 陈天健, 杨强
Owner: WEBANK (CHINA)