Skyline-based data generalization method

A Skyline-based data generalization technology, applied in the field of privacy-protected data publishing, addresses problems such as failure to meet users' multi-level needs, low-accuracy strategy recommendation, and small RU-space coverage. It achieves wide RU-space coverage and reduces the size of the candidate strategy space.

Active Publication Date: 2017-09-22
HUAZHONG UNIV OF SCI & TECH
Cites: 3; Cited by: 2

AI Technical Summary

Problems solved by technology

[0006] In view of the defects of the existing technology, the purpose of the present invention is to solve the technical problems of existing privacy protection methods: small RU-space coverage, failure to meet users' multi-level needs, poor performance in both the risk R and the information loss U, low accuracy, a small range of recommended strategies, and slow algorithm convergence.



Examples


Embodiment Construction

[0034] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

[0035] Figure 1 is a schematic flowchart of the Skyline-based data generalization method provided by an embodiment of the present invention; the method includes steps S101 to S103.

[0036] Step S101: the data table is processed according to the privacy-preserving data publishing standard 10-anonymity to obtain the re-identification risk R of the strategy, which is recorded as the threshold T. The strategy space {S, (R, U)} is then determined from the value domain of the quasi-identifier attributes and the threshold T, where U is the information loss of the strategy, and the R value of the strategy...
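Step S101 can be illustrated with a minimal sketch. The passage above is truncated and does not say how R and U are computed, so everything below is an assumption made for illustration: each quasi-identifier is given a hypothetical number of generalization levels, a strategy is one tuple of levels, and `toy_risk` and `toy_loss` are stand-in functions, not the patent's measures. Only the idea of keeping strategies whose R does not exceed T comes from the text.

```python
from itertools import product

def enumerate_strategy_space(max_levels, risk, loss, threshold):
    """Enumerate every level combination and keep those with R <= threshold.

    max_levels: dict mapping attribute name -> highest generalization level
                (hypothetical hierarchy depths, not taken from the source)
    risk, loss: callables mapping a strategy tuple to R and U (assumed)
    Returns the attribute order and a list of (strategy, R, U) triples.
    """
    attrs = sorted(max_levels)
    space = []
    for levels in product(*(range(max_levels[a] + 1) for a in attrs)):
        r = risk(levels)
        if r <= threshold:  # drop strategies riskier than the threshold T
            space.append((levels, r, loss(levels)))
    return attrs, space

# Toy setup: three quasi-identifiers with made-up hierarchy depths.
max_levels = {"Age": 2, "Sex": 1, "Zipcode": 2}
total_levels = sum(max_levels.values())

def toy_risk(levels):
    # Stand-in: more generalization means lower re-identification risk.
    return 1.0 / (1 + sum(levels))

def toy_loss(levels):
    # Stand-in: more generalization means higher information loss.
    return sum(levels) / total_levels

attrs, space = enumerate_strategy_space(max_levels, toy_risk, toy_loss, threshold=0.25)
for strategy, r, u in space:
    print(dict(zip(attrs, strategy)), round(r, 3), round(u, 3))
```

Filtering inside the enumeration loop means strategies above the threshold never enter the space, which is one way the threshold could shorten strategy space generation.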


Abstract

The invention discloses a Skyline-based data generalization method. The method comprises: processing a data table according to the privacy-preserving data publishing standard 10-anonymity to obtain the re-identification risk R of a policy, recording R as a threshold T, and determining a policy space {S,(R,U)} from the value domain of the quasi-identifier attributes and the threshold T, where the R value of every policy in {S,(R,U)} is not greater than T; filtering the policy space {S,(R,U)} with an epsilon-approximate Skyline to obtain a candidate policy space {G,(R,U)}; and performing a Skyline calculation on {G,(R,U)} to obtain a recommended policy space {F,(R,U)}, which is the privacy policy space recommended for the data table. By enumerating the full policy space, the method improves the accuracy of privacy protection policy recommendation, covers a wide range of the RU space, and meets users' multi-level needs. Setting the threshold T filters out privacy protection policies that do not meet the requirement, which shortens policy space generation time, and filtering with the epsilon-approximate Skyline further reduces the size of the candidate policy space.
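The two filtering stages named in the abstract can be sketched as follows. The abstract does not define the epsilon-approximate Skyline, so this sketch assumes one common reading: a point is dropped if an already kept point is within a factor (1 + eps) of it in both R and U. The function names and the (R, U) values are hypothetical, and lower is taken to be better on both axes.

```python
def dominates(p, q):
    """p dominates q: no worse in R and U, strictly better in at least one."""
    return p[0] <= q[0] and p[1] <= q[1] and (p[0] < q[0] or p[1] < q[1])

def eps_covers(p, q, eps):
    """Assumed definition: p is within a (1 + eps) factor of q in both R and U."""
    return p[0] <= (1 + eps) * q[0] and p[1] <= (1 + eps) * q[1]

def eps_approx_filter(points, eps):
    """Single pass in input order: keep a point only if no kept point eps-covers it."""
    kept = []
    for q in points:
        if not any(eps_covers(p, q, eps) for p in kept):
            kept.append(q)
    return kept

def skyline(points):
    """Exact Skyline: the points that no other point dominates."""
    return [q for q in points if not any(dominates(p, q) for p in points)]

# Toy (R, U) pairs standing in for the policy space {S,(R,U)}.
space = [(0.20, 0.50), (0.10, 0.40), (0.105, 0.41), (0.05, 0.80), (0.30, 0.10)]

candidates = eps_approx_filter(space, eps=0.1)   # plays the role of {G,(R,U)}
recommended = skyline(candidates)                # plays the role of {F,(R,U)}
print(candidates)
print(recommended)
```

In this toy run the approximate filter drops (0.105, 0.41) because it is nearly identical to (0.10, 0.40), and the exact Skyline then removes (0.20, 0.50), which is dominated. The quadratic Skyline step runs on the smaller candidate set rather than on the full space.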

Description

Technical field

[0001] The invention belongs to the field of privacy-protected data publishing, and more specifically relates to a Skyline-based data generalization method.

Background

[0002] In the digital information age, it is becoming more and more important to exchange and publish data among various groups (such as governments, enterprises, and individuals). For example, hospitals in California generally need to submit some medical data for certain recovered patients. These data contain sensitive information, and publishing them directly reveals personal privacy. Consider the "linking attack" (connection attack) example described by L. Sweeney: by joining the patient information table with the voter information table on the attributes (Age, Sex, Zipcode), an attacker can determine that Ahmed has the flu, so patient privacy is compromised. Publishing data this way is not secure, which is why privacy-preserving data publishing was proposed. It requires maximizing da...
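The linking attack in paragraph [0002] amounts to a join on the quasi-identifier attributes. The sketch below shows the idea with made-up records: only the attribute names (Age, Sex, Zipcode) and the outcome that Ahmed is linked to the flu come from the text, while every value and the second voter are hypothetical.

```python
QUASI_IDENTIFIERS = ("Age", "Sex", "Zipcode")

# Published patient table: names removed, quasi-identifiers left intact (toy values).
patients = [
    {"Age": 25, "Sex": "M", "Zipcode": "90001", "Disease": "Flu"},
    {"Age": 31, "Sex": "F", "Zipcode": "90002", "Disease": "Asthma"},
]

# Public voter table: names together with the same quasi-identifiers (toy values).
voters = [
    {"Name": "Ahmed", "Age": 25, "Sex": "M", "Zipcode": "90001"},
    {"Name": "Brooke", "Age": 40, "Sex": "F", "Zipcode": "90003"},
]

def link(patients, voters, qi=QUASI_IDENTIFIERS):
    """Join the two tables on the quasi-identifiers and report unique matches."""
    index = {}
    for v in voters:
        index.setdefault(tuple(v[a] for a in qi), []).append(v["Name"])
    matches = []
    for p in patients:
        names = index.get(tuple(p[a] for a in qi), [])
        if len(names) == 1:  # exactly one voter fits, so the patient is re-identified
            matches.append((names[0], p["Disease"]))
    return matches

print(link(patients, voters))
```

Generalizing the quasi-identifiers, for example replacing an exact age with a range, makes several voters fit each published record, so a unique match like this one no longer occurs.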


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/62
CPC: G06F21/6245
Inventors: 丁晓锋, 金海, 王丽
Owner: HUAZHONG UNIV OF SCI & TECH