Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes

A quasi-identifier and privacy protection technology, applied in the fields of digital data protection, computing, and computer security devices. It addresses the information loss and poor availability of published data tables, and achieves the effect of reducing information loss.

Active Publication Date: 2016-10-12
XUZHOU MEDICAL UNIV

AI Technical Summary

Problems solved by technology

At present, most data anonymization methods share two defects: 1) they are better suited to categorical data (nominal and ordinal), and generalizing numerical data often loses much of its numerical semantics; 2) when the number of quasi-identifier attributes increases sharply, the so-called "curse of dimensionality" (dimensionality trap) appears.
The dimensionality trap leads to a large loss of information, which makes the published data tables less usable.

Examples

Embodiment Construction

[0049] When realizing k-anonymity, Table 1 is taken as an example to define the NQLG algorithm. Suppose the data table held by the data publisher is $T(A_1, A_2, \ldots, A_n)$. Each tuple in the table holds the information of one specific entity, such as Age, Workclass, Race, Sex, Hours-per-week, and Salary; see Table 1.

[0050] Table 1

[0051]

[0052] Definition 1 Quasi-identifier: Suppose a data set $U$, a specific data table $T(A_1, A_2, \ldots, A_n)$, $f_c: U \to T$ and $f_g: T \to U'$, where $U \subseteq U'$. A quasi-identifier of $T$, written $QI_T$, is a set of attributes $\{A_i, \ldots, A_j\} \subseteq \{A_1, \ldots, A_n\}$ for which there exists $p_i \in U$ such that $f_g(f_c(p_i)[QI_T]) = p_i$ holds. The attributes in Table 1 can all be used as quasi-identifiers, and the selection of quasi-identifiers depends on actual needs.
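To make the role of the quasi-identifier concrete, the following minimal Python sketch checks whether a table is k-anonymous with respect to a chosen set of quasi-identifier attributes. The function name `is_k_anonymous` and the four sample rows are illustrative assumptions; the rows are not the contents of Table 1, which is not reproduced here.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `rows` occurs in at least k rows."""
    counts = Counter(
        tuple(row[attr] for attr in quasi_identifiers) for row in rows
    )
    return all(count >= k for count in counts.values())


# Hypothetical, already generalized rows (not taken from Table 1).
rows = [
    {"Age": "30-39", "Sex": "Male", "Salary": "<=50K"},
    {"Age": "30-39", "Sex": "Male", "Salary": ">50K"},
    {"Age": "40-49", "Sex": "Female", "Salary": "<=50K"},
    {"Age": "40-49", "Sex": "Female", "Salary": "<=50K"},
]

two_attrs_ok = is_k_anonymous(rows, ["Age", "Sex"], 2)
three_attrs_ok = is_k_anonymous(rows, ["Age", "Sex", "Salary"], 2)
```

With these sample rows, choosing Age and Sex as the quasi-identifier yields two groups of two rows each, while adding Salary splits the first group, so the same table stops being 2-anonymous. This illustrates why the choice of quasi-identifier attributes matters.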

[0053] Definition 2 Generalization rule: Given an attribute $Q$ and $f: Q \to Q'$, where $f$ is the set of generalization functions acting on the attribute $Q$, the chain of successive generalizations represents the process of generalizing the quasi-identifier in sequence, and $\{f_1, f_2, \ldots, f_m\}$ repr...
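As an illustration of applying generalization functions in sequence, the sketch below builds a two-step hierarchy for a numeric Age attribute. The specific functions (`to_decade`, `suppress`) and the helper `generalize` are assumptions chosen for the example; the source does not specify which generalization functions are used.

```python
def to_decade(age):
    """f_1 (assumed): map an exact age to its 10-year interval."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def suppress(_value):
    """f_2 (assumed): replace any value with the fully suppressed '*'."""
    return "*"


# Hypothetical generalization sequence {f_1, f_2} for the Age attribute.
AGE_HIERARCHY = [to_decade, suppress]


def generalize(value, functions, level):
    """Apply the first `level` generalization functions in order.
    Level 0 leaves the value unchanged."""
    for f in functions[:level]:
        value = f(value)
    return value


levels = [generalize(37, AGE_HIERARCHY, lvl) for lvl in range(3)]
```

Here `levels` holds the original value, its interval, and the suppressed value, one entry per generalization level. Each step loses more detail, which is the information loss the algorithm tries to limit.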

Abstract

The invention discloses a secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes, pertaining to the technical field of privacy protection. The algorithm comprises the following steps: forming single-attribute hierarchical grids (generalization lattices) through an Incognito function and determining whether each generalization satisfies k-anonymity; deleting the nodes that do not satisfy k-anonymity; iterating over the nodes that do satisfy k-anonymity to form a candidate node set and determining again whether the candidate nodes satisfy k-anonymity; deleting the nodes that do not satisfy k-anonymity; and repeating the above steps until all categorical attributes have been iterated, then outputting the root nodes that satisfy k-anonymity. The data table T is generalized through the root nodes. The MDAV algorithm is then used for secondary generalization of the generalized table T'. The input equivalence classes are partitioned so that the number of tuples in each class falls in the range k to 2k-1. When partitioning is finished, the information loss is calculated and compared in order to obtain the data table with the smallest loss.
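The secondary step relies on MDAV-style partitioning, which yields groups of between k and 2k-1 tuples. The following is a minimal sketch of the textbook MDAV (Maximum Distance to Average Vector) procedure on numeric tuples with Euclidean distance. It is an assumption about how the partitioning could look, not the patent's exact procedure; the helper names and sample records are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of numeric tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def farthest(points, ref):
    """Return the point with the largest Euclidean distance to `ref`."""
    return max(points, key=lambda p: math.dist(p, ref))


def take_group(points, seed, k):
    """Remove and return the k points nearest to `seed` (seed included
    when it is still in `points`, since its distance is zero)."""
    points.sort(key=lambda p: math.dist(p, seed))
    group = points[:k]
    del points[:k]
    return group


def mdav(records, k):
    """Partition `records` into groups of size k..2k-1 (textbook MDAV)."""
    remaining = list(records)
    if len(remaining) < k:
        raise ValueError("need at least k records")
    groups = []
    # While enough records remain, build two groups around two extreme points.
    while len(remaining) >= 3 * k:
        r = farthest(remaining, centroid(remaining))
        s = farthest(remaining, r)
        groups.append(take_group(remaining, r, k))
        groups.append(take_group(remaining, s, k))
    # Between 2k and 3k-1 records left: one more group of k.
    if len(remaining) >= 2 * k:
        r = farthest(remaining, centroid(remaining))
        groups.append(take_group(remaining, r, k))
    # The final k..2k-1 records form the last group.
    groups.append(remaining)
    return groups


# Hypothetical one-dimensional records (for example, ages).
records = [(1,), (2,), (3,), (10,), (11,), (12,), (20,)]
groups = mdav(records, 3)
```

Every group returned has at least k members, so replacing each numeric value by a group-level summary (for example, the group mean or range) keeps the table k-anonymous, and capping the group size at 2k-1 limits how coarse that summary becomes.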

Description

Technical field

[0001] The invention relates to the technical field of data privacy protection, and in particular to a secondary k-anonymity privacy protection algorithm that differentiates quasi-identifier attributes.

Background

[0002] With the rapid development of information technology, more and more data are shared and used. How to protect the private information in published data from being maliciously obtained by attackers, while still enabling data recipients to make full use of the data for effective exploration and scientific research, has increasingly become an important information security issue. k-anonymity is an effective method of protecting private data and has received extensive attention in recent years. The k-anonymity technique was proposed by Samarati and Sweeney in 1998. It requires that the published data contain a certain number (k) of indistinguishable individuals, so that the attacker cannot identify the individua...


Application Information

IPC(8): G06F17/30, G06F21/62
CPC: G06F16/2246, G06F21/6254
Inventor: 吴响, 王换换, 臧昊, 俞啸
Owner: XUZHOU MEDICAL UNIV