De-identification device and de-identification method

An anonymization technique that addresses information loss and the difficulty of observing and tracking the characteristics of a data set over time.

Inactive Publication Date: 2013-07-10
NEC CORP

AI Technical Summary

Problems solved by technology

resulting in loss of information
[0010] There is a further problem: when anonymization suited to the characteristics of a data set is performed every time the data set changes, the quasi-identifiers are generalized differently for each data set, and the groups to which individual data entries belong differ considerably. This makes it difficult to observe the characteristics of the data set in a time series and to track specific data entries over time.
When rule-based generalization is performed to preserve k=2 anonymity and l=2 diversity, as shown in Figure 33, the birthplace value of every data entry becomes "Earth", and the birthplace value is meaningless.
However, when the optimal generalization process is performed independently each time, the group to which the same entry belongs (the eighth data entry, for example) differs from snapshot to snapshot, and it is difficult to track the characteristics of the data set in a time series.
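The k-anonymity and l-diversity criteria named in the problems above can be checked mechanically. The following is a minimal sketch, not the patent's implementation: the field names `birthplace` and `disease` and the sample entries are hypothetical, and the checks use the common textbook definitions (every quasi-identifier group has at least k entries, and at least l distinct sensitive values).

```python
from collections import defaultdict


def group_by_quasi_identifier(entries, qi_keys):
    """Group entries that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for entry in entries:
        groups[tuple(entry[key] for key in qi_keys)].append(entry)
    return groups


def satisfies_k_anonymity(entries, qi_keys, k):
    """True if every quasi-identifier group contains at least k entries."""
    groups = group_by_quasi_identifier(entries, qi_keys)
    return all(len(group) >= k for group in groups.values())


def satisfies_l_diversity(entries, qi_keys, sensitive_key, l_min):
    """True if every group holds at least l_min distinct sensitive values."""
    groups = group_by_quasi_identifier(entries, qi_keys)
    return all(
        len({entry[sensitive_key] for entry in group}) >= l_min
        for group in groups.values()
    )


# Hypothetical entries, all generalized to the same birthplace value.
sample_entries = [
    {"birthplace": "Earth", "disease": "flu"},
    {"birthplace": "Earth", "disease": "cold"},
    {"birthplace": "Earth", "disease": "flu"},
]
```

With every birthplace generalized to "Earth", both checks pass for k=2 and l=2, which illustrates why such a data set is anonymous yet uninformative about birthplace.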



Examples


Example 1

[0070] Suppose that, after the data set shown in Figure 26 has been obtained, the data entry shown in Figure 27 is input to the anonymization processing unit 20 as an additional entry of the data set. The birthplace of the data entry shown in Figure 27 is "London", which cannot be generalized by the generalization rules shown in Figure 24. Therefore, once the data entry is added, the anonymity criterion is no longer met. The anonymization processing unit 20 accordingly outputs to the data set receiving unit 22 a data set formed from the generalized data set shown in Figure 26 and the data entry shown in Figure 27.
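The situation in paragraph [0070] can be illustrated with a small sketch. The contents of Figures 24, 26 and 27 are not reproduced in this text, so the rule table and the birthplace values other than "London" are hypothetical, as are the function names; the sketch only shows how a value with no matching rule ends up alone in its group.

```python
from collections import Counter

# Hypothetical rule table standing in for the rules of Figure 24.
HYPOTHETICAL_RULES = {"Tokyo": "Japan", "Osaka": "Japan"}


def generalize(value, rules):
    """Apply the rule for a value; a value with no rule is left unchanged."""
    return rules.get(value, value)


def undersized_groups(values, rules, k):
    """Return generalized values whose group has fewer than k entries."""
    counts = Counter(generalize(value, rules) for value in values)
    return sorted(value for value, count in counts.items() if count < k)
```

Calling `undersized_groups(["Tokyo", "Osaka", "London"], HYPOTHETICAL_RULES, 2)` returns `["London"]`: the added entry has no applicable rule, so it forms a group of one and the data set no longer meets k=2 anonymity.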

[0071] The data set receiving unit 22 receives the data set from the anonymization processing unit 20 and outputs the data set to the processed data item selection unit 24.

[0072] The processed data item selection unit 24 selects, from among the plurality of data items included in the data set, a data item that makes the data set fail to meet the criteri...

Example 2

[0079] In this example, as in the example above, assume that, after the data set shown in Figure 26 has been obtained, the data entry shown in Figure 27 is input to the anonymization processing unit 20 as an additional entry of the data set.

[0080] The data set receiving unit 22 receives the data set from the anonymization processing unit 20 and outputs the data set to the processed data item selection unit 24.

[0081] The processed data item selection unit 24 selects, from the plurality of data items included in the data set, a data item that, when generalized based on the generalization rule, makes the data set fail to satisfy a predetermined anonymity criterion, and at least one data item for which the data set still satisfies the anonymity criterion even if that data item is excluded from the data set. An instance of at least one data item for which the data set satisfies the predetermined anonymity criterion (even if the data item is excluded from the data set) is the Figure 6 data en...

Example 3

[0087] In this example, data entries are further added to the data set processed in Data Processing Example 2. Figure 8 shows the data set processed in Data Processing Example 2. In this example, the generalization rule "Europe" shown in Figure 28 is also applied. In particular, as shown in Figure 8, the birthplace value of the eleventh entry before the change is "Europe", obtained from "London" according to the generalization rule shown in Figure 28, where "London" is the birthplace value of the data entry shown in Figure 27.

[0088] Suppose that, after the data set shown in Figure 8 has been obtained, the data entry shown in Figure 9 is input to the anonymization processing unit 20 as an additional entry of the data set. The birthplace value of the data entry shown in Figure 9 is "Paris". When this data entry is generalized by the anonymization processing unit 20, the birthplace value is generalized to "Europe" and the anonymity criterion is not f...


Abstract

The present invention allows suitable generalization even when a dataset may be provided repeatedly and the attribute information of a data entry added afterward may deviate significantly from the range of values taken by known data entries. The dataset has a plurality of data entries, each including at least one attribute datum constituting a quasi-identifier, which is information that can identify a person, and at least one attribute datum other than the quasi-identifier. For each data entry, at least one attribute data value constituting the quasi-identifier is generalized on the basis of a predetermined generalization rule. Then, from among the data entries included in the dataset, the following are selected: a data entry which, upon being generalized on the basis of the generalization rule, causes the dataset not to satisfy a predetermined standard of anonymity; and at least one data entry which, as a result of sharing the attribute data value that is the object of generalization with that data entry, causes the dataset to satisfy the predetermined standard of anonymity. For the selected data entries, the attribute data value that is the object of generalization is modified to a predetermined shared value regardless of the predetermined generalization rule.
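The selection and shared-value steps described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed method: it handles a single quasi-identifier attribute, uses k-anonymity as the standard of anonymity, borrows partner entries only from groups that stay at size k or more without them, and uses a hypothetical shared value `"*"`. The function name and the rule table in the usage note are likewise hypothetical.

```python
from collections import Counter


def generalize_with_shared_value(values, rules, k, shared_value="*"):
    """Generalize by rule, then merge undersized groups into a shared value.

    values: raw quasi-identifier values, one per data entry.
    rules: mapping from raw value to generalized value (hypothetical).
    Returns the list of published values, in the same order as the input.
    """
    published = [rules.get(value, value) for value in values]
    counts = Counter(published)

    # Entries whose rule-based generalization leaves them in a group below k.
    selected = [i for i, value in enumerate(published) if counts[value] < k]
    if not selected:
        return published

    # Borrow partner entries from groups that remain at size >= k without them.
    for i, value in enumerate(published):
        if len(selected) >= k:
            break
        if counts[value] > k:
            selected.append(i)
            counts[value] -= 1

    if len(selected) < k:
        raise ValueError("not enough entries to form a shared group of size k")

    # Override the rule-based value with the shared value for selected entries.
    for i in selected:
        published[i] = shared_value
    return published
```

For example, with the hypothetical rules `{"Tokyo": "Japan", "Osaka": "Japan"}` and k=2, the input `["Tokyo", "Osaka", "Tokyo", "London"]` yields `["*", "Japan", "Japan", "*"]`: the "London" entry and one borrowed entry share the value `"*"`, while the remaining "Japan" group still has two members.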

Description

Technical field

[0001] The invention relates to an anonymization device and an anonymization method.

Background art

[0002] In recent years, technologies for privacy-preserving data disclosure, which allow secondary use of personal information (microdata) owned by companies while protecting user privacy, have attracted attention. Non-Patent Document 1 proposes a technique for privacy-preserving data disclosure. Among various user information (microdata), a collection of attribute information that can identify an individual when combined with other background knowledge is called a quasi-identifier. Attribute information that users do not want to be disclosed is called sensitive data. In anonymization, one of the techniques used for privacy-preserving data disclosure, not only are explicit user identifiers removed, but the attribute information forming quasi-identifiers is also made ambiguous in order to avoid identifying individuals from combinations of these kinds of attri...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/62
CPC: G06F21/6254
Inventor: 伊东直子丰田由起
Owner: NEC CORP