The invention discloses a privacy protection method for releasing data with multiple sensitive attributes, and addresses the poor quality of quasi-identifier data in such releases. The basic idea of the invention is as follows. First, the data set is clustered: records whose quasi-identifiers are similar are gathered into one cluster, so that a plurality of data clusters is generated. Second, a multi-dimensional bucket structure is constructed on the basis of the sensitive attributes, and
data records are mapped into the multi-dimensional bucket structure according to the values of their sensitive attributes. Grouping is then carried out on the basis of the multi-dimensional buckets: main sensitive attributes are selected, the dimension capacity of the main sensitive attributes is calculated, the L (L ≥ 2) main sensitive attributes with the largest dimension capacity are selected, and one data record is selected from each of them. The selected data records are checked against multi-sensitive-attribute L-diversity; if they do not satisfy it, the buckets are traversed one by one in descending order of capacity until the selected data records satisfy multi-sensitive-attribute L-diversity. This process is repeated until the data remaining in the buckets can no longer satisfy multi-sensitive-attribute L-diversity. Finally, all groups are subjected to anonymization processing.
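
The bucket construction and the largest-bucket-first traversal described above can be illustrated with a minimal Python sketch. It is not the claimed method: it omits the clustering step, the main-sensitive-attribute capacity calculation, and the final anonymization, and it assumes that a group satisfies multi-sensitive-attribute L-diversity when it holds at least L distinct values for every sensitive attribute. The function names `build_buckets`, `is_l_diverse`, and `group_records` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_buckets(records, sensitive):
    """Map each record to the bucket keyed by its sensitive-attribute values."""
    buckets = defaultdict(list)
    for rec in records:
        key = tuple(rec[attr] for attr in sensitive)
        buckets[key].append(rec)
    return buckets


def is_l_diverse(group, sensitive, l):
    """Assumed criterion: at least l distinct values per sensitive attribute."""
    return all(len({rec[attr] for rec in group}) >= l for attr in sensitive)


def group_records(records, sensitive, l):
    """Greedy grouping sketch: repeatedly draw one record from each of l
    buckets, visiting buckets from largest to smallest capacity."""
    if l < 2:
        raise ValueError("l must be at least 2")
    buckets = build_buckets(records, sensitive)
    groups = []
    while True:
        # Non-empty buckets, largest capacity first.
        ordered = sorted(
            (key for key in buckets if buckets[key]),
            key=lambda key: len(buckets[key]),
            reverse=True,
        )
        chosen = []
        for key in ordered:
            # Accept a bucket only if it differs from every bucket already
            # chosen in every sensitive dimension.
            if all(key[i] != c[i] for c in chosen for i in range(len(key))):
                chosen.append(key)
            if len(chosen) == l:
                break
        if len(chosen) < l:
            # The remaining records can no longer form a diverse group.
            break
        groups.append([buckets[key].pop() for key in chosen])
    leftovers = [rec for bucket in buckets.values() for rec in bucket]
    return groups, leftovers
```

Calling `group_records(records, ["disease", "salary"], 2)` on a list of dictionaries with hypothetical `disease` and `salary` fields returns the groups formed and the records left over. Each returned group would then be anonymized, and how leftover records are handled is outside the scope of this sketch.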