Linear SVM model training algorithm for privacy protection based on vector homomorphic encryption

A privacy protection technique based on homomorphic encryption, applied in the field of information technology security, that addresses problems such as user privacy leakage caused by training data not being kept private.

Active Publication Date: 2018-09-11
UNIV OF ELECTRONICS SCI & TECH OF CHINA
Cites: 3; Cited by: 16

AI Technical Summary

Problems solved by technology

[0009] The purpose of the present invention is to provide a privacy-preserving linear SVM model training algorithm based on vector homomorphic encryption. It addresses the problem that training data is not kept private when an SVM model is trained on a cloud platform, which leads to leakage of the user's privacy.

Examples


Embodiment 1

[0051] A privacy-preserving linear SVM model training algorithm based on vector homomorphic encryption. The training data set X is assumed to be a matrix composed of t z-dimensional vectors, and each piece of data in X has a label value representing its category. The label values are arranged in the same order as the data in X to form the data label vector y = [y_1, ..., y_t]. The linear SVM model training algorithm includes the following steps:

[0052] Step 1: The user encrypts the training data set with the vector-based homomorphic encryption scheme VHE and sends the encryption result to the server. This step includes the following sub-steps:

[0053] S1.1. Initialize the key matrix S = [S_11, ..., S_wv].

[0054] The user first determines the number of rows and columns of the key matrix S, and then randomly generates the values in the matrix.
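The key initialization in S1.1 can be sketched as follows. This is a minimal illustration, not the patent's construction: the function name `init_key_matrix`, the use of integer entries, and the value range `bound` are all assumptions, since the text only says that the dimensions are chosen first and the entries are then generated at random.

```python
import secrets

def init_key_matrix(w, v, bound=100):
    """Return a w-by-v matrix of random integers in [-bound, bound].

    Assumption: integer entries drawn uniformly from a bounded range.
    The source does not specify the distribution or the range.
    """
    if w <= 0 or v <= 0:
        raise ValueError("w and v must be positive")
    return [
        [secrets.randbelow(2 * bound + 1) - bound for _ in range(v)]
        for _ in range(w)
    ]

# Hypothetical dimensions, chosen only for illustration.
S = init_key_matrix(3, 4)
```

The `secrets` module is used here instead of `random` because the matrix plays the role of a secret key.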

[0055] S1.2. Encrypt the training data set X through the key matrix S to obtain the cip...

Embodiment 2

[0090] On the basis of Embodiment 1, a polynomial kernel function that contains the linear kernel function is first split into two parts: the linear kernel function and the nonlinear part. First, the linear kernel function is computed under ciphertext and the plaintext kernel function table is obtained. Then, in plaintext, 1 is added to each value in the kernel function table and the result is raised to a power, which completes the computation of the polynomial kernel function. Because the linear kernel part is computed under ciphertext, the training data set remains confidential, and the server still cannot obtain information about the training data set after adding 1 and taking the power. The SVM model with a polynomial kernel function can therefore also be trained under ciphertext.
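The plaintext half of this split can be sketched as below. It assumes the common polynomial kernel form (K_lin + 1)^d and that the linear kernel table has already been decrypted; the function name `polynomial_kernel_table` and the parameter `degree` are hypothetical, and the ciphertext computation itself is not modelled.

```python
def polynomial_kernel_table(linear_table, degree):
    """Apply the 'add 1, then raise to a power' step to every entry
    of a plaintext linear kernel table."""
    return [[(k + 1) ** degree for k in row] for row in linear_table]

# Linear kernel table for the toy data X = [[1, 2], [3, 4]].
linear_table = [[5, 11], [11, 25]]
poly_table = polynomial_kernel_table(linear_table, 2)
# poly_table == [[36, 144], [144, 676]]
```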

Embodiment 3

[0092] On the basis of Embodiment 1, processing a Gaussian kernel function that contains the linear kernel function requires the Euclidean distance between any two vectors in the training data, and VHE can compute the distance between two vectors under ciphertext. The computation of the Gaussian kernel function can therefore be split into two parts: the linear kernel function and the Gaussian function. First, the distance between any two vectors is computed under ciphertext, and then the Gaussian function is computed in plaintext. Since the distance between vectors is computed under ciphertext and the training data is kept secret, the server cannot deduce the specific values of the two vectors from their distance. Even though the Gaussian function is computed in plaintext, the server still cannot obtain information about the training data set.
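The plaintext Gaussian step can be sketched as follows, assuming the pairwise Euclidean distances are already available in plaintext. The parameterization exp(-d^2 / (2 * sigma^2)) is one common convention and is an assumption here, as are the names `gaussian_kernel_table` and `sigma`; the source does not state the exact formula.

```python
import math

def gaussian_kernel_table(distance_table, sigma):
    """Map a table of pairwise Euclidean distances to Gaussian kernel values."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return [
        [math.exp(-(d * d) / (2.0 * sigma ** 2)) for d in row]
        for row in distance_table
    ]

# Hypothetical distances between two training vectors.
distance_table = [[0.0, 2.0], [2.0, 0.0]]
gauss_table = gaussian_kernel_table(distance_table, sigma=1.0)
```

Each diagonal entry is 1 because a vector has distance 0 to itself, and the table is symmetric whenever the distance table is.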


Abstract

The invention discloses a privacy-preserving linear SVM model training algorithm based on vector homomorphic encryption and belongs to the field of information technology security. The method comprises the following steps: step 1, a user encrypts a training data set with the vector-based homomorphic encryption scheme VHE and sends the encryption result to a server; step 2, the server computes on the encryption result, obtains a ciphertext linear kernel function matrix, and returns it to the user; step 3, the user decrypts the ciphertext linear kernel function matrix to obtain a plaintext linear kernel function matrix and sends the plaintext linear kernel function matrix to the server; step 4, the server adopts a ciphertext SMO algorithm to train on the plaintext linear kernel function matrix and returns the training result to the user.
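As a plaintext reference for what the user holds after decryption in step 3, the linear kernel function matrix can be sketched as the table of pairwise inner products of the training vectors. This assumes the usual definition of the linear kernel, K[i][j] = x_i · x_j; the function name `linear_kernel_matrix` is hypothetical, and the sketch does not model VHE encryption or the ciphertext computation on the server.

```python
def linear_kernel_matrix(X):
    """Return the t-by-t matrix of inner products between the rows of X."""
    return [
        [sum(a * b for a, b in zip(xi, xj)) for xj in X]
        for xi in X
    ]

# Toy training set: t = 2 vectors of dimension z = 2.
X = [[1, 2], [3, 4]]
K = linear_kernel_matrix(X)
# K == [[5, 11], [11, 25]]
```

The matrix contains only inner products, not the training vectors themselves, which is the quantity the server receives in step 3.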

Description

Technical field

[0001] The invention belongs to the field of information technology security, and in particular relates to a privacy-preserving linear SVM model training algorithm based on vector homomorphic encryption.

Background

[0002] Support Vector Machine (SVM) is an important model for classification and regression analysis in machine learning theory. It establishes a mathematical quadratic programming model, solves the model using the existing training data to find the optimal decision boundary, and then uses the decision boundary to predict the category of new data. The core idea of SVM is as follows: given a set of training data and corresponding labels, treat the training data as points in space and find a separating surface in that space that divides the training data into two parts, so that data of the same category lie on the same side of the surface and different categories are separated by it. Then, in the case of ensuring that all trainin...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L9/00, G06K9/00, G06K9/62
CPC: H04L9/008, G06V10/96, G06F18/2411, G06F18/24
Inventor: 杨浩淼, 从鑫, 张可, 黄云帆, 何伟超, 张有, 李洪伟
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA