Protection method and system for HDFS access mode

A technology concerning access patterns and data nodes, applied in fields such as digital data protection and computing. It addresses the reduction in attack cost caused by leaked access information: under the scheme an attacker cannot obtain private information such as where user data is stored, how often it is accessed, or in what order, which enhances security.

Active Publication Date: 2019-09-17
PEKING UNIV

AI Technical Summary

Problems solved by technology

ORAM (Oblivious RAM) hides the data access pattern by obfuscating every access, so that an attacker cannot tell whether an access is real or fake. The attacker therefore cannot obtain private information such as the storage location, access frequency, or access sequence of user data, and cannot go on to infer the content or importance of that data.
[0004] The shortcomings and limitations of existing methods are as follows. HDFS, a distributed storage system widely used in industry and academia, cannot resist all types of attack through data encryption alone: an attacker can still infer private information from user access patterns. No existing research provides a protection scheme for HDFS user access patterns, which creates a serious security risk for the many companies and individual users who rely on HDFS for distributed storage.
A cluster may contain a very large number of nodes. An attacker who does not know the user's access frequency would have to attack tens of thousands of nodes, which is clearly unrealistic. Leaked access frequency lets the attacker narrow the target to a few nodes, so it greatly reduces the attack cost.
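
The frequency leak described above can be illustrated with a minimal sketch. It assumes a static placement in which each file stays on one data node; the file names, node names, and request sequence are all hypothetical and do not come from the patent. The observer never sees file names or contents, only which node each request touches.

```python
from collections import Counter

# Hypothetical static placement: each file stays on one data node.
placement = {"report.csv": "dn-03", "logs.txt": "dn-17", "keys.db": "dn-42"}

# The user's request sequence; the attacker never sees these file names.
requests = ["keys.db", "logs.txt", "keys.db", "keys.db", "report.csv", "keys.db"]

# What a network observer does see: the data node touched by each request.
observed_nodes = [placement[name] for name in requests]

# Counting the observations exposes the most frequently accessed node.
frequency = Counter(observed_nodes)
hot_node, hits = frequency.most_common(1)[0]
print(hot_node, hits)
```

With a fixed placement the count alone singles out one node, without the attacker decrypting anything. Moving a file to a new node after every access removes this stable link between a node and a file.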


Examples

Embodiment Construction

[0024] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below through specific embodiments and accompanying drawings.

[0025] The implementation of the present invention is based on the Hadoop 2.8.4 source code, which is modified according to the functional modules and the file read/write process. The main modified Java files are CommandWithDestination.java, IOUtils.java and PathData.java; the newly added Java files are FileBuffer.java, TreeNode.java and TreeORAM.java. The total code size is 4686 lines. The specific operation process is described below.

[0026] 1. Initialization

[0027] After the HDFS cluster is started, an initialization operation is required to transfer a number of dummy files to the data nodes. A dummy file is an invalid file, and outside the client a dummy file cannot be distinguished from a real file. The role of initi...
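
The initialization step can be sketched as follows. The source text is truncated and does not say how dummy files are produced, so everything here is an assumption made for illustration: data nodes are modeled as plain dictionaries, dummy files are random bytes of one fixed size, and the names `make_dummy_file`, `initialize`, `BLOCK_SIZE`, and `BLOCKS_PER_FILE` are hypothetical. No real HDFS API is called.

```python
import secrets

BLOCK_SIZE = 16        # assumed toy block size in bytes
BLOCKS_PER_FILE = 4    # assumed fixed file length in blocks


def make_dummy_file():
    """Return (name, payload) for a dummy file filled with random bytes."""
    name = secrets.token_hex(8)
    payload = secrets.token_bytes(BLOCK_SIZE * BLOCKS_PER_FILE)
    return name, payload


def initialize(data_nodes, dummies_per_node):
    """Place dummy files on every data node and return their names.

    The returned set stays on the client, which is the only party that
    can tell dummy files from real ones.
    """
    dummy_names = set()
    for files in data_nodes.values():
        for _ in range(dummies_per_node):
            name, payload = make_dummy_file()
            files[name] = payload
            dummy_names.add(name)
    return dummy_names


# Toy cluster: node name -> {file name: payload}.
cluster = {"dn-01": {}, "dn-02": {}, "dn-03": {}}
client_dummy_names = initialize(cluster, dummies_per_node=2)
```

Random payloads of a uniform size are one plausible way to make dummy files look like encrypted real files to anyone outside the client; the patent may use a different construction.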

Abstract

The invention relates to a method and system for protecting HDFS access patterns. The method comprises the following steps: decomposing every read operation and every write operation on a data node of an HDFS cluster into two atomic operations, a read followed by a write, so as to hide the operation type; adding obfuscation data blocks to a file before writing it to a data node, so as to hide the number of blocks in the file; deleting a file from its data node after each read, and writing a randomly selected file from the client's file buffer back to a data node, so as to hide the data node on which a file is stored; and hiding the access frequency and access sequence of a file through the continual change of its storage location. The invention provides the design and implementation of an HDFS access pattern protection scheme based on ORAM technology, fills the gap in HDFS access pattern protection, and keeps the resulting performance overhead within an acceptable range while enhancing HDFS security.
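
The steps in the abstract can be sketched as a toy client. This is a simplified illustration under stated assumptions, not the patented implementation: data nodes are plain dictionaries, encryption and dummy files are omitted, and the class name `OramClient`, the constants, and the rule of fetching a random stored file when the target is already in the buffer are all hypothetical.

```python
import random

BLOCKS_PER_FILE = 4   # assumed fixed block count after padding
BLOCK_SIZE = 8        # assumed toy block size in bytes


class OramClient:
    """Toy client; data nodes are dicts, so no real HDFS calls are made."""

    def __init__(self, data_nodes, seed=None):
        self.nodes = data_nodes   # node name -> {file name: list of blocks}
        self.buffer = {}          # client file buffer: name -> real blocks
        self.position = {}        # client-side map: file name -> node name
        self.real_len = {}        # client-side map: file name -> real block count
        self.rng = random.Random(seed)

    def _pad(self, blocks):
        # Append random filler blocks so every stored file has the same length.
        missing = BLOCKS_PER_FILE - len(blocks)
        filler = [bytes(self.rng.getrandbits(8) for _ in range(BLOCK_SIZE))
                  for _ in range(missing)]
        return list(blocks) + filler

    def access(self, name, new_blocks=None):
        """Serve a read (new_blocks is None) or a write as read-then-write."""
        # Atomic operation 1, read: fetch a file and delete it from its node.
        # Assumption: if the target is not on any node, a random stored file
        # is fetched instead so that a read is still observed.
        fetch = None
        if name in self.position:
            fetch = name
        elif self.position:
            fetch = self.rng.choice(sorted(self.position))
        if fetch is not None:
            node = self.position.pop(fetch)
            stored = self.nodes[node].pop(fetch)
            self.buffer[fetch] = stored[:self.real_len[fetch]]  # drop padding
        # A write only changes the copy held in the client buffer.
        if new_blocks is not None:
            self.buffer[name] = list(new_blocks)
        result = self.buffer[name]
        # Atomic operation 2, write: a random buffered file goes to a random node.
        evicted = self.rng.choice(sorted(self.buffer))
        target = self.rng.choice(sorted(self.nodes))
        blocks = self.buffer.pop(evicted)
        self.real_len[evicted] = len(blocks)
        self.nodes[target][evicted] = self._pad(blocks)
        self.position[evicted] = target
        return result


nodes = {"dn-01": {}, "dn-02": {}, "dn-03": {}}
client = OramClient(nodes, seed=7)
client.access("a", [b"A1", b"A2"])   # write
client.access("b", [b"B1"])          # write
read_a = client.access("a")          # read
read_b = client.access("b")          # read
```

In this sketch every call produces at most one fetch and exactly one write-back, whether the caller reads or writes, and the file written back is not necessarily the one requested. The sketch assumes files never exceed `BLOCKS_PER_FILE` blocks.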

Description

Technical field

[0001] The invention relates to data protection for the Hadoop Distributed File System (HDFS), and in particular to a method and system for protecting HDFS access patterns.

Background

[0002] HDFS (Hadoop Distributed File System) is the core distributed file system of Hadoop. Like traditional distributed systems, HDFS is often used to store large files: when a data set exceeds the storage capacity of a single computer, it is split and stored across a large number of inexpensive servers. HDFS is currently widely used in industry and academia. In recent years, requirements for data privacy protection have risen steadily, which poses a greater challenge to the data privacy protection capabilities of HDFS. In the current version of HDFS, the security design focuses on data availability and data integrity, and there are relatively few strategies for protecting data confidentiality. For example, a...

Application Information

IPC(8): G06F21/62
CPC: G06F21/6227, G06F21/6245
Inventor: 沈晴霓, 秦嘉, 吴鹏飞, 康雨城, 刘忠开
Owner: PEKING UNIV