
Data storage method and device for distributed file system

A data storage technology for distributed file systems, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the problems of busy node IO, reduced node IO performance, and declining cluster availability.

Inactive Publication Date: 2018-01-09
ZHENGZHOU YUNHAI INFORMATION TECH CO LTD
5 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

The traditional distributed file system uses a pseudo-random value to generate a hash function, and the hash function generates a data allocation strategy based on the remaining space of the cluster nodes. Because small files occupy little space, the allocation strategy generated by the hash function stores them all on one particular node in the cluster. When the number of small files is large, frequent writing and reading of small files keeps that node's IO busy, so the node's IO performance drops and the overall availability of the cluster decreases.
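The concentration effect described above can be illustrated with a small simulation. This is only a sketch under a simplifying assumption: the hash-based strategy is replaced by a plain "pick the node with the most remaining space" rule, and the node names, capacities, and file size are hypothetical.

```python
from collections import Counter

GIB = 1024 ** 3


def pick_by_remaining_space(remaining):
    """Simplified stand-in for the hash-based strategy (an assumption):
    always choose the node with the most remaining space."""
    return max(remaining, key=remaining.get)


def simulate_small_file_writes(remaining, file_size, count):
    """Write `count` files of `file_size` bytes and count writes per node."""
    remaining = dict(remaining)  # work on a copy
    writes = Counter()
    for _ in range(count):
        node = pick_by_remaining_space(remaining)
        remaining[node] -= file_size
        writes[node] += 1
    return writes


# Hypothetical cluster: capacities differ by far more than the small files consume.
writes = simulate_small_file_writes(
    {"node-a": 500 * GIB, "node-b": 400 * GIB, "node-c": 300 * GIB},
    file_size=4 * 1024,
    count=10_000,
)
print(writes)
```

Because 10,000 files of 4 KiB barely change the remaining space, the same node keeps winning and absorbs every write, which is the IO hot spot the text describes.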



Examples


Embodiment 1

[0046] Figure 1 is a flowchart of a data storage method for a distributed file system provided by an embodiment of the present invention. Referring to Figure 1, the specific steps of the data storage method of the distributed file system include:

[0047] Step S10: Count the preset parameters of the data nodes in the distributed file system cluster.

[0048] The preset parameters include at least a data read and write frequency.

[0049] It can be understood that the present invention selects the data node for data storage on the basis of the data read/write frequency of the data nodes, and that this frequency reflects how often a data node is accessed for reading and writing data. Data nodes with a high read/write frequency will therefore have lower performance than data nodes with a low read/write frequency, because they read and write data more frequently. It should be noted that since the ...
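The flow of this embodiment can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `DataNode` class, the function names, and the frequency unit are hypothetical, and the mapping of the selection and storage steps to S11 and S12 is assumed from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    name: str
    rw_frequency: float  # reads + writes per unit time (hypothetical unit)
    files: dict = field(default_factory=dict)


def collect_parameters(nodes):
    """Step S10: count the preset parameters of each data node."""
    return {n.name: {"rw_frequency": n.rw_frequency} for n in nodes}


def select_target(nodes, params):
    """Assumed step S11: pick the node with the lowest read/write frequency."""
    return min(nodes, key=lambda n: params[n.name]["rw_frequency"])


def store(nodes, file_name, payload):
    """Assumed step S12: obtain the data and store it on the target node."""
    params = collect_parameters(nodes)
    target = select_target(nodes, params)
    target.files[file_name] = payload
    return target


cluster = [DataNode("dn1", 120.0), DataNode("dn2", 15.0), DataNode("dn3", 60.0)]
chosen = store(cluster, "small.txt", b"hello")
print(chosen.name)
```

In a real cluster the frequency would be measured over a sliding window and refreshed periodically; here it is a fixed number to keep the selection logic visible.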

Embodiment 2

[0056] Figure 2 is a flowchart of another data storage method for a distributed file system provided by an embodiment of the present invention. Steps S10-S12 in Figure 2 are the same as in Figure 1 and are not repeated here.

[0057] On the basis of the above embodiment, as a preferred implementation, the preset parameters also include the remaining capacity of the data node.

[0058] Correspondingly, the method further includes:

[0059] Setting a data volume threshold for the data;

[0060] Determining whether the total amount of the data meets the data volume threshold;

[0061] If so, the step of selecting the data node with the lowest read/write frequency as the target node based on the preset parameters is specifically:

[0062] Based on the preset parameters, selecting the data node with the lowest read/write frequency as the target node from among the data nodes with the largest remaining capacity.

[0063] It should be noted that the purpose of setting the data volume ...
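The threshold-based selection in paragraphs [0059]-[0062] can be sketched as below. Several details are assumptions, because the source text is truncated: "meets the threshold" is read as "total bytes >= threshold", "the data nodes with the largest remaining capacity" is read as the top `top_k` nodes by remaining capacity, and data below the threshold falls back to the plain lowest-frequency rule. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStats:
    name: str
    rw_frequency: float   # preset parameter from Embodiment 1
    remaining_bytes: int  # additional preset parameter in this embodiment


def select_target_node(stats, total_bytes, threshold_bytes, top_k=2):
    """Pick a target node for data of size `total_bytes`."""
    if total_bytes >= threshold_bytes:  # assumed meaning of "meets the threshold"
        # Restrict to the nodes with the largest remaining capacity (top_k is an assumption).
        candidates = sorted(stats, key=lambda s: s.remaining_bytes, reverse=True)[:top_k]
    else:
        # Assumed fallback: consider every node.
        candidates = list(stats)
    # In both cases the lowest read/write frequency wins.
    return min(candidates, key=lambda s: s.rw_frequency)


stats = [
    NodeStats("dn1", rw_frequency=50.0, remaining_bytes=900),
    NodeStats("dn2", rw_frequency=10.0, remaining_bytes=100),
    NodeStats("dn3", rw_frequency=30.0, remaining_bytes=800),
]
large = select_target_node(stats, total_bytes=500, threshold_bytes=100)
small = select_target_node(stats, total_bytes=10, threshold_bytes=100)
print(large.name, small.name)
```

With these sample numbers, the large write is limited to dn1 and dn3 and lands on the quieter of the two, while the small write goes to the quietest node in the whole cluster.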

Embodiment 3

[0080] An embodiment of the data storage method for a distributed file system has been described in detail above. The present invention also provides a data storage device for a distributed file system. Since the embodiments of the device correspond to the embodiments of the method, please refer to the description of the method embodiments for the device embodiments; details are not repeated here.

[0081] Figure 3 is a structural diagram of a data storage device for a distributed file system provided by an embodiment of the present invention. As shown in Figure 3, the data storage device for a distributed file system provided by an embodiment of the present invention includes:

[0082] The statistical module 10 is used for counting the preset parameters of the data nodes in the distributed file system cluster.

[0083] The node selection module 11 is configured to select the data node with the lowest readin...
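The module structure of the device can be sketched as cooperating classes. This is an illustrative layout only: the statistical module and node selection module follow the text above, while the storage module is assumed from the abstract (the module list is truncated in the source), and every class, method, and key name is hypothetical.

```python
class StatisticalModule:
    """Counts the preset parameters of the data nodes (module 10 in the text)."""

    def __init__(self, cluster):
        self.cluster = cluster  # {node name: {"rw_frequency": value}}

    def collect(self):
        return {name: dict(params) for name, params in self.cluster.items()}


class NodeSelectionModule:
    """Selects the node with the lowest read/write frequency (module 11 in the text)."""

    def select(self, params):
        return min(params, key=lambda name: params[name]["rw_frequency"])


class StorageModule:
    """Assumed module: obtains the data and stores it on the target node."""

    def __init__(self):
        self.stored = {}

    def store(self, node, file_name, payload):
        self.stored.setdefault(node, {})[file_name] = payload


class DataStorageDevice:
    """Wires the modules together in the order the method describes."""

    def __init__(self, cluster):
        self.statistics = StatisticalModule(cluster)
        self.selector = NodeSelectionModule()
        self.storage = StorageModule()

    def write(self, file_name, payload):
        target = self.selector.select(self.statistics.collect())
        self.storage.store(target, file_name, payload)
        return target


device = DataStorageDevice({"dn1": {"rw_frequency": 80}, "dn2": {"rw_frequency": 5}})
target = device.write("a.txt", b"data")
print(target)
```

Keeping statistics, selection, and storage in separate classes mirrors the one-module-per-step correspondence between the device and the method embodiments.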


Abstract

The invention discloses a data storage method and device for a distributed file system. The method includes the following steps: the preset parameters of the data nodes in a distributed file system cluster are counted; based on the preset parameters, the data node with the lowest read/write frequency is selected from the data nodes as the target node; data is obtained and stored in the target node. Data nodes with a low read/write frequency are used less often in the cluster, so they can be considered to have higher IO performance than other, frequently used data nodes. Selecting the data nodes with a low read/write frequency for data storage therefore balances the overall processing load of the cluster. In addition, when small files are stored, a large number of small files no longer keeps a single node's IO busy, so the IO performance of the nodes is guaranteed and the high availability of the cluster is ensured. The data storage device has the same advantages.

Description

Technical field

[0001] The invention relates to the field of distributed clusters, in particular to a data storage method and device for a distributed file system.

Background art

[0002] With the development of the Internet and the continuous growth in the number of Internet users, the data generated on the Internet is also expanding rapidly, and hundreds of millions of new data items are generated every day.

[0003] A single computer is limited by hardware such as memory and CPU and cannot meet the requirements of massive data storage and computing. Distributed file systems emerged to handle the processing of massive data. Distributed file system technology stores data on multiple physically dispersed storage nodes in the cluster, uniformly allocates and manages the resources of the nodes in the cluster, and provides users with an interface for accessing files. Due to the distributed characteristics of the distributed file system, the data can be split i...


Application Information

IPC(8): G06F17/30, G06F3/06
Inventor 吴蜀魏
Owner ZHENGZHOU YUNHAI INFORMATION TECH CO LTD