The invention discloses a method for optimizing small-file storage on HDFS, which improves the efficiency with which HDFS reads small files and thereby the overall performance of the system. The method comprises the following steps: first, the
small files are merged and preprocessed for storage, the preprocessing consisting of file filtering, small-file merging, metadata generation, and object ID generation; second, after the merged files are stored in HDFS, the mapping between each small file and its merged file in HDFS is recorded in the small file's metadata, the directory structure of the files is encoded in the file name, and the metadata are stored in a distributed cluster based on the Chord protocol; third, the
directory structure of the files is optimized, and the generated metadata key is decomposed into a Directory ID and a Small File ID. The Directory ID serves as the key by which the metadata are routed to a node in the metadata cluster, so that files under the same directory are stored on the same node. The Small File ID is generated at the metadata node, so that each metadata record has a unique identifier across the whole system.
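
The key decomposition described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The following are assumptions made only for illustration: the 32-bit identifier ring, the use of SHA-1 to derive Directory IDs and node IDs, the per-node counter that issues Small File IDs, and all class, field, and function names (`MetadataCluster`, `MetadataNode`, `SmallFileMeta`, `ring_hash`). Chord routing is reduced to a successor lookup over a sorted list of node IDs, with no finger tables or node joins.

```python
import bisect
import hashlib
import itertools
import posixpath
from dataclasses import dataclass

RING_BITS = 32  # assumed size of the identifier ring; not specified in the source


def ring_hash(text: str) -> int:
    """Map a string onto the identifier ring (SHA-1 reduced to RING_BITS bits)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


@dataclass
class SmallFileMeta:
    directory_id: int   # routes the record to a metadata node
    small_file_id: int  # issued by the node that stores the record
    name: str           # small file name within its directory
    merged_file: str    # path of the merged file in HDFS
    offset: int         # byte offset of the small file inside the merged file
    length: int         # size of the small file in bytes


class MetadataNode:
    """One node of the metadata cluster (hypothetical in-memory stand-in)."""

    def __init__(self, name: str):
        self.name = name
        self.node_id = ring_hash(name)
        self._next_id = itertools.count(1)  # local Small File ID generator
        self.records = {}

    def put(self, directory_id, name, merged_file, offset, length):
        small_file_id = next(self._next_id)
        meta = SmallFileMeta(directory_id, small_file_id, name,
                             merged_file, offset, length)
        # The same Directory ID always routes to the same node, so the pair
        # (directory_id, small_file_id) is unique across the whole cluster.
        self.records[(directory_id, small_file_id)] = meta
        return meta


class MetadataCluster:
    """Successor lookup on a ring, standing in for full Chord routing."""

    def __init__(self, node_names):
        self.nodes = sorted((MetadataNode(n) for n in node_names),
                            key=lambda node: node.node_id)
        self._ids = [node.node_id for node in self.nodes]

    def node_for(self, directory_id: int) -> MetadataNode:
        # First node whose ID is >= the key, wrapping around to the start.
        index = bisect.bisect_left(self._ids, directory_id) % len(self.nodes)
        return self.nodes[index]

    def register(self, path, merged_file, offset, length) -> SmallFileMeta:
        directory, name = posixpath.split(path)
        directory_id = ring_hash(directory)
        node = self.node_for(directory_id)
        return node.put(directory_id, name, merged_file, offset, length)
```

In this sketch, registering `/logs/2024/a.log` and `/logs/2024/b.log` produces the same Directory ID for both files, so both records land on the same metadata node and receive distinct Small File IDs from that node's counter.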