The invention discloses a duplicate data deletion (deduplication) method based on file content types. It belongs to the field of duplicate data deletion for computer data backup, is applicable to disk-based backup systems, and solves the problem that existing duplicate data deletion methods use a single blocking strategy and cannot be optimized according to file content type. The method performs a block boundary characteristic calculation step in advance and then performs the following steps in sequence: content type identification, file blocking, digital fingerprint calculation, duplicate data block judgment, and ending. The method classifies backup files by content type and computes the optimal block boundary characteristic value for each content type. When backup files are processed, a file content type identification step is added and the block boundary characteristic is selected according to the identification result, which improves the overall effectiveness of duplicate data deletion when complex backup files are processed.
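
The sequence of steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The abstract does not specify how content types are identified, which hash locates block boundaries, which fingerprint function is used, or how the block index is stored, so all of these are assumptions made for illustration: a NUL-byte heuristic stands in for content type identification, the table `BOUNDARY_BY_TYPE` holds hypothetical precomputed boundary parameters, a simple shift-and-add rolling hash finds boundaries, SHA-1 serves as the digital fingerprint, and an in-memory dictionary serves as the block index.

```python
import hashlib

# Hypothetical result of the advance "block boundary characteristic
# calculation" step: one parameter set per content type. The values are
# illustrative assumptions, not taken from the source.
BOUNDARY_BY_TYPE = {
    "text":   {"mask": 0x3FF, "magic": 0x78, "min": 64,  "max": 4096},
    "binary": {"mask": 0xFFF, "magic": 0x78, "min": 256, "max": 16384},
}


def identify_content_type(data: bytes) -> str:
    """Assumed heuristic: data with a NUL byte in its first 512 bytes is binary."""
    return "binary" if b"\x00" in data[:512] else "text"


def chunk(data: bytes, params: dict) -> list:
    """Split data into blocks at content-defined boundaries.

    A boundary is declared where the masked rolling hash equals the
    boundary characteristic value ("magic"), subject to minimum and
    maximum block lengths.
    """
    blocks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Shift-and-add rolling hash kept to 32 bits (an assumed choice).
        h = ((h << 1) + byte) & 0xFFFFFFFF
        length = i - start + 1
        at_boundary = (
            length >= params["min"]
            and (h & params["mask"]) == params["magic"]
        )
        if at_boundary or length >= params["max"]:
            blocks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        blocks.append(data[start:])  # trailing block without a boundary
    return blocks


def deduplicate(data: bytes, index: dict):
    """Run identification, blocking, fingerprinting and duplicate judgment.

    Returns the detected content type, the ordered list of fingerprints
    needed to rebuild the file, and the number of newly stored bytes.
    """
    ctype = identify_content_type(data)
    recipe = []
    new_bytes = 0
    for block in chunk(data, BOUNDARY_BY_TYPE[ctype]):
        fingerprint = hashlib.sha1(block).hexdigest()
        if fingerprint not in index:      # duplicate data block judgment
            index[fingerprint] = block    # store only unseen blocks
            new_bytes += len(block)
        recipe.append(fingerprint)
    return ctype, recipe, new_bytes
```

Calling `deduplicate` twice on the same data with a shared `index` stores blocks only on the first call, and the file can be rebuilt by concatenating `index[fp]` for each fingerprint in the returned recipe. Keeping the boundary parameters in a per-type table mirrors the idea of selecting the block boundary characteristic from the identification result; in practice the table entries would come from the advance calculation step.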