The invention provides a method and a system for automatically extracting virus characteristics from samples of a virus family. The method and the system modify the longest common subsequence algorithm as follows. A sequence A and a sequence B are built from samples of the family. Given a preset feature code length, the hash value of every subsequence of that length in sequence A and in sequence B is calculated, and the hash values from the two sequences are matched using a red-black tree. If two hash values are the same, the corresponding subsequence is a common subsequence of sequence A and sequence B, and such common subsequences are feature codes of the family samples. Each remaining sample is then taken in turn as sequence B and searched in the red-black tree, so that the feature codes of all family samples are obtained and combined into a feature set of the family samples. A weighting model for evaluating feature code quality is established, the quality of each feature code is judged with it, and the feature codes of the family samples are determined. The method reduces the time complexity of the algorithm and improves the efficiency and accuracy of feature code extraction.
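
The matching step described above can be illustrated with a minimal sketch. It is not the patented implementation and rests on several assumptions: subsequences are treated as contiguous byte windows, SHA-256 stands in for the unspecified hash function, a Python dict stands in for the red-black tree (it offers the same lookup by hash value but is a hash table, not a balanced tree), and the per-sample results are combined by union. The function names `window_hashes`, `common_windows`, and `family_candidates` are hypothetical. The quality weighting model is not sketched because the abstract gives no details of it.

```python
import hashlib
from typing import Dict, List, Set


def window_hashes(seq: bytes, k: int) -> Dict[bytes, List[int]]:
    """Map the hash of every length-k window of seq to its start offsets."""
    table: Dict[bytes, List[int]] = {}
    for i in range(len(seq) - k + 1):
        digest = hashlib.sha256(seq[i:i + k]).digest()
        table.setdefault(digest, []).append(i)
    return table


def common_windows(seq_a: bytes, seq_b: bytes, k: int) -> Set[bytes]:
    """Return the length-k windows that occur in both seq_a and seq_b."""
    # Assumption: a dict replaces the red-black tree keyed by hash value.
    index_a = window_hashes(seq_a, k)
    found: Set[bytes] = set()
    for i in range(len(seq_b) - k + 1):
        window = seq_b[i:i + k]
        digest = hashlib.sha256(window).digest()
        for offset in index_a.get(digest, []):
            # Compare the bytes as well, to rule out a hash collision.
            if seq_a[offset:offset + k] == window:
                found.add(window)
                break
    return found


def family_candidates(samples: List[bytes], k: int) -> Set[bytes]:
    """Use the first sample as sequence A and every other sample as sequence B.

    Assumption: candidates from each pairing are combined by set union.
    """
    if len(samples) < 2:
        return set()
    seq_a = samples[0]
    candidates: Set[bytes] = set()
    for seq_b in samples[1:]:
        candidates |= common_windows(seq_a, seq_b, k)
    return candidates


if __name__ == "__main__":
    family = [b"xxMALWARExx", b"yyMALWAREyy", b"zzMALWAREzz"]
    print(family_candidates(family, 7))
```

Because each sample is scanned once and every window costs one lookup, the work grows roughly linearly with sample length for a fixed window length, in contrast to the quadratic table of the classical longest common subsequence algorithm. For simplicity this sketch rebuilds the index of sequence A for each pairing; building it once and reusing it would match the abstract's description of searching the remaining samples in the same tree.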