The invention relates to a method for detecting similarity between computer software source codes, and belongs to the technical field of computer applications.
The method comprises the following steps: first, performing word segmentation on the source code according to its programming language;
second, partitioning the word segmentation result into blocks using specific marker words, and processing variable tokens according to their variable attributes;
third, on the basis of the word segmentation result, performing a difference measurement on each block to obtain a difference matrix, and obtaining an overall difference from the difference results and the correlation of each block;
and finally, computing the code similarity detection result according to a formula.
Compared with the prior art, the method can successfully identify techniques such as word-for-word
copying, changes to comments and whitespace, identifier renaming,
data type changes and the like in code similarity detection, and can also detect techniques such as reordering code blocks, reordering statements, adding redundant statements and variables, and replacing an original control structure with an
equivalent control structure.
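
The steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract gives neither the marker words, the variable-attribute rules, nor the similarity formula, so everything named here is an assumption made for illustration. `BLOCK_MARKERS` and `KEYWORDS` are hypothetical word lists, variable processing is reduced to replacing every identifier with the placeholder `ID`, the block difference is taken as one minus the `difflib.SequenceMatcher` ratio, and the overall similarity is assumed to be one minus the length-weighted best-match difference of each block.

```python
import re
from difflib import SequenceMatcher

# Hypothetical marker words that open a new block; the source does not list them.
BLOCK_MARKERS = {"def", "for", "while", "if"}
# Hypothetical set of words that are kept as-is instead of being normalized.
KEYWORDS = BLOCK_MARKERS | {"return", "in", "else", "range", "print"}

TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def tokenize(source):
    """Word segmentation: drop '#' comments and whitespace, keep tokens.

    Simplification: a '#' inside a string literal is also treated as a comment.
    """
    tokens = []
    for line in source.splitlines():
        tokens.extend(TOKEN_RE.findall(line.split("#", 1)[0]))
    return tokens


def normalize(tokens):
    """Assumed variable processing: every non-keyword identifier becomes 'ID'."""
    out = []
    for tok in tokens:
        is_identifier = tok[0].isalpha() or tok[0] == "_"
        out.append("ID" if is_identifier and tok not in KEYWORDS else tok)
    return out


def partition(tokens):
    """Start a new block at each marker word."""
    blocks, current = [], []
    for tok in tokens:
        if tok in BLOCK_MARKERS and current:
            blocks.append(current)
            current = []
        current.append(tok)
    if current:
        blocks.append(current)
    return blocks


def difference_matrix(blocks_a, blocks_b):
    """Entry [i][j] is the difference between block i of A and block j of B."""
    return [
        [1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio() for b in blocks_b]
        for a in blocks_a
    ]


def similarity(source_a, source_b):
    """Assumed formula: 1 minus the length-weighted best-match block difference."""
    blocks_a = partition(normalize(tokenize(source_a)))
    blocks_b = partition(normalize(tokenize(source_b)))
    if not blocks_a or not blocks_b:
        return 0.0
    matrix = difference_matrix(blocks_a, blocks_b)
    total = sum(len(block) for block in blocks_a)
    # Taking the minimum over each row ignores block order.
    overall = sum(len(block) * min(row) for block, row in zip(blocks_a, matrix)) / total
    return 1.0 - overall
```

Under these assumptions, two functions that differ only in identifier names and comments produce the same normalized token blocks and therefore score 1.0, and because each block is compared with its best match in the other file, a change in block order does not lower the score. The measure as sketched is one-directional (blocks of the first source are matched against the second); a symmetric variant would need an additional design decision that the abstract does not specify.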