
Method and device for detecting similarity of source codes

A source-program similarity detection technology, applied in program control devices, program control design, instruments, etc., that addresses problems such as artificially lowered similarity scores and inaccurate judgment results

Status: Inactive. Publication date: 2008-12-03
BEIHANG UNIV

AI Technical Summary

Problems solved by technology

[0005] However, the detection tools mentioned above all operate at the source-code level and determine similarity from a large amount of metric information. As a result, when redundant statements are added, redundant variables are declared, or statements are split in the source program, the similarity calculated by these tools drops significantly, which leads to inaccurate judgment results.
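To illustrate the problem, the sketch below compares the token sequences of a small C fragment and a variant padded with a redundant variable and split statements. It uses Python's `difflib.SequenceMatcher` as a stand-in for a source-level detector. The tools referred to above are metric based, so this is only an analogy, and `tokenize`, `source_similarity`, `ORIGINAL`, and `PADDED` are hypothetical names introduced for this example.

```python
import difflib
import re


def tokenize(source: str) -> list[str]:
    """Split C-like source into identifiers, numbers, and single punctuation tokens."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def source_similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of matching tokens between two source fragments."""
    matcher = difflib.SequenceMatcher(None, tokenize(a), tokenize(b), autojunk=False)
    return matcher.ratio()


# Both fragments compute a + b; the second adds a redundant variable
# and splits the addition across several statements.
ORIGINAL = "int sum(int a, int b) { return a + b; }"
PADDED = (
    "int sum(int a, int b) "
    "{ int unused = 0; int t; t = a; t = t + b; return t; }"
)

if __name__ == "__main__":
    print(f"identical: {source_similarity(ORIGINAL, ORIGINAL):.2f}")
    print(f"padded:    {source_similarity(ORIGINAL, PADDED):.2f}")
```

The padded variant scores below the identical pair even though both functions behave the same, which is the kind of drop that paragraph [0005] describes.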


Examples

Embodiment Construction

[0021] Refer to Figure 1, which is a flow chart of the method for detecting the similarity between two source programs according to the present invention. In the present invention, the source programs can be written in any compiled language, such as C, C++, Java, Fortran, or Pascal (Delphi, GNU Pascal, FreePascal). First, in step S10, the two source programs to be detected are each optimized and compiled, generating two binary files. For example, this step can be performed with a commercial compiler or the open-source GCC compiler, and different optimization levels can be set in the compiler. The higher the optimization level, the better the optimization effect.
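Step S10 could be sketched as follows, assuming GCC is installed and the inputs are C files. The helper names `build_compile_command` and `compile_optimized`, the file names, and the default of `-O2` are assumptions made for illustration; the text above does not prescribe a particular optimization level.

```python
import subprocess
from pathlib import Path


def build_compile_command(source: Path, output: Path, opt_level: int = 2) -> list[str]:
    """Return a GCC command line that compiles `source` to an object file at -O<opt_level>."""
    if opt_level not in (0, 1, 2, 3):
        raise ValueError("opt_level must be 0, 1, 2, or 3")
    return ["gcc", f"-O{opt_level}", "-c", str(source), "-o", str(output)]


def compile_optimized(source: Path, output: Path, opt_level: int = 2) -> Path:
    """Run the compiler; raises CalledProcessError if compilation fails."""
    subprocess.run(build_compile_command(source, output, opt_level), check=True)
    return output


if __name__ == "__main__":
    # Hypothetical file names; only the command lines are printed here,
    # so the sketch runs even without GCC or the source files present.
    for name in ("program_a.c", "program_b.c"):
        src = Path(name)
        print(" ".join(build_compile_command(src, src.with_suffix(".o"))))
```

Both programs should be compiled with the same compiler and the same optimization level, since differing settings would themselves introduce differences in the generated binaries.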

[0022] During optimizing compilation, the layout of the code affects only its readability, so the compiler ignores it, and comments are deleted in the preprocessing stage of compilation, thus eliminating the noise caused by modifying commen...

Abstract

The invention discloses a method and a device for detecting the similarity between source programs. The method includes the following steps: the two source program files are each optimized and compiled to generate two binary files; the two binary files are disassembled to generate two assembly code units; and a decision function is applied to the two assembly code units to determine the similarity between the two source programs. By performing optimizing compilation and disassembly, the method can eliminate, at the program-semantics level, the interference caused by advanced plagiarism techniques such as code redundancy, statement splitting, and constant replacement.
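The last two steps of the abstract (disassembly, then a decision function over the two assembly code units) might look like the sketch below. The excerpt does not define the decision function, so the ratio from `difflib.SequenceMatcher` over instruction mnemonics is used purely as a stand-in. The sample listings are hand-written in the style of `objdump -d` output for illustration, and `extract_mnemonics` and `similarity` are hypothetical names.

```python
import difflib


def extract_mnemonics(disassembly: str) -> list[str]:
    """Pull instruction mnemonics out of `objdump -d` style text.

    Instruction lines look like '<addr>:\\t<hex bytes>\\t<mnemonic> <operands>';
    labels, blank lines, and byte-only continuation lines are skipped.
    """
    mnemonics = []
    for line in disassembly.splitlines():
        fields = line.split("\t")
        if len(fields) < 3 or not fields[0].strip().endswith(":"):
            continue
        instruction = fields[2].strip()
        if instruction:
            mnemonics.append(instruction.split()[0])
    return mnemonics


def similarity(asm_a: str, asm_b: str) -> float:
    """Stand-in decision function: matching ratio of the two mnemonic sequences."""
    matcher = difflib.SequenceMatcher(
        None, extract_mnemonics(asm_a), extract_mnemonics(asm_b), autojunk=False
    )
    return matcher.ratio()


# Illustrative listings: two functions with different names and operand
# order that reduce to the same instruction sequence.
ASM_A = "\n".join([
    "0000000000000000 <sum>:",
    "   0:\t8d 04 37             \tlea    (%rdi,%rsi,1),%eax",
    "   3:\tc3                   \tret",
])
ASM_B = "\n".join([
    "0000000000000000 <add>:",
    "   0:\t8d 04 3e             \tlea    (%rsi,%rdi,1),%eax",
    "   3:\tc3                   \tret",
])

if __name__ == "__main__":
    print(extract_mnemonics(ASM_A))
    print(f"similarity: {similarity(ASM_A, ASM_B):.2f}")
```

Comparing mnemonics only, and ignoring operands, is a deliberate simplification here: it makes the comparison insensitive to register allocation and symbol names, at the cost of treating some genuinely different programs as similar.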

Description

Technical field

[0001] The present invention relates to similarity detection of computer programs, in particular to a method and device for detecting the similarity of multiple source programs.

Background

[0002] As computer technology continues to develop, more and more work involves programs, and source code cloning and plagiarism are becoming increasingly common. Compared with natural language, the grammar of a programming language is very regular, which makes plagiarism easier. Without understanding the source program, a plagiarist can change the appearance of the source code in a text editor through simple variable substitution, adding redundant code, or reordering the program, without affecting the normal operation of the program.

[0003] In "Metrics based plagiarism monitoring" (paper presented at the 6th Annual CCSC Northeastern Conference, Middlebury, VT, 2001), Jones summarized ten plagiarism methods. Accordi...

Application Information

IPC(8): G06F9/44, G06F9/45
Inventors: ZHAO Changhai (赵长海), YAN Haihua (晏海华), JIN Maozhong (金茂忠)
Owner: BEIHANG UNIV