How to detect code clones

A code clone detection method in the field of program control devices. It addresses the incomplete clone-type coverage, high complexity, and high cost of existing detection methods, and improves resistance to obfuscation.

Active Publication Date: 2018-01-16
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

However, building a program dependence graph (PDG) and searching for isomorphic subgraphs is also very costly, which makes the approach difficult to apply to large-scale software.
[0013] Existing code clone detection methods therefore suffer from incomplete coverage of clone types, low accuracy, high complexity, and difficulty of implementation.



Examples


Embodiment Construction

[0025] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0026] The core idea of the present invention is as follows. To implement basic functionality, applications rely on the lowest-level APIs, and for most developers these do not change. If two applications call APIs with essentially the same frequencies, then no matter how a plagiarist obfuscates the control flow and data flow of the code, the most basic API calls will not change greatly. The invention therefore uses API call frequency to judge whether code is in a clone relationship, which can improve the obfuscation resistance and accuracy of code clone detection. Extracting APIs is easy to implement and does not depend on source code, so it can also effectively reduce the difficulty of code clone detection and improve application compatibilit...
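The intuition in [0026] can be illustrated with a minimal sketch. It assumes the API calls have already been extracted into a list of names; the call names and the `api_call_counts` helper are hypothetical and not taken from the patent. Reordering the calls stands in for a control-flow obfuscation that leaves API usage unchanged.

```python
from collections import Counter


def api_call_counts(calls):
    """Count how often each API name appears in a list of extracted calls."""
    return Counter(calls)


# Hypothetical call traces: the second is the first with the calls reordered,
# standing in for an obfuscation that keeps the same underlying API usage.
original = ["open", "read", "read", "send", "close"]
obfuscated = ["read", "open", "send", "read", "close"]

# The per-API counts are identical even though the call order differs.
same_profile = api_call_counts(original) == api_call_counts(obfuscated)
print(same_profile)
```

Real obfuscators may also insert or remove calls, which is presumably why the method compares frequencies against a similarity threshold rather than requiring an exact match.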


Abstract

The present application discloses a code clone detection method, which includes: extracting the set of application programming interfaces (APIs) called by each of the two groups of program code to be detected; determining the call frequency of each API in the API set of each group; for each group of program code k, generating an n-dimensional label vector from the call frequencies of its APIs, where each dimension value v_{k,i} of the vector corresponds one-to-one to an API in the set N, the set N is the union of the API sets of the two groups, and v_{k,i} is obtained from the call frequency p_{k,i} of the k-th group of program code for the corresponding API_i; calculating the similarity of the two groups from their n-dimensional label vectors; and determining whether a clone relationship exists between the two groups according to the similarity and a preset similarity threshold. The invention improves the obfuscation resistance of detection, is highly accurate, and is easy to implement.
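The steps in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not say how v_{k,i} is derived from p_{k,i} or which similarity measure is used, so the sketch assumes v_{k,i} equals the relative call frequency and uses cosine similarity. The function names and the 0.9 threshold are hypothetical.

```python
import math
from collections import Counter


def call_frequencies(calls):
    """Relative call frequency p_{k,i} of each API in one group of code."""
    counts = Counter(calls)
    total = sum(counts.values())
    return {api: n / total for api, n in counts.items()} if total else {}


def label_vectors(calls_1, calls_2):
    """Build both label vectors over the union N of the two API sets."""
    p1, p2 = call_frequencies(calls_1), call_frequencies(calls_2)
    union = sorted(set(p1) | set(p2))
    # Assumption: v_{k,i} = p_{k,i}; an API a group never calls gets 0.
    v1 = [p1.get(api, 0.0) for api in union]
    v2 = [p2.get(api, 0.0) for api in union]
    return union, v1, v2


def cosine_similarity(v1, v2):
    """Assumed similarity measure; the abstract does not name one."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm_1 = math.sqrt(sum(a * a for a in v1))
    norm_2 = math.sqrt(sum(b * b for b in v2))
    if norm_1 == 0 or norm_2 == 0:
        return 0.0
    return dot / (norm_1 * norm_2)


def is_clone(calls_1, calls_2, threshold=0.9):
    """Compare the similarity against a preset (here hypothetical) threshold."""
    _, v1, v2 = label_vectors(calls_1, calls_2)
    return cosine_similarity(v1, v2) >= threshold
```

For example, `is_clone(["open", "read", "read"], ["read", "open", "read"])` compares two groups with identical call frequencies, while two groups with disjoint API sets have orthogonal vectors and a similarity of 0.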

Description

Technical field

[0001] The invention relates to computer application technology, in particular to a code clone detection method.

Background technique

[0002] Code clone (Code Clone) refers to the repeated occurrence of the same or similar code fragments in the software source code. These code fragments may be identical, or may have undergone some editorial (such as modifying variable names) or logical modifications (such as modifying to similar but not identical functions). Code fragments considered to be clones of each other often have similar logical operations and achieve similar functions. Code cloning is generally caused by copy-and-paste code reuse, or it may be caused by patterned thinking to solve similar problems. Code cloning is abundant in large software systems as well as in several similar software systems. Cloning code is closely related to many issues in software engineering, such as software quality, complexity, architecture, evolution, patents, and plagi...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/44
Inventor: 张程鹏, 李祺, 李承泽, 董枫, 杨昕雨
Owner: BEIJING UNIV OF POSTS & TELECOMM