Code clone detection method and system based on byte code and neural network

This neural-network-based detection technology, applied in the field of code clone detection, addresses cases such as code obfuscation, in which bytecode files cannot be decompiled into source code and code clones therefore go undetected, and it achieves a low detection time.

Pending Publication Date: 2022-02-18
CHONGQING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0004] 1. The precision, recall, and F1 score of text-based and lexical-based detection methods on Type-3 and Type-4 clones are too low.
[0005] 2. Detection methods based on syntax and semantics often need to build abstract syntax trees, program dependency graphs, and similar structures, which carries high time and space complexity.
[0006] 3. In some scenarios (such as code obfuscation of bytecode files), bytecode files cannot be decompiled into source code. In these cases, source-based comparison methods cannot detect potential code clones.




Detailed Description of Embodiments

[0054] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0055] A code clone detection method based on bytecode and a neural network, the method comprising: obtaining code data to be detected, inputting the code data to be detected into a trained code clone detection model, obtaining a detection result, and marking and saving the detection result.

[0056] The process of training the code clone detection model includes:

[0057] S1: Obtain the original code data set; preprocess the code data to obtain the ...


Abstract

The invention belongs to the technical field of code clone detection, and in particular relates to a code clone detection method and system based on bytecode and a neural network. The method comprises: obtaining code data to be detected, inputting it into a trained code clone detection model, obtaining a detection result, and marking and storing the detection result. The method uses bytecode in place of source code. Compared with existing text-based and lexical-based detection methods, it takes code semantic information fully into account and can improve detection of Type-3 and Type-4 clones in terms of precision, recall, F1 score, and similar metrics.

Description

Technical field

[0001] The invention belongs to the technical field of code clone detection, and in particular relates to a code clone detection method and system based on bytecode and a neural network.

Background

[0002] A code clone, also known as duplicate code or similar code, refers to two or more identical or similar source code fragments that exist in a code base. Code clone detection is a difficult problem in the field of software engineering. Code clones are divided into four categories: Type-1, Type-2, Type-3, and Type-4. A Type-1 clone is a pair of code fragments that are identical except for whitespace and comments; a Type-2 clone is a pair that, beyond the Type-1 differences, is identical except for variable names, type names, and function names; a Type-3 clone is a pair whose structure, on the basis of Type-2, has additions, deletions, or modification...
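The Type-1 and Type-2 definitions above can be illustrated with a minimal sketch. It is written in Python and compares Python source fragments, not the Java bytecode the invention targets, and it is not the patented method. The names `token_stream` and `classify_pair` are hypothetical, and treating every non-keyword name as a renamable identifier is a simplifying assumption.

```python
import io
import keyword
import tokenize


def token_stream(source, rename_identifiers=False):
    """Return a comparable token list for a Python source fragment.

    Comments and blank lines are dropped and the exact amount of
    indentation is ignored (Type-1 view). With rename_identifiers=True,
    every non-keyword name is replaced by a placeholder assigned in
    first-seen order (Type-2 view).
    """
    out = []
    names = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER):
            continue  # comments and blank lines carry no code
        if tok.type in (tokenize.INDENT, tokenize.DEDENT, tokenize.NEWLINE):
            out.append((tok.type, ""))  # keep block structure, drop the whitespace text
        elif (rename_identifiers and tok.type == tokenize.NAME
              and not keyword.iskeyword(tok.string)):
            placeholder = names.setdefault(tok.string, f"id{len(names)}")
            out.append((tok.type, placeholder))
        else:
            out.append((tok.type, tok.string))
    return out


def classify_pair(a, b):
    """Classify two fragments as a Type-1 clone, a Type-2 clone, or neither."""
    if token_stream(a) == token_stream(b):
        return "Type-1"
    if token_stream(a, True) == token_stream(b, True):
        return "Type-2"
    return "not Type-1 or Type-2"


original = "def add(x, y):\n    # sum\n    return x + y\n"
reformatted = "def add(x,y):\n\n        return x+y   # total\n"
renamed = "def plus(a, b):\n    return a + b\n"
changed = "def add(x, y):\n    return x - y\n"

print(classify_pair(original, reformatted))
print(classify_pair(original, renamed))
print(classify_pair(original, changed))
```

Token-level comparison of this kind is a lexical technique, so it stops at Type-2. Structural edits (Type-3) and functionally equivalent but differently written code (Type-4) fall outside what it can recognize.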


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F8/75, G06K9/62, G06N3/08
CPC: G06F8/751, G06N3/08, G06F18/24
Inventor: 万邦睿, 董双, 黄江平, 钱鹰
Owner: CHONGQING UNIV OF POSTS & TELECOMM