Source code vulnerability detection method for code graph representation learning based on graph convolution network

A vulnerability detection technique based on graph convolutional networks, applied in neural learning methods, biological neural network models, instruments, etc. It addresses problems such as low learning efficiency on large graph structures, missed vulnerabilities (false negatives), and the failure to consider data dependencies across function calls. Its stated effects are a lower false-negative rate, higher accuracy, and a smaller graph scale.

Active Publication Date: 2020-10-16
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

The prior method uses the AST as a backbone to explicitly encode the control and data dependencies of the program. When a function is large, the resulting graph is too deep and too large, which makes learning inefficient. Its analysis is also limited to a single function: it does not consider function calls or inter-procedural data dependencies, so it may produce false negatives for vulnerabilities that span function calls.


Examples

Embodiment

[0057] Taking a source file with 1088 lines of code as an example, slicing the program at one candidate vulnerability key point leaves only 48 lines of statements, as shown below:

[0058]

[0059] The vulnerability in this source file occurs on line 29, at a call to strcpy. The call uses stonesoup_buffer, which was allocated with malloc, without checking whether the allocation succeeded. If malloc fails, the strcpy call causes an illegal memory access.
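The slicing step described above can be sketched as a backward traversal of a dependency graph, starting from a vulnerability key point such as an unsafe library call. This is a minimal illustration, not the patent's implementation: the toy statements, the `deps` mapping, and the function name `backward_slice` are all hypothetical, and only backward dependencies are followed.

```python
from collections import deque


def backward_slice(deps, key_point):
    """Return every statement id the key point transitively depends on.

    deps maps a statement id to the ids it depends on (data or control).
    """
    seen = {key_point}
    queue = deque([key_point])
    while queue:
        node = queue.popleft()
        for parent in deps.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical toy program; ids and statements are illustrative only.
statements = {
    1: "char *buf = malloc(size);",
    2: "int counter = 0;",
    3: "counter++;",
    4: "char *src = read_input();",
    5: "strcpy(buf, src);",  # key point: unsafe call on an unchecked buffer
    6: "log_counter(counter);",
}

# Assumed dependency edges: statement -> statements it depends on.
deps = {
    3: [2],
    5: [1, 4],
    6: [3],
}

slice_ids = sorted(backward_slice(deps, key_point=5))
sliced_program = [statements[i] for i in slice_ids]
```

In this toy case the slice keeps only the allocation, the input read, and the strcpy call, and drops the unrelated counter statements, which mirrors how a long file shrinks to a short slice around the key point.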

[0060] After performing graph representation learning on the code with the graph convolutional network, the method proposed by the present invention can detect that the code contains a vulnerability related to an unsafe function call. For long code like this, traditional deep learning models that take serialized data as input (such as LSTM and GRU) may lose key contextual information during batch learning, resulting in traditional deep learning models Does not adapt well to codes o...

Abstract

The invention discloses a source code vulnerability detection method that learns a graph representation of code with a graph convolutional network. The method comprises the following steps: generating a code attribute graph; adding function call relationships and inter-procedural dependency relationships to the code attribute graph; obtaining code slices according to vulnerability key points; using the slices to delete nodes from the graph and extract the graph structure information related to vulnerabilities; learning a vector representation of each node with a graph convolutional network; dividing the graph into subgraphs according to edge type and obtaining a vector representation of the graph through an attention-based READOUT model; adjusting the network parameters according to the graph's vector representation and label; and detecting code vulnerabilities with the trained model. The method can fully use and learn the structure and attribute information of vulnerable code. It avoids two problems of traditional deep networks: code structure information is easily lost when learning a code representation, and long-range context is lost because the code must be represented as a fixed-length sequence. As a result, false positives and false negatives in vulnerability detection are reduced.
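The node-embedding and READOUT steps in the abstract can be illustrated with a small pure-Python sketch. Everything here is an assumption made for illustration: the symmetric normalization, the single shared weight matrix, the dot-product attention against a `query` vector, and the concatenation of one pooled vector per edge type are one plausible reading of the abstract, not the patent's actual architecture.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Compute D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj, features, weight):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    hidden = matmul(matmul(normalized_adjacency(adj), features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def attention_readout(node_vectors, query):
    """Pool node vectors with softmax weights from dot products with query."""
    scores = [sum(h * q for h, q in zip(vec, query)) for vec in node_vectors]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(node_vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, node_vectors)) for d in range(dim)]
    return pooled, weights


def embed_graph(adj_by_type, features, weight, query):
    """Run the GCN on each edge-type subgraph and concatenate the pooled vectors."""
    graph_vector = []
    for edge_type in sorted(adj_by_type):
        hidden = gcn_layer(adj_by_type[edge_type], features, weight)
        pooled, _ = attention_readout(hidden, query)
        graph_vector.extend(pooled)
    return graph_vector


# Hypothetical 3-node graph with two edge types (values are illustrative).
adj_by_type = {
    "ast": [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
    "data": [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
}
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[0.5, -0.2], [0.3, 0.8]]
query = [1.0, 0.5]

graph_vector = embed_graph(adj_by_type, features, weight, query)
```

In a trained model the weight matrix and attention parameters would be learned from labeled graphs; here they are fixed constants so the data flow is easy to follow. The resulting `graph_vector` is what a classifier would consume to predict whether the sliced code is vulnerable.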

Description

Technical field

[0001] The invention relates to a method for detecting software vulnerabilities, in particular to a source code vulnerability detection method that learns a graph representation of code with a graph convolutional network.

Background

[0002] Software vulnerabilities are defects introduced during software design and development that malicious attackers can easily exploit. Traditional source code review depends largely on the reviewer's understanding of security issues and long-accumulated experience, and cannot meet the needs of vulnerability detection as code size and complexity keep growing. Machine-learning-based vulnerability detection avoids the reliance of rule-based detection on experts writing detection rules by hand, but it still requires vulnerability features to be extracted manually. In recent years, the deep learning technology succes...

Application Information

IPC(8): G06F21/57, G06F21/56, G06N3/04, G06N3/08
CPC: G06F21/577, G06F21/563, G06N3/08, G06N3/045
Inventor: 苏小红, 段亚男, 王甜甜, 蒋远, 赵玲玲
Owner: HARBIN INST OF TECH