Source code vulnerability detection method for code graph representation learning based on graph convolution network

A vulnerability detection technology based on graph convolutional networks, applied in neural learning methods, biological neural network models, instruments, etc. It addresses problems such as the low learning efficiency of large graph structures, missed vulnerability reports, and the failure to consider data dependencies across function calls, with the effect of reducing the false negative rate, improving accuracy, and reducing graph scale.

Active Publication Date: 2020-10-16

HARBIN INST OF TECH

12 Cites, 32 Cited by


Problems solved by technology

The existing approach uses the AST as a backbone and explicitly encodes the control and data dependencies of the program. When a function is large, the resulting graph structure is too deep and too large, which lowers learning efficiency. Its analysis is also limited to a single function: it does not consider function calls or inter-procedural data dependencies, so it may produce false negatives for vulnerabilities that span function calls.
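The single-function limitation can be illustrated with a small sketch. The node labels and the function name `copy_into` are hypothetical and do not come from the patent; the point is only that a dependence path from an allocation in a caller to an unsafe use in a callee exists once a call edge joins the two per-function graphs.

```python
def has_path(edges, src, dst):
    """Depth-first search over directed edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False

# Hypothetical statement nodes in two functions.
alloc = "caller: buf = malloc(n)"
call_site = "caller: copy_into(buf, s)"
param = "copy_into: param dst"
use = "copy_into: strcpy(dst, src)"

# Dependence edges that stay inside one function.
intra_edges = [(alloc, call_site), (param, use)]
# Edge added when the call relationship is modelled (argument -> parameter).
call_edges = [(call_site, param)]

visible_without_calls = has_path(intra_edges, alloc, use)
visible_with_calls = has_path(intra_edges + call_edges, alloc, use)
print(visible_without_calls, visible_with_calls)
```

With only intra-function edges the allocation and the `strcpy` use are disconnected, which is how a per-function analysis can miss the flaw.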


Embodiment

[0057] Taking a source file with 1088 lines of code as an example, slicing the program at one candidate vulnerability key point leaves only 48 statements, as shown below:

[0058]

[0059] The vulnerability in this source file occurs on line 29, where strcpy is called without checking whether the malloc call that allocates stonesoup_buffer succeeded. If the allocation fails, the strcpy call causes an illegal memory access.
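The slicing step in [0057] can be sketched as reachability over dependence edges. This is a minimal illustration, not the patent's algorithm: the edge list is invented, and combining a backward and a forward slice at the key point is an assumption made here.

```python
from collections import deque

def reachable(start, adjacency):
    """Return every node reachable from start by following adjacency."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def slice_at(key_point, dep_edges):
    """Keep statements the key point depends on, plus statements that depend on it."""
    forward, backward = {}, {}
    for src, dst in dep_edges:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)
    return sorted(reachable(key_point, backward) | reachable(key_point, forward))

# Hypothetical dependence edges (line -> dependent line); line 29 is the key point.
dep_edges = [(10, 25), (25, 29), (29, 31), (12, 40), (40, 41)]
kept = slice_at(29, dep_edges)
print(kept)
```

Lines 12, 40, and 41 have no dependence relation with the key point, so they drop out of the slice. This is the mechanism by which a large file shrinks to a small set of statements.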

[0060] After performing graph representation learning on the code with the graph convolutional network, the method proposed by the present invention can detect that the code contains a vulnerability related to an unsafe function call. With long code like this, traditional deep learning models that take serialized data as input (such as LSTM and GRU) may lose key contextual information during batch learning, so that traditional deep learning models do not adapt well to codes o...


Abstract

The invention discloses a source code vulnerability detection method that learns a graph representation of code with a graph convolutional network. The method comprises the following steps: generating a code attribute graph; adding function call relationships and inter-procedural dependency relationships to the code attribute graph; obtaining code slices according to vulnerability key points; using the slices to delete nodes from the graph and extract the graph structure information related to vulnerabilities; learning a vector representation of each node with a graph convolutional network; dividing the graph into sub-graphs according to edge type and obtaining a vector representation of the graph through an attention-based READOUT model; adjusting the network parameters according to the graph's vector representation and label; and detecting code vulnerabilities with the trained model. The method can make full use of the structure and attribute information of vulnerable code. It avoids two problems of traditional deep networks that learn code representations: code structure information is easily lost, and long-range context is lost because the code must be represented as a fixed-length sequence. As a result, false positives and false negatives in vulnerability detection are reduced.
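The node-embedding and READOUT steps can be sketched in plain Python. Everything specific here is an assumption and not taken from the patent: the symmetric normalization follows the commonly used GCN propagation rule, attention is a softmax over dot products with a query vector, the per-edge-type results are concatenated, and the weights are fixed toy values where a real model would learn them.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(n, edges):
    """D^-1/2 (A + I) D^-1/2 over an undirected view of the edges."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    return [[max(0.0, x) for x in row] for row in matmul(matmul(a_hat, h), w)]

def attention_readout(h, query):
    """Softmax-weighted sum of node vectors; scores are dot products with query."""
    scores = [sum(x * q for x, q in zip(row, query)) for row in h]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(wt * row[k] for wt, row in zip(weights, h)) for k in range(len(h[0]))]
    return pooled, weights

# Toy graph: 3 statement nodes, edges grouped by (hypothetical) type names.
edges_by_type = {"ast": [(0, 1)], "data": [(1, 2)]}
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[0.5, -0.2], [0.1, 0.3]]  # fixed toy values; learned in a real model
query = [1.0, 1.0]                  # fixed toy values; learned in a real model

# Node embeddings from one GCN layer over all edges.
all_edges = [e for edges in edges_by_type.values() for e in edges]
hidden = gcn_layer(normalized_adjacency(len(features), all_edges), features, weight)

# One attention readout per edge-type sub-graph, concatenated into a graph vector.
graph_vector = []
for edge_type in sorted(edges_by_type):
    nodes = sorted({n for edge in edges_by_type[edge_type] for n in edge})
    pooled, weights = attention_readout([hidden[i] for i in nodes], query)
    graph_vector.extend(pooled)
print(graph_vector)
```

In training, `graph_vector` would feed a classifier whose loss against the graph's label updates `weight` and `query`. That step is omitted here.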

Description

Technical field

[0001] The invention relates to a method for detecting software vulnerabilities, in particular to a source code vulnerability detection method that uses a graph convolutional network to learn a graph representation of the code.

Background art

[0002] Software vulnerabilities are defects introduced during software design and development that are easily exploited by malicious attackers. Traditional source code review depends largely on the reviewer's understanding of security issues and long-accumulated experience, and cannot meet the needs of vulnerability detection as code size and complexity grow. Although machine-learning-based vulnerability detection avoids the reliance of rule-based detection on experts who write detection rules by hand, it still requires vulnerability features to be extracted manually. In recent years, the deep learning technology succes...


Application Information


IPC(8): G06F21/57, G06F21/56, G06N3/04, G06N3/08

CPC: G06F21/577, G06F21/563, G06N3/08, G06N3/045

Inventor: 苏小红, 段亚男, 王甜甜, 蒋远, 赵玲玲

Owner: HARBIN INST OF TECH
