Network threat intelligence text key information extraction method based on weak supervised learning

A weakly supervised key information extraction technology, applied in fields such as neural learning methods, biological neural network models, and digital data information retrieval. Qualitative indicators evaluate the effectiveness of the screening scheme, improving interpretability and accuracy and avoiding potential errors.

Active Publication Date: 2022-03-04
SICHUAN UNIV
10 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Precisely because of this high-dimensional, abstract nature, classification methods struggle to analyze text content quantitatively in the way entity extraction does, and so cannot produce quantitative and qualitative descriptions of more abstract entities.
Although some researchers have applied clustering based on lexical correlation and similar lexical reasoning to entity recognition, such as Linear Discriminant Analysis (LDA), these methods cannot effectively evaluate the boundaries between similar lexical data or form more abstract conceptual entities.


Examples

Embodiment Construction

[0061] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific implementation examples.

[0062] The system structure of the method is shown in figure 1. The method consists of SeqMask, the weakly supervised deep learning method for screening key text information in step S3, and the key information evaluation method in step S4. The implementation below takes the analysis of tactics and techniques in network threat intelligence as its example application scenario.

[0063] Step S1: Preprocess the text collected by web crawlers and uploaded by users: clean the data and split it into sentences to form the analysis corpus. Determine the text topic from the text's original storage environment, collection method, research field, and so on, and form sentence topic labels through screening and other methods.
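Step S1 can be illustrated with a minimal Python sketch. It is not the patent's preprocessing pipeline: the function names (`clean_text`, `split_sentences`, `build_corpus`), the regex-based cleaning, and the sample document are all hypothetical. It also assumes that each sentence simply inherits the topic label of its source document, whereas the text above says sentence labels are formed "through screening and other methods".

```python
import re

def clean_text(raw: str) -> str:
    """Illustrative cleaning only: drop HTML tags, collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", no_tags).strip()

def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p for p in parts if p]

def build_corpus(documents: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Pair every sentence with its document's topic label.

    Assumption: the document-level topic is reused as a weak sentence label.
    """
    corpus = []
    for raw, topic in documents:
        for sentence in split_sentences(clean_text(raw)):
            corpus.append((sentence, topic))
    return corpus

# Hypothetical crawled document with a hypothetical topic label.
docs = [("<p>The actor sent a phishing email.  It carried a macro.</p>",
         "initial-access")]
corpus = build_corpus(docs)
for sentence, topic in corpus:
    print(topic, "|", sentence)
```

Reusing the document topic as the sentence label keeps annotation cost near zero, which is the point of weak supervision, but it also produces noisy labels that a later screening step would have to filter.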

[0064] Text preprocessing is based on information collection of cybe...

Abstract

The invention discloses a method for extracting key information from network threat intelligence text based on weakly supervised learning. The method recasts information extraction as a feature data space mapping task, combines it with a knowledge representation learning method, adopts an attention mechanism based on local sequences, and uses text topic labels to extract key information from network threat intelligence text. This realizes weakly supervised key information extraction for text; the quality of the extracted information is verified by manual evaluation and confidence evaluation to ensure that it is real, reliable, and credible. The method trains a key information extraction model for text that is more accurate, comparable, and well-grounded, with the aim that the extracted key information reasonably reflects the actual semantic value of the sequence label. Through the weakly supervised learning strategy, the two evaluation methods defined by the method, and end-to-end network training, the complexity and time cost of information extraction are reduced, and the accuracy and recall of the extracted key information in label classification are improved.
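The idea of attention over a local sequence can be illustrated with a small Python sketch. This is an assumption-laden toy, not SeqMask: the functions `local_attention` and `select_key_tokens`, the window size, the threshold, and the hand-set token scores are all hypothetical. In a trained model the scores would be learned from the topic labels rather than supplied by hand.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def local_attention(scores: list[float], window: int) -> list[float]:
    """Weight each token against its neighbours within `window` positions."""
    weights = []
    for i in range(len(scores)):
        lo = max(0, i - window)
        hi = min(len(scores), i + window + 1)
        local = softmax(scores[lo:hi])
        weights.append(local[i - lo])  # this token's share of its window
    return weights

def select_key_tokens(tokens, scores, window=1, threshold=0.5):
    """Keep tokens whose local attention weight reaches the threshold."""
    weights = local_attention(scores, window)
    return [t for t, w in zip(tokens, weights) if w >= threshold]

# Hypothetical sentence and hand-set relevance scores (not model output).
tokens = ["the", "actor", "sent", "a", "phishing", "email"]
scores = [0.0, 2.0, 0.0, 0.0, 3.0, 1.0]
print(select_key_tokens(tokens, scores))
```

Normalizing within a window rather than over the whole sentence lets a token stand out relative to its immediate context, so several key tokens can be selected from one long sentence instead of a single global maximum dominating.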

Description

Technical field

[0001] The invention relates to the technical fields of natural language processing and cyberspace security, and in particular to a method for extracting key information from network threat intelligence text based on weakly supervised learning.

Background

[0002] Cyber Threat Intelligence (CTI), as the main information carrier for sharing event information and attack methods, is recommended by most security analysis resource sharing platforms. It provides the technical background, attack process reference, and attack method analysis needed to reconstruct most similar attack scenarios and to track attacking organizations, and has become the current mainstream data source for network threat event analysis. The task of threat intelligence analysis is mainly to understand the logic of multimedia data, such as threat intelligence text collected from various channels, connect the sequence of events, and supplement th...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F16/335, G06F16/35, G06F40/289, G06F40/30, G06N3/04, G06N3/08, G06N5/02
CPC: G06F16/335, G06F16/35, G06N3/08, G06N5/02, G06F40/289, G06F40/30, G06N3/045, Y02D10/00
Inventor: 王俊峰, 葛文翰, 唐宾徽, 于忠坤, 陈柏翰, 余坚
Owner: SICHUAN UNIV