Extraction method of Internet political and diplomatic news events

An event extraction method for Internet news, applied in the field of text information extraction. It addresses three problems: deep learning cannot readily be applied to political and diplomatic news, building the trigger vocabulary by hand is a heavy workload, and accuracy suffers as a result. The claimed effect is high accuracy and high recall.

Active Publication Date: 2022-07-29
10TH RES INST OF CETC

AI Technical Summary

Problems solved by technology

The second problem is filling in the content of the event argument roles. Within one sentence, multiple candidate elements may be extracted for the same event element type. How the appropriate element is selected to fill an argument role has a profound influence on the accuracy of the final event extraction result.
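To make the ambiguity concrete, the sketch below resolves two competing location candidates with a nearest-to-trigger heuristic. This heuristic is an assumption chosen for illustration; the source does not state which selection rule the invention uses. The function name `pick_nearest` and the example sentence are hypothetical.

```python
def pick_nearest(candidates, trigger_index):
    """Pick the candidate whose token position is closest to the trigger.

    candidates: list of (text, token_index) pairs for ONE element type.
    This rule is an illustrative assumption, not the patent's method.
    """
    return min(candidates, key=lambda c: abs(c[1] - trigger_index))[0]


# Hypothetical sentence with two location candidates for the same role.
tokens = "Minister Wang met Minister Lee in Geneva after leaving Paris".split()
trigger_index = tokens.index("met")
locations = [(t, i) for i, t in enumerate(tokens) if t in {"Geneva", "Paris"}]

chosen_location = pick_nearest(locations, trigger_index)
print(chosen_location)
```

A distance rule like this is only one possible tie-breaker; a dependency-path rule would be another.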
Existing techniques extract event trigger words with end-to-end event extraction models based on deep neural networks. The trigger word table is built mainly by calculating word frequencies and selecting relevant verb keywords as trigger words. Although deep learning greatly reduces the manual feature engineering needed to fit the training data, it does not remove the need for people to take part in feature selection. This is especially true when the data is a massive volume of political and diplomatic news events: having experts construct the trigger vocabulary by hand is a heavy and cumbersome workload. Unless it is clear which data has potential value, how to preprocess and transform it, and what goals are to be achieved, deep learning cannot be applied in the field of political and diplomatic news.
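The frequency-based construction of a trigger word table described above can be sketched roughly as follows. The function name `build_trigger_candidates`, the verb tag `"v"`, and the toy corpus are all assumptions made for illustration; the source gives no concrete tag set or data.

```python
from collections import Counter


def build_trigger_candidates(tagged_sentences, top_k=2, verb_tag="v"):
    """Return the top_k most frequent verbs as candidate trigger words.

    tagged_sentences: list of sentences, each a list of (word, pos_tag).
    """
    counts = Counter(
        word
        for sentence in tagged_sentences
        for word, tag in sentence
        if tag == verb_tag
    )
    return [word for word, _ in counts.most_common(top_k)]


# Toy, hand-tagged corpus (hypothetical data, for illustration only).
corpus = [
    [("president", "n"), ("visit", "v"), ("France", "ns")],
    [("minister", "n"), ("visit", "v"), ("Japan", "ns")],
    [("envoy", "n"), ("visit", "v"), ("Kenya", "ns")],
    [("leaders", "n"), ("sign", "v"), ("agreement", "n")],
    [("leaders", "n"), ("sign", "v"), ("treaty", "n")],
    [("envoy", "n"), ("meet", "v"), ("minister", "n")],
]

candidates = build_trigger_candidates(corpus, top_k=2)
print(candidates)
```

In practice an expert would still review such a list, which is the manual workload the passage refers to.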

Embodiment Construction

[0020] As shown in Figure 1, according to the present invention, an initial trigger word set is manually constructed for Internet political and diplomatic news events, event categories are defined according to the trigger word set, and a trigger vocabulary and an event category template containing trigger words and event argument roles are constructed for each type of event. Event elements in the diplomatic and political fields are then analyzed, identified, and extracted using the dependency syntax of the text.
Text preprocessing: split sentences at commas and periods, perform word segmentation and part-of-speech tagging on a single document, and complete the single-text preprocessing operations.
Text preprocessing and event trigger vocabulary expansion: according to the trigger vocabulary, determine whether a sentence contains trigger words or words similar to trigger words, calculate the similarity between the similar words and verbs in the sentence and the trigger words, and calculate and expand the c...
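A minimal sketch of the preprocessing and candidate-sentence step described in this paragraph is given below. It makes several assumptions that are not in the source: whitespace tokenization stands in for word segmentation and part-of-speech tagging, the `SEMEMES` table is invented, Jaccard overlap stands in for the similarity measure, and the threshold value is arbitrary.

```python
import re

# Hypothetical sememe inventory (word -> set of sememes). A real system
# would look these up in a sememe knowledge base; these entries are invented.
SEMEMES = {
    "visit": {"go", "meet", "official"},
    "tour": {"go", "meet", "travel"},
    "sign": {"write", "agree"},
    "eat": {"consume", "food"},
}


def split_sentences(text):
    """Split at commas and periods (ASCII and full-width), dropping empties."""
    return [s.strip() for s in re.split(r"[,.，。]", text) if s.strip()]


def sememe_similarity(a, b):
    """Jaccard overlap of sememe sets; a stand-in for the real measure."""
    sa, sb = SEMEMES.get(a, set()), SEMEMES.get(b, set())
    if not sa or not sb:
        return 1.0 if a == b else 0.0
    return len(sa & sb) / len(sa | sb)


def candidate_event_sentences(text, triggers, threshold=0.4):
    """Keep sentences containing a word similar enough to some trigger."""
    kept = []
    for sentence in split_sentences(text):
        best = max(
            (sememe_similarity(w, t)
             for w in sentence.lower().split()
             for t in triggers),
            default=0.0,
        )
        if best >= threshold:
            kept.append(sentence)
    return kept


text = "The president will tour Paris, he will eat lunch. Ministers sign a treaty."
sentences = candidate_event_sentences(text, triggers=["visit", "sign"])
print(sentences)
```

Here "tour" is kept because it shares two of four sememes with the trigger "visit", while "eat" shares none with any trigger.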

Abstract

The invention discloses a method for extracting Internet political and diplomatic news events, and aims to provide an extraction method that improves the accuracy of event identification. Event categories are defined according to the trigger word set, and a trigger vocabulary and an event category template containing trigger words and event argument roles are constructed for each type of event. Event elements in the diplomatic and political fields are analyzed, identified, and extracted using the dependency syntax of the text. After single-text preprocessing is complete, the trigger words of each event category are calculated and expanded based on sememe similarity, and sentences whose similarity meets the threshold are taken as candidate event sentences. The entity elements in each event sentence are extracted, the candidate event elements that satisfy the event category template are selected, and the event elements are filled into the corresponding argument roles according to the event template. Finally, an event structure description file is generated according to the event template to build an event library for the foreign affairs field.
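The template-filling and description-file steps in the abstract can be illustrated with the sketch below. The `EventTemplate` class, the role names, the entity type labels, the first-match filling rule, and the JSON output format are all assumptions for illustration; the source does not specify the template schema or file format.

```python
import json
from dataclasses import dataclass


@dataclass
class EventTemplate:
    category: str
    triggers: set
    roles: dict  # role name -> accepted entity type (hypothetical schema)


def fill_event(template, trigger, entities):
    """Fill each role with the first unused entity of the accepted type.

    entities: list of (text, entity_type). Returns None when a role cannot
    be filled, i.e. the candidate does not satisfy the template.
    """
    if trigger not in template.triggers:
        return None
    arguments = {}
    for role, wanted_type in template.roles.items():
        for text, etype in entities:
            if etype == wanted_type and text not in arguments.values():
                arguments[role] = text
                break
    if len(arguments) < len(template.roles):
        return None
    return {"category": template.category, "trigger": trigger,
            "arguments": arguments}


visit = EventTemplate(
    category="visit",
    triggers={"visit", "tour"},
    roles={"visitor": "PERSON", "place": "LOCATION", "time": "TIME"},
)
entities = [("Minister Wang", "PERSON"), ("Paris", "LOCATION"),
            ("Monday", "TIME")]

event = fill_event(visit, "tour", entities)
description = json.dumps(event, ensure_ascii=False, indent=2)
print(description)
```

Serializing each filled event this way gives one record per event, which could then be collected into an event library.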

Description

Technical field
[0001] The invention relates to the technical field of text information extraction, in particular to a method for extracting Internet political and diplomatic news events.
Background technique
[0002] With the rapid development of science and technology, news data sources keep multiplying, and the volume of multi-source, multi-category, heterogeneous news data is growing rapidly. As an important source of open source intelligence, news data is both highly real-time and massive. How to find the desired target information in a large amount of unstructured news data, and how to mine, analyze, and predict targets of interest in depth, are key concerns and urgent problems for countries in data situation awareness, risk warning, and related tasks when facing massive news data.
[0003] Structural transformation of unstructur...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/31, G06F16/33, G06F40/211, G06F40/289
CPC: G06F16/313, G06F16/3344
Inventor: 崔莹, 代翔, 孙涛, 潘磊, 丁洪丽
Owner: 10TH RES INST OF CETC