
Extraction method aiming at Twitter text event

An event extraction and text processing technique, applied in the information field, that addresses the lack of time and place constraints, the confusion between events and topics, and the inability to directly provide the time and place of an incident, so that events can be detected and discovered rapidly.

Inactive Publication Date: 2016-10-26
NAT UNIV OF DEFENSE TECH
Cites: 3; Cited by: 22

AI Technical Summary

Problems solved by technology

The problems with the former method are that it confuses events with topics: it describes events formally using word-frequency vectors or probability distributions over keywords (mainly entity names and trigger words) and discovers events through unsupervised clustering. As a result, what is detected is often a collection of related events (in effect, a topic), and the detection results generally lack important information such as the time and place of the incident and the participating groups.
The problem with the latter method is that time and location constraints are usually not applied during event message identification, so what is detected is often only a collection of event tweets, and the method generally cannot directly provide important information such as the time and location of the incident.



Examples


Embodiment Construction

[0023] The present invention will be further described below by specific examples.

[0024] The method for Twitter text event extraction of the present invention mainly comprises the following steps:

[0025] Step 1, collecting tweet data from the Twitter platform and storing it in the database;
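A minimal sketch of the storage half of Step 1, assuming the tweets have already been collected as `(tweet_id, created_at, text)` tuples. The table layout, the column names, and the `store_tweets` helper are illustrative assumptions; the source names neither the database nor its schema, and the collection call itself is left out.

```python
import sqlite3

def store_tweets(conn, tweets):
    """Insert (tweet_id, created_at, text) rows and return the stored row count.

    Hypothetical schema: the patent text does not specify one.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "tweet_id TEXT PRIMARY KEY, created_at TEXT, text TEXT)"
    )
    # INSERT OR IGNORE skips rows whose tweet_id is already present.
    conn.executemany(
        "INSERT OR IGNORE INTO tweets (tweet_id, created_at, text) "
        "VALUES (?, ?, ?)",
        tweets,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
```

Using the tweet identifier as the primary key means that re-running a collection job does not store the same tweet twice.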

[0026] Step 2, text data preprocessing, which mainly includes: (1) data deduplication, removing tweets whose content is essentially or completely identical; (2) text preprocessing, first splitting the text into sentences, then performing Chinese word segmentation on each sentence to meet the needs of subsequent analysis;
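The preprocessing in Step 2 could be sketched roughly as follows. The normalization rule used to decide that two tweets are "essentially identical" (lowercasing, dropping URLs, collapsing whitespace) is an assumption, since the source does not define it. The default tokenizer is a simple regex stand-in; for Chinese text a real segmenter such as `jieba.lcut` could be passed as `segmenter`, which is likewise an assumption rather than the tool named in the patent.

```python
import re

def normalize(text):
    """Lowercase, drop URLs, collapse whitespace (assumed similarity rule)."""
    text = re.sub(r"https?://\S+", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def deduplicate(tweets):
    """Keep the first tweet of each group sharing the same normalized text."""
    seen = set()
    unique = []
    for tweet in tweets:
        key = normalize(tweet)
        if key and key not in seen:
            seen.add(key)
            unique.append(tweet)
    return unique

def split_sentences(text):
    """Split after sentence-final punctuation (Western or Chinese)."""
    parts = re.split(r"(?<=[.!?。！？])(?![.!?。！？])\s*", text)
    return [p.strip() for p in parts if p.strip()]

def tokenize(sentence, segmenter=None):
    """Segment a sentence into words; `segmenter` is a pluggable callable."""
    if segmenter is not None:
        return list(segmenter(sentence))
    return re.findall(r"\w+", sentence)  # placeholder for a real segmenter

def preprocess(tweets, segmenter=None):
    """Return, per unique tweet, a list of token lists (one per sentence)."""
    return [
        [tokenize(s, segmenter) for s in split_sentences(t)]
        for t in deduplicate(tweets)
    ]
```

Keeping the segmenter pluggable lets the same pipeline run on English test data and on Chinese text without changing the deduplication or sentence-splitting code.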

[0027] Step 3, event message recognition combined with element extraction, which mainly includes event message recognition based on trigger word matching, time expression recognition, lexicon-based place name entity recognition, lexicon-based subject extraction, and activity topic extraction.
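As a rough illustration of Step 3, the sketch below treats a sentence as an event message only if it contains a trigger word, then fills in time, place, and subject by rule. The lexicons (`TRIGGERS`, `PLACES`, `SUBJECTS`) and the time pattern `TIME_RE` are tiny hypothetical stand-ins, not the patent's actual resources, and activity topic extraction is omitted because the visible text does not describe how it is done.

```python
import re

# Hypothetical toy lexicons; a real system would load much larger ones.
TRIGGERS = {"strike", "march", "sit-in", "protest"}
PLACES = {"cairo", "tunis"}
SUBJECTS = {"workers", "students"}

# Assumed time patterns: a few relative words, ISO dates, and HH:MM times.
TIME_RE = re.compile(
    r"\b(?:today|tomorrow|yesterday|\d{4}-\d{2}-\d{2}|\d{1,2}:\d{2})\b",
    re.IGNORECASE,
)

def extract_event(sentence):
    """Return the event elements of a sentence, or None if no trigger matches."""
    tokens = re.findall(r"[\w-]+", sentence.lower())
    triggers = [t for t in tokens if t in TRIGGERS]
    if not triggers:
        return None  # not an event message
    return {
        "trigger": triggers[0],
        "time": TIME_RE.findall(sentence),
        "place": [t for t in tokens if t in PLACES],
        "subject": [t for t in tokens if t in SUBJECTS],
    }
```

For example, `extract_event("Workers strike in Cairo on 2016-01-25")` yields a record with trigger `strike`, time `2016-01-25`, place `cairo`, and subject `workers`, while a sentence with no trigger word returns `None`.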

[0028] S...


Abstract

The invention discloses a method for extracting events from Twitter text, which mainly includes the following steps: step 1, collecting tweet data from the Twitter platform and saving it to a database; step 2, deduplicating the data and preprocessing the text; step 3, performing event message identification jointly with element extraction, including event message identification based on trigger word matching, time expression identification, lexicon-based place name entity identification, lexicon-based subject extraction, and activity topic extraction. The invention uses a rule-based event element extraction method: for each event it marks the event elements, mainly the time of occurrence, place, subject, and activity topic, so that events are extracted from the collected tweets more accurately and can be detected and discovered quickly.

Description

Technical field

[0001] The invention belongs to the field of information technology and relates to a method for extracting Twitter text events.

Background technique

[0002] A large number of different social events occur in the world every day, bringing benefits and harms and affecting daily life and social order to varying degrees. Among them, mass protests such as marches, sit-ins, strikes, school strikes, city strikes, and "occupations" often affect social stability to a greater or lesser extent, and some even cause turmoil with disastrous consequences. Take the "Arab Spring" movement that broke out in North Africa and the Middle East a few years ago as an example. This event triggered two years of turmoil across several countries, leaving countless people displaced, trapped, or even caught up in wars in which they lost their lives, and its aftereffects continue in the Middle East and North Africa.

[0003] Because it is closely related to human life, peop...


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/33, G06F16/335, G06F40/30
Inventor: 郭利翔, 张鑫, 丁兆云, 李沛, 王晖, 邓经升, 乔凤才, 程佳军, 沈大勇, 曹建平
Owner NAT UNIV OF DEFENSE TECH