Keyword extraction method and system for financial and economic messages

A keyword extraction method and technology applied to keyword extraction from financial newsletters. It addresses the inaccurate keywords, high misclassification ratio, and low recall rate of existing keyword algorithms.

Active Publication Date: 2021-03-16
新华智云科技有限公司

AI Technical Summary

Problems solved by technology

[0002] At present, most text keyword extraction algorithms are unsupervised. Existing keyword extraction methods include methods based on statistical features, methods based on word graph features, methods based on topic models, and combinations of these. However, these methods rely heavily on the performance of the Chinese word segmenter, which misclassifies a high proportion of proper nouns in the financial field, so the extracted keywords are inaccurate. For short texts such as financial newsletters, or even ultra-short texts of little more than a dozen characters, the statistical, word graph, and topic features that existing solutions use are relatively weak. Keywords extracted with existing solutions cannot effectively express the core message of a financial newsletter, which leads to a low recall rate for the keyword algorithm.
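The weakness of statistical features on ultra-short text can be illustrated with a small sketch. The headline below and its segmentation are hypothetical examples, not data from the patent, and `term_frequencies` is an illustrative helper rather than any method named in the source. Because each token in a headline of this length typically occurs once, a pure term-frequency score cannot rank one token above another.

```python
from collections import Counter
from typing import Dict, List


def term_frequencies(tokens: List[str]) -> Dict[str, float]:
    """Relative frequency of each token within a single text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {token: count / total for token, count in counts.items()}


# Hypothetical, already-segmented financial headline (an assumption for
# illustration only; the patent does not give this example).
headline = ["央行", "宣布", "下调", "存款", "准备金率"]

tf = term_frequencies(headline)

# Every token occurs exactly once, so all scores tie and the statistic
# carries no signal about which tokens are the keywords.
all_scores_tie = len(set(tf.values())) == 1
print(tf)
print("all scores tie:", all_scores_tie)
```

The scores also depend entirely on the segmentation supplied: if the segmenter splits a proper noun incorrectly, the wrong units are scored.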


Embodiment Construction

[0030] The following description serves to disclose the present invention to enable those skilled in the art to carry out the present invention. The preferred embodiments described below are only examples, and those skilled in the art can devise other obvious variations. The basic principles of the present invention defined in the following description can be applied to other embodiments, variations, improvements, equivalents and other technical solutions without departing from the spirit and scope of the present invention.

[0031] Those skilled in the art should understand that in the disclosure of the present invention, the orientation or positional relationship indicated by the terms "vertical", "transverse", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc. is based on the orientation or positional relationship shown in the drawings, which are only for the convenience of describing the present invention...

Abstract

The invention discloses a keyword extraction method and system. The method comprises the following steps: obtaining financial short message text data and labeling the financial text; inputting the labeled text data into a pre-trained convolutional neural network to obtain font (glyph) embedding feature vectors of the text characters; inputting the labeled text data into a pre-trained RoBERTa-wwm model to obtain semantic embedding feature vectors of the text characters; concatenating the font embedding feature vector and the semantic embedding feature vector and reducing the dimensionality to obtain a combined character feature vector; inputting the combined character feature vector into a conditional random field layer and obtaining output character labels by adjusting the training parameters; and extracting keywords from the character labels. The method and system use a pre-trained Chinese RoBERTa-wwm model to represent the character vectors of financial short message texts and combine this representation with the five-stroke font features of Chinese characters, which can improve the accuracy of keyword extraction.
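The last three steps of the abstract (combining the two per-character vectors, decoding labels with a conditional random field, and reading keywords off the character labels) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name the tag scheme or the dimensionality-reduction method, so the B/I/O tags and the linear projection here are assumptions, and all function names, scores, and the sample text are hypothetical. The CNN and the RoBERTa-wwm model that would produce the real vectors are not shown.

```python
from typing import List, Sequence


def splice_and_project(font_vec: Sequence[float],
                       semantic_vec: Sequence[float],
                       weights: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate the two vectors, then reduce dimension with a linear map.
    (A linear projection is an assumption; the source only says
    'splicing and dimensionality reduction'.)"""
    combined = list(font_vec) + list(semantic_vec)
    return [sum(w * x for w, x in zip(row, combined)) for row in weights]


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Best tag-index path for a linear-chain CRF.
    emissions[t][k]: score of tag k at position t.
    transitions[p][k]: score of moving from tag p to tag k."""
    n_tags = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for emission in emissions[1:]:
        step_scores, step_back = [], []
        for cur in range(n_tags):
            prev = max(range(n_tags),
                       key=lambda p: scores[p] + transitions[p][cur])
            step_back.append(prev)
            step_scores.append(scores[prev] + transitions[prev][cur]
                               + emission[cur])
        scores = step_scores
        backpointers.append(step_back)
    best = max(range(n_tags), key=lambda k: scores[k])
    path = [best]
    for step_back in reversed(backpointers):
        best = step_back[best]
        path.append(best)
    path.reverse()
    return path


def keywords_from_tags(chars: Sequence[str], tags: Sequence[str]) -> List[str]:
    """Collect character spans labeled B (begin) followed by I (inside)."""
    keywords, current = [], []
    for ch, tag in zip(chars, tags):
        if tag == "B":
            if current:
                keywords.append("".join(current))
            current = [ch]
        elif tag == "I" and current:
            current.append(ch)
        else:  # "O", or a stray "I" with no open span
            if current:
                keywords.append("".join(current))
            current = []
    if current:
        keywords.append("".join(current))
    return keywords


TAGS = ["B", "I", "O"]  # assumed tag scheme
# Hypothetical scores: forbid O -> I so a keyword cannot start with "I".
transitions = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, -10.0, 0.0]]
emissions = [[0.0, 0.0, 2.0], [0.0, 1.0, 0.8], [3.0, 0.0, 0.0]]
decoded = [TAGS[i] for i in viterbi(emissions, transitions)]

keywords = keywords_from_tags(list("央行下调利率"),
                              ["B", "I", "O", "O", "B", "I"])
print(decoded, keywords)
```

In the second position the emission scores alone favor "I", but the transition penalty from "O" makes the decoder choose "O" instead. Scoring whole sequences in this way is the usual reason for placing a conditional random field layer on top of per-character features.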

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, in particular to a method and system for extracting keywords from financial newsletters.

Background

[0002] At present, most text keyword extraction algorithms are unsupervised. Existing keyword extraction methods include methods based on statistical features, methods based on word graph features, methods based on topic models, and combinations of these. However, these methods rely heavily on the performance of the Chinese word segmenter, which misclassifies a high proportion of proper nouns in the financial field, so the extracted keywords are inaccurate. For short texts such as financial newsletters, or even ultra-short texts of little more than a dozen characters, the text statistical features, word graph ...

Application Information

IPC (8): G06F16/951, G06F16/955, G06F40/216, G06F40/30, G06N3/04, G06N3/08
CPC: G06F16/951, G06F16/955, G06F40/30, G06F40/216, G06N3/08, G06N3/045, Y02D10/00
Inventor: 李明玉
Owner: 新华智云科技有限公司