Ancient poem automatic identification method based on deep learning

An automatic recognition technology based on deep learning, applied in character and pattern recognition, instruments, network data indexing, etc. It addresses problems such as vanishing gradients, reduced classification accuracy, and ancient poems being misidentified as modern Chinese, with the effect of reducing interference and reducing word segmentation operations.

Active Publication Date: 2019-08-30
FOCUS TECH

AI Technical Summary

Problems solved by technology

However, deep learning models applied to Chinese text classification have the following defects:
(1) Typos in the text reduce classification accuracy. This is especially obvious in the automatic recognition of ancient poems and prose, where a line containing a typo is easily misrecognized as modern Chinese.
(2) The text-length feature can easily mislead the classifier. Lines of ancient poems and prose are usually about five to ten characters long, so the classifier is likely to label any sentence of this length as ancient poetry.
(3) Text classification models usually need to learn the feature distribution of word vectors, so the accuracy and granularity of word segmentation limit their performance; moreover, ancient poems are not suitable for word segmentation.
(4) Models overfit very easily on small data sets, which lowers accuracy.
(5) Effective deep neural network models are difficult to train: exploding and vanishing gradients readily occur during training and prevent the model from converging.

Examples

Embodiment Construction

[0046] A method for automatic recognition of ancient poems and prose based on deep learning, characterized in that it includes the steps of collecting training corpus, data preprocessing, feature vector embedding, neural network training, and automatic recognition of ancient poems and prose, specifically:

[0047] Step 1, collect training corpus: use a crawler program to crawl ancient poems provided by Internet sites as the positive sample set; collect a corpus of modern Chinese sentences as the negative sample set; compute the sentence lengths of the ancient poems in the positive sample set and, from the range in which these lengths are concentrated, select the length range that covers more than 95% of the sentences; use this range to adjust the distribution of sentence lengths in the negative sample set;
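The length statistics in Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `central_length_range` and `filter_by_length` are hypothetical, length is assumed to be measured in characters, "more than 95%" is approximated as "at least 95%", and the negative set is assumed to be adjusted by discarding sentences outside the range (the source does not say how the adjustment is made).

```python
import math


def central_length_range(sentences, coverage=0.95):
    """Return the narrowest (lo, hi) character-length range that contains
    at least `coverage` of the given sentences (hypothetical helper)."""
    if not sentences:
        raise ValueError("sentences must not be empty")
    lengths = sorted(len(s) for s in sentences)
    n = len(lengths)
    k = math.ceil(coverage * n)  # how many sentences the window must hold
    best = (lengths[0], lengths[-1])
    # Slide a window of k consecutive sorted lengths and keep the tightest one.
    for i in range(n - k + 1):
        lo, hi = lengths[i], lengths[i + k - 1]
        if hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best


def filter_by_length(sentences, lo, hi):
    """Keep only sentences whose length lies in [lo, hi].
    Assumption: this is one simple way to align the negative set's
    length distribution with the positive set's."""
    return [s for s in sentences if lo <= len(s) <= hi]
```

Calling `central_length_range` on the positive set and then `filter_by_length` on the negative set leaves the classifier with negatives of comparable length, so length alone is a weaker cue for the "ancient poetry" label.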

[0048] Count the number of sentences in the positive sample set and the negative sample set. If the numbers are not equal, ...

Abstract

The invention discloses a deep-learning-based method for automatically identifying ancient poems. The method comprises the steps of collecting training corpora, preprocessing the data, embedding feature vectors, training a neural network, and automatically identifying ancient poems. A deep neural network model is constructed in the form of a text classifier to recognize automatically whether a sentence is ancient poetry, while effectively preventing wrongly written characters from reducing recognition accuracy. The method can meet the needs of application scenarios such as poetry quality inspection, classified management of literary works, and automatic collection of ancient poems.
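One way to prepare text for the feature vector embedding step without word segmentation is to map each character to an integer id. The sketch below is an assumption made for illustration: the source does not give the encoding details, and the names `build_char_vocab`, `encode`, `PAD` and `UNK` are hypothetical.

```python
PAD, UNK = 0, 1  # reserved ids for padding and unseen characters (assumed)


def build_char_vocab(sentences):
    """Assign an integer id to every distinct character, in order of first use."""
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for sentence in sentences:
        for ch in sentence:
            vocab.setdefault(ch, len(vocab))
    return vocab


def encode(sentence, vocab, max_len):
    """Convert a sentence to a fixed-length list of character ids.
    Longer sentences are truncated, shorter ones are right-padded with PAD."""
    ids = [vocab.get(ch, UNK) for ch in sentence[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


vocab = build_char_vocab(["床前明月光", "明月"])
ids = encode("明月照", vocab, max_len=5)
```

The resulting id sequences could then be fed to an embedding layer of a neural network; because each character is encoded independently, no word segmenter is involved.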

Description

Technical field

[0001] The invention relates to the field of natural language processing, in particular to an automatic recognition method for ancient poetry based on deep learning.

Background

[0002] In recent years, natural language processing technology has become increasingly integrated with linguistics and literature. Text classification has been applied effectively to the automatic recognition of emotions and intentions in human language, but it has rarely been applied to the automatic recognition of ancient poetry. Many application scenarios call for automatic recognition of ancient poetry and prose, such as poetry quality inspection (assessing the quality of ancient poetry and prose written by people or by programs); classified management of literary works (automatically separating ancient poetry and prose from modern literary works); the program automatically collects a large number of ancient poems an...

Application Information

IPC(8): G06K9/46, G06K9/62, G06F16/951, G06F17/27
CPC: G06F16/951, G06V10/454, G06F40/30, G06F18/214, G06F18/24
Inventors: 张灿, 殷亚云
Owner: FOCUS TECH