Natural language theme classification method and device
A natural language topic classification technology, applied in neural learning methods, text database clustering/classification, text database querying, and related fields. It addresses the inability of existing approaches to select features adaptively and the resulting limits on classification accuracy, with the effect of improving classification accuracy and avoiding feature dependencies.
Examples
Embodiment 1
[0060] This embodiment provides a method for classifying natural language topics. Topic classification of natural language text is a skill that students are currently expected to master; for example, classifying ancient poems by theme helps students understand the main idea of each poem. The scheme of the present application can therefore serve as a teaching aid.
[0061] The natural language topic classification method includes a training phase and a classification phase.
[0062] Figure 1 is a flow chart of the training phase in the natural language topic classification method of Embodiment 1.
[0063] Referring to Figure 1, the training phase includes:
[0064] Step 101: Obtain natural language text segments of known topics as a sample set.
[0065] Step 102: Extract the words with the highest frequency of occurrence in the sample set to obtain multiple feature words; specifically: using the Sunday algorithm to ret...
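The description of Step 102 is cut off above, so the following is only a minimal sketch of how high-frequency feature words might be picked with the Sunday string-matching algorithm. The function names `sunday_count` and `top_feature_words`, the candidate word list, and the parameter `k` are all hypothetical and are not taken from the patent text.

```python
from collections import Counter


def sunday_count(text: str, pattern: str) -> int:
    """Count occurrences (overlaps included) of pattern in text with Sunday's algorithm."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0
    # Shift table: distance from each character's last position in the
    # pattern to one past the pattern's end.
    shift = {ch: m - i for i, ch in enumerate(pattern)}
    count, pos = 0, 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            count += 1
        nxt = pos + m  # character just past the current window
        if nxt >= n:
            break
        # Characters absent from the pattern let the window jump m + 1 places.
        pos += shift.get(text[nxt], m + 1)
    return count


def top_feature_words(samples, candidates, k):
    """Return the k candidate words that occur most often across all samples.

    Assumption: a candidate vocabulary is already available; the patent
    text does not say how candidates are produced.
    """
    totals = Counter()
    for word in candidates:
        totals[word] = sum(sunday_count(text, word) for text in samples)
    return [word for word, _ in totals.most_common(k)]


if __name__ == "__main__":
    samples = ["moon moon river", "moon wine river"]
    print(top_feature_words(samples, ["moon", "river", "wine", "sword"], k=2))
```

The Sunday algorithm looks at the character immediately after the current window to decide how far to shift, which is why a character missing from the pattern allows a jump of the full pattern length plus one.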
Embodiment 2
[0088] Embodiment 2 describes the technical solution of the present invention in detail, taking ancient poem texts as an example.
[0089] As a special type of natural language text, ancient poetry differs from modern text in sentence structure, format, and expression, and its content is implicit, obscure, and extremely condensed. In addition, monosyllabic words make up the majority of ancient poems, which complicates feature selection. The present invention forms the most efficient feature spectrum (a feature spectrum is the collection of the selected features) by adaptively selecting the features most useful for text classification. Because the classification task is carried out on the basis of the selected features, feature selection should in turn be influenced by how well the final task is completed; that is, the quality of the classification directly affects the selection of features, so it is very suitabl...
Embodiment 3
[0123] Embodiment 3 provides a natural language topic classification device, comprising:
[0124] A sample acquisition device, configured to acquire natural language text segments of known topics as a sample set;
[0125] A high-frequency word extraction device, configured to extract the words with the highest frequency of occurrence in the sample set to obtain a plurality of feature words;
[0126] A vector representation device, configured to represent each of the feature words as a vector to obtain a plurality of feature vectors;
[0127] A similarity calculation device, configured to calculate the similarity between any two feature vectors to obtain a similarity set; the similarity set reflects the characteristics of, and connections among, the feature vectors;
[0128] A training and classification device, configured to input the degree of similarity, the topics, and the feature words corresponding to each topic into the preset neural network structure for t...
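The similarity calculation over "any two feature vectors" can be sketched as follows. The text shown here does not name a similarity measure, so cosine similarity is an assumption, and the names `cosine_similarity` and `similarity_set`, as well as the toy vectors, are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_set(feature_vectors):
    """Map every unordered pair of feature words to the similarity of their vectors.

    feature_vectors: dict of feature word -> vector (list of floats).
    Assumption: cosine similarity stands in for the unspecified measure.
    """
    return {
        (a, b): cosine_similarity(feature_vectors[a], feature_vectors[b])
        for a, b in combinations(feature_vectors, 2)
    }


if __name__ == "__main__":
    # Toy two-dimensional vectors, chosen only for illustration.
    vectors = {"moon": [1.0, 0.0], "river": [0.0, 1.0], "wine": [1.0, 0.0]}
    for pair, score in similarity_set(vectors).items():
        print(pair, round(score, 3))
```

For n feature words this produces n(n-1)/2 pairwise scores, which is one plausible form for the similarity set that the training and classification device takes as input.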