An Improved Classification Method of Expanding Feature Vectors of Short Text Words
A technique for expanding feature vectors and feature vector sets, applicable to text database clustering/classification, text database querying, and unstructured text data retrieval. It addresses the problem that short texts yield few feature vectors, improves classification performance, and reduces the degree of bias.
Embodiment Construction
[0048] To make the purpose, implementation, and advantages of the present invention clearer, the technical solutions of the present invention are described in detail below with reference to the accompanying drawings:
[0049] The flow of the improved classification method provided by the present invention, which expands short text word feature vectors based on the Word2vec model, is shown in Figure 1. It specifically includes the following steps:
[0050] Step 1. Collect a corpus to serve as the short text training set and test set. For the short text training set, use a sorted and classified news corpus. The data set includes news headlines and news content: the news headline data set is used as the short text data set, and the news content data set is used as the background corpus data set.
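The data organization in Step 1 can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the record fields `title`, `content`, and `label`, the function name `build_datasets`, and the random split with a `test_ratio` parameter are all assumptions introduced here, since the source does not state how the corpus is stored or how the training and test sets are divided.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical record layout: each article is a dict with "title",
# "content", and "label" keys. The source does not specify a format.
Article = Dict[str, str]
LabeledText = Tuple[str, str]


def build_datasets(
    articles: List[Article],
    test_ratio: float = 0.2,
    seed: int = 0,
) -> Tuple[List[LabeledText], List[LabeledText], List[str]]:
    """Split a classified news corpus into short-text sets and a background corpus.

    Headlines (with their class labels) become the short text data set;
    article bodies become the background corpus. The random train/test
    split is an assumption, not something the source describes.
    """
    short_texts = [(a["title"], a["label"]) for a in articles]
    background = [a["content"] for a in articles]

    # Shuffle a copy with a fixed seed so the split is reproducible.
    shuffled = short_texts[:]
    random.Random(seed).shuffle(shuffled)

    # Keep at least one example in the test set.
    n_test = max(1, int(len(shuffled) * test_ratio))
    test_set = shuffled[:n_test]
    train_set = shuffled[n_test:]
    return train_set, test_set, background
```

Calling `build_datasets(articles)` on a list of such records returns the headline training set, the headline test set, and the list of article bodies to be used as the background corpus.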
[0051] Step 2. Preprocess the corpora, namely the short text training set, the background corpus, and the short text test set. The preprocessing includes the Chinese ...