Document feature extraction method based on distributed mutual information
A document feature extraction method based on distributed mutual information, applied in the field of document feature extraction and intended to address limits on data-processing scale and insufficient performance.
Embodiment Construction
[0042] The present invention will be described in further detail below in conjunction with the accompanying drawings.
[0043] As shown in Figure 1, a document feature extraction method based on distributed mutual information is provided. The method includes the following steps:
[0044] Step 1: Collect the documents and initialize them;
[0045] Step 2: Calculate the frequency of each segmented word in the documents and the mutual information value of each segmented word with respect to the different categories, in order to select the set of feature words;
[0046] Step 3: Calculate the weights of all feature words to form the final document vector set.
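Steps 2 and 3 can be illustrated with a minimal single-machine sketch. This excerpt does not give the exact formulas, so the sketch makes several assumptions: mutual information is estimated from document counts as MI(t, c) = log(P(t, c) / (P(t) P(c))), a word's score is its maximum MI over the categories, and the feature weights use TF-IDF. The function names `select_features` and `document_vectors` and the toy corpus are hypothetical and do not come from the patent.

```python
import math
from collections import Counter, defaultdict


def select_features(docs, labels, top_k):
    """Rank words by their maximum mutual information over all categories."""
    n_docs = len(docs)
    class_count = Counter(labels)   # documents per category
    doc_freq = Counter()            # documents containing each word
    joint = defaultdict(Counter)    # joint[word][category] = documents of that category containing word
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            doc_freq[term] += 1
            joint[term][label] += 1

    scores = {}
    for term, per_class in joint.items():
        # Assumed estimate: MI(t, c) = log(P(t, c) / (P(t) * P(c)))
        scores[term] = max(
            math.log(count * n_docs / (doc_freq[term] * class_count[c]))
            for c, count in per_class.items()
        )
    # Highest score first; ties are broken alphabetically for determinism
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k], doc_freq


def document_vectors(docs, features, doc_freq):
    """Weight each feature word with TF-IDF (an assumed scheme, not from the source)."""
    n_docs = len(docs)
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append([
            tf[t] / len(tokens) * math.log(n_docs / doc_freq[t]) if tokens else 0.0
            for t in features
        ])
    return vectors


# Toy corpus: documents are already segmented into words
docs = [
    ["ball", "goal", "team"],
    ["team", "match", "goal"],
    ["stock", "market", "team"],
    ["market", "bank", "stock"],
]
labels = ["sports", "sports", "finance", "finance"]

features, doc_freq = select_features(docs, labels, top_k=6)
vectors = document_vectors(docs, features, doc_freq)
```

In this toy corpus the word "team" appears in both categories, so it receives the lowest score and is the one word left out of the six selected features. A score of this form is known to favour rare words, which is one reason the word frequency mentioned in Step 2 may be used alongside it.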
[0047] In Step 1, initializing the documents includes word segmentation and simplification, and distributed representation of the documents.
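This excerpt does not spell out what simplification and distributed representation involve. One plausible reading, sketched below as an assumption, is that segmented words are filtered (for example by removing stop words) and the corpus is split into partitions that are counted independently and then merged. The stop-word list, the helper names, and the in-process partitions standing in for cluster nodes are all hypothetical.

```python
from collections import Counter
from functools import reduce

STOP_WORDS = {"the", "a", "of"}  # hypothetical stop-word list


def simplify(tokens):
    """Lower-case the segmented words and drop stop words (assumed simplification)."""
    return [t.lower() for t in tokens if t.lower() not in STOP_WORDS]


def partition(docs, n_parts):
    """Split the corpus round-robin; each part stands in for one worker node."""
    return [docs[i::n_parts] for i in range(n_parts)]


def count_partition(part):
    """Map step: count word occurrences within one partition."""
    counts = Counter()
    for tokens in part:
        counts.update(simplify(tokens))
    return counts


def merge(a, b):
    """Reduce step: combine the counts of two partitions."""
    return a + b


corpus = [["The", "cat"], ["a", "cat", "sat"], ["dog", "of", "the", "cat"]]
partial_counts = [count_partition(p) for p in partition(corpus, 2)]
total_counts = reduce(merge, partial_counts, Counter())
```

Because the merge is a plain sum, the totals do not depend on how many partitions are used, which is what allows the counting to scale out across nodes.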
[0048] Step 1 comprises the following sub-steps:
[0049] Step 1-1: Let $D = \{d_1, d_2, \ldots, d_j, \ldots, d_N\}$ denote the corpus, where $d_j$ denotes a document in the corpus and $N$ represents the num...