Multi-document automatic summarization method based on phrase topic modeling
A topic modeling and automatic summarization technology, applied in natural language data processing, special data processing applications, instruments, etc., can solve the problem that the influence of automatic summarization cannot be ignored
Embodiment Construction
[0057] To better explain the technical scheme of the present invention, the invention is further described below with reference to Figure 1.
[0058] The specific steps of this embodiment are as follows:
[0059] 1) Preprocess the sample multi-document set: use the Mallet natural language processing tool to segment the documents and obtain phrases and their frequencies of occurrence (phrase length is limited to at most 3). During this process, stop words (such as "the", "this") and invalid words (such as "wepurpose") are removed. Then construct the word vector space.
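The preprocessing step can be illustrated with a minimal sketch in plain Python. It is a stand-in for the idea only and does not use Mallet or its API. The stop-word and invalid-word lists, the function names, and the rule that a removed word ends the current phrase are all assumptions made for illustration; the text specifies only the length limit of 3 and the two categories of removed words.

```python
import re
from collections import Counter

# Hypothetical, illustrative lists; the source gives only a few examples.
STOP_WORDS = {"the", "this", "a", "an", "of", "and", "in", "to", "is"}
INVALID_WORDS = {"wepurpose"}
MAX_PHRASE_LEN = 3  # phrase length limit stated in the text


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_phrases(text, max_len=MAX_PHRASE_LEN):
    """Count every contiguous phrase of 1..max_len kept tokens.

    Assumption: a stop word or invalid word ends the current run,
    so no phrase spans a removed word.
    """
    counts = Counter()
    run = []  # current run of consecutive kept tokens
    for tok in tokenize(text) + [None]:  # None flushes the final run
        if tok is not None and tok not in STOP_WORDS and tok not in INVALID_WORDS:
            run.append(tok)
            continue
        for n in range(1, max_len + 1):
            for i in range(len(run) - n + 1):
                counts[" ".join(run[i:i + n])] += 1
        run = []
    return counts


def build_vector_space(docs):
    """Return (vocabulary, document-by-phrase count matrix)."""
    per_doc = [extract_phrases(d) for d in docs]
    vocab = sorted(set().union(*per_doc))
    matrix = [[counts[p] for p in vocab] for counts in per_doc]
    return vocab, matrix


if __name__ == "__main__":
    docs = ["The topic model ranks this sentence", "topic model"]
    vocab, matrix = build_vector_space(docs)
    for phrase, column in zip(vocab, zip(*matrix)):
        print(phrase, column)
```

Each row of the matrix is one document and each column is one phrase, so the result can serve as the bag-of-phrases input that a topic model expects.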
[0060] 2) Phrase topic modeling: Based on the LDA topic model, phrases rather than words are used as the unit of computation; the joint probability distribution of the documents is calculated and transformed into the phrase topic model. A schematic diagram of the phrase topic model is shown in Figure 2. Then use the Gib...
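For orientation, the joint distribution of standard LDA is written out below with each token read as a phrase rather than a single word. This is the textbook LDA form only; the exact phrase topic model of the invention is not shown in this excerpt and may differ. The symbols are assumptions introduced here: $K$ topics, $D$ documents, $N_d$ phrases in document $d$, Dirichlet hyperparameters $\alpha$ and $\beta$, per-document topic proportions $\theta_d$, per-topic phrase distributions $\varphi_k$, and $z_{d,n}$ the topic assigned to the $n$-th phrase $w_{d,n}$.

```latex
p(\mathbf{w}, \mathbf{z}, \boldsymbol{\theta}, \boldsymbol{\varphi} \mid \alpha, \beta)
  = \prod_{k=1}^{K} p(\varphi_k \mid \beta)
    \prod_{d=1}^{D} p(\theta_d \mid \alpha)
    \prod_{n=1}^{N_d} p(z_{d,n} \mid \theta_d)\,
                      p(w_{d,n} \mid \varphi_{z_{d,n}})
```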