Theme information-based text segmentation method
A text segmentation method based on topic information, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the inconvenience of researching and reading unsegmented text and makes retrieval more convenient.
Embodiment Construction
[0036] The present invention is further described below with reference to the accompanying drawings and embodiments.
[0037] As shown in Figure 1, a text segmentation method based on topic information can be divided into the following five processes:
[0038] Step 1: preprocess the input text and the training set to obtain a series of sentences composed of words. This step consists of two sub-steps.
[0039] 101. Split the input text at ending punctuation marks, where an ending punctuation mark is any symbol that can be used at the end of a Chinese sentence. This yields a series of separate sentences, each on its own line. The training set has the format "sentence-topic tag", where both the sentence and the topic tag are Chinese text; the same splitting is applied to the sentence part of each training entry.
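Sub-step 101 can be illustrated with a minimal Python sketch. The set of ending marks in `END_MARKS`, the tab separator between sentence and topic tag, and the function names `split_sentences` and `split_training_line` are all assumptions made for illustration; the description names neither the exact punctuation set nor the delimiter used in the training set.

```python
import re

# Assumed set of sentence-ending marks. The description only says "all symbols
# that can be used at the end of a Chinese sentence", so adjust as needed.
END_MARKS = "。！？…"

# One sentence = a run of non-ending characters plus any ending marks after it.
_SENTENCE_RE = re.compile("[^" + END_MARKS + "]+[" + END_MARKS + "]*")


def split_sentences(text):
    """Split text into sentences at ending punctuation, one list item each."""
    sentences = [m.group().strip() for m in _SENTENCE_RE.finditer(text)]
    return [s for s in sentences if s]


def split_training_line(line, sep="\t"):
    """Split one 'sentence<sep>topic tag' entry (separator is an assumption).

    The sentence part is split like ordinary input text, and every resulting
    sentence keeps the topic tag of the original entry.
    """
    sentence_part, topic = line.rstrip("\n").rsplit(sep, 1)
    return [(s, topic) for s in split_sentences(sentence_part)]


if __name__ == "__main__":
    for sentence in split_sentences("今天天气很好。我们去公园吧！你去吗？"):
        print(sentence)  # each sentence is printed on its own line
```

The ending marks are kept attached to each sentence here; the later filtering step removes punctuation anyway, so dropping them at this stage would work equally well.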
[0040] 102. Perform word segmentation on each sentence and remove numbers, stop words, punctuation marks, and non-Chinese special symbols. Get a seque...
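The filtering part of sub-step 102 might look like the following Python sketch. It assumes the word segmenter is supplied by the caller as a function `segment` (the description does not name a segmentation tool), and the stop-word list `STOP_WORDS` and the names `clean_tokens` and `preprocess` are hypothetical placeholders. Keeping only tokens made entirely of CJK ideographs removes Arabic digits, punctuation, and non-Chinese symbols in one pass; numbers written with Chinese numerals would need an additional rule.

```python
import re

# A token is kept only if it consists entirely of CJK Unified Ideographs
# (basic block U+4E00..U+9FFF). This drops digits, punctuation, Latin text
# and other non-Chinese symbols.
_CHINESE_ONLY = re.compile("[\u4e00-\u9fff]+")

# Tiny illustrative stop-word list; a real list would be much larger.
STOP_WORDS = frozenset({"的", "了", "是", "在", "和"})


def clean_tokens(tokens, stop_words=STOP_WORDS):
    """Remove stop words and every token that is not purely Chinese."""
    return [
        t for t in tokens
        if _CHINESE_ONLY.fullmatch(t) and t not in stop_words
    ]


def preprocess(sentence, segment):
    """Segment one sentence with the caller-supplied segmenter, then filter."""
    return clean_tokens(segment(sentence))


if __name__ == "__main__":
    # Tokens as a word segmenter might return them (hand-written example).
    tokens = ["今天", "的", "气温", "是", "25", "度", "，", "OK", "。"]
    print(clean_tokens(tokens))
```

Passing the segmenter in as an argument keeps the sketch independent of any particular library; for a quick character-level test, `preprocess("我在家。", list)` treats each character as a token.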