Intelligent evaluation method for text clustering based on hybrid clustering
A hybrid clustering method, applied in the field of text clustering, that addresses the problems of slow running speed, redundant feature words, and a large influence of document-set quality on the result.
Examples
Embodiment 1
[0064] As shown in Figure 1, this embodiment provides a method for intelligent evaluation of text clustering based on hybrid clustering, including the following steps:
[0065] S1: Perform data preprocessing on the original text set X, including word segmentation, removal of stop words, etc., to obtain the set D of all feature words in the original text set;
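The preprocessing in S1 can be sketched as follows. This is only an illustration under stated assumptions: the source does not name a segmenter or a stop-word list, so a regular-expression tokenizer stands in for word segmentation, and `STOP_WORDS` and `preprocess` are hypothetical names.

```python
import re

# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "to", "in"}

def preprocess(texts, stop_words=STOP_WORDS):
    """Return (tokenized_docs, feature_words) for a list of raw strings."""
    tokenized = []
    for text in texts:
        # Stand-in for word segmentation: lowercase, then split on non-word characters.
        tokens = re.findall(r"\w+", text.lower())
        # Drop stop words, keeping the remaining tokens in order.
        tokenized.append([t for t in tokens if t not in stop_words])
    # D: every distinct feature word that survives preprocessing.
    features = sorted({t for doc in tokenized for t in doc})
    return tokenized, features

X = ["The cat sat in the hat.", "A dog and a cat."]
docs, D = preprocess(X)
```

For text in a language without whitespace word boundaries, the regular expression would need to be replaced by a dedicated word segmenter.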
[0066] S2: Perform the first feature selection: according to the set ratio, delete from D the feature words whose document frequency (DF) is particularly high or particularly low, obtaining the roughly selected feature subset D′. Removing these words reduces feature redundancy, which lowers the feature dimension and improves clustering accuracy. In this embodiment, the maximum DF is set to 0.15 and the minimum DF is set to 0.0002.
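A minimal sketch of the DF-based rough selection in S2 is given below. It assumes that DF is measured as the fraction of documents containing a word and that both bounds are inclusive; the source does not state either point. The function name `df_filter` and the toy data are hypothetical.

```python
def df_filter(tokenized_docs, max_df=0.15, min_df=0.0002):
    """Keep feature words whose document-frequency ratio lies in [min_df, max_df]."""
    n_docs = len(tokenized_docs)
    doc_count = {}
    for doc in tokenized_docs:
        for word in set(doc):  # count each word at most once per document
            doc_count[word] = doc_count.get(word, 0) + 1
    # Assumption: DF is the share of documents that contain the word.
    return sorted(w for w, c in doc_count.items() if min_df <= c / n_docs <= max_df)

# Toy corpus of 10 documents: "common" appears in all of them,
# "other" in 2 of them, and "topic" in 1 of them.
toy_docs = [["common", "topic"]] + [["common", "other"]] * 2 + [["common"]] * 7
D_prime = df_filter(toy_docs)
```

With the thresholds from this embodiment, only "topic" (DF 0.1) is kept in the toy corpus; "common" (DF 1.0) and "other" (DF 0.2) exceed the maximum and are removed.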
[0067] S3: Use the TF-IDF method to calculate the corresponding weights of all texts in the original text set X, and express all the texts in the original text set X as vec...
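TF-IDF weighting of the kind named in S3 can be illustrated with the sketch below. The source text is truncated here and does not give the exact formula, so this assumes one common variant: term frequency normalized by document length, multiplied by log(N / df). The function name `tfidf_vectors` and the sample data are hypothetical.

```python
import math

def tfidf_vectors(tokenized_docs, features):
    """Return one TF-IDF weight vector per document, ordered like `features`."""
    n_docs = len(tokenized_docs)
    # Number of documents that contain each feature word.
    df = {w: sum(1 for doc in tokenized_docs if w in doc) for w in features}
    vectors = []
    for doc in tokenized_docs:
        vec = []
        for w in features:
            # Assumed variant: tf = count / document length, idf = log(N / df).
            tf = doc.count(w) / len(doc) if doc else 0.0
            idf = math.log(n_docs / df[w]) if df[w] else 0.0
            vec.append(tf * idf)
        vectors.append(vec)
    return vectors

sample_docs = [["cat", "sat", "hat"], ["dog", "cat"]]
sample_features = ["cat", "dog", "hat", "sat"]
vectors = tfidf_vectors(sample_docs, sample_features)
```

In this variant a word that occurs in every document, such as "cat" above, receives weight 0 because its inverse document frequency is log(1).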
Embodiment 2
[0185] This embodiment provides a computer-readable storage medium. The storage medium can be a ROM, RAM, magnetic disk, optical disk, or similar medium. The storage medium stores one or more programs; when the programs are executed by a processor, the hybrid clustering-based text clustering intelligent evaluation method of Embodiment 1 is implemented.
Embodiment 3
[0187] This embodiment provides a computing device, which may be a desktop computer, a notebook computer, a smartphone, a PDA handheld terminal, a tablet computer, or another terminal device with a display function. The computing device includes a processor and a memory. The memory stores one or more programs, and when the processor executes the programs stored in the memory, the hybrid clustering-based text clustering intelligent evaluation method of Embodiment 1 is implemented.