Short text clustering-based labeling system and method
A labeling system and method based on short text clustering, which applies a text clustering algorithm to short texts. It addresses problems such as poor accuracy of results, low labeling efficiency, and high communication costs, with the effect of ensuring correctness, saving communication costs, and improving labeling efficiency.
Examples
Embodiment 1
[0047] As shown in Figure 2, a specific embodiment of the present invention discloses a labeling system based on short text clustering, including an input module, a text clustering algorithm module, a result display module, a quick labeling module, and an output module.
[0048] Optionally, in this embodiment, the input module, the text clustering algorithm module, the result display module, the quick labeling module, and the output module are connected sequentially. Other connection arrangements can also be used to achieve the same effect; those skilled in the art will understand this, so it is not repeated here.
[0049] The labeling system based on short text clustering runs on a computer. The input module receives the text to be processed (the file to be processed) imported by the user and converts it into at least one single-line subtext. After this operation, each line represents a single-line subtext to be processed, and...
Embodiment 2
[0058] As shown in Figure 3, as an optimization of the above embodiment, the labeling system based on short text clustering may also include a multi-text alignment module. Optionally, in this embodiment, the multi-text alignment module is placed between the text clustering algorithm module and the result display module. It can also be placed elsewhere to implement the corresponding function; those skilled in the art will understand this, so it is not repeated here.
[0059] The multi-text alignment module vertically aligns all single-line subtexts in each group output by the text clustering algorithm module, that is, it places text that is the same across different single-line subtexts in the same column as far as possible. The alignment results (the vertically aligned single-line subtexts of each group) are sent to the result display module and displayed group by group, so that users can quickly scan all of the text vertically.
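The idea of placing shared text in the same column can be illustrated with a minimal Python sketch. It aligns only two single-line subtexts at a time, using the standard library's difflib.SequenceMatcher; the function name align_pair and the gap character are hypothetical, and the excerpt does not state which alignment algorithm the module actually uses or how it extends to a whole group.

```python
from difflib import SequenceMatcher


def align_pair(a: str, b: str, gap: str = "_") -> tuple[str, str]:
    """Pad two strings with gap characters so matching runs share columns.

    Illustrative pairwise alignment only; not the patent's multi-text method.
    """
    out_a, out_b = [], []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for _tag, i1, i2, j1, j2 in matcher.get_opcodes():
        seg_a, seg_b = a[i1:i2], b[j1:j2]
        # Pad the shorter segment so both outputs stay the same length.
        width = max(len(seg_a), len(seg_b))
        out_a.append(seg_a.ljust(width, gap))
        out_b.append(seg_b.ljust(width, gap))
    return "".join(out_a), "".join(out_b)


if __name__ == "__main__":
    top, bottom = align_pair("order 1001 shipped", "order 12 shipped today")
    print(top)
    print(bottom)
```

Printing the two returned strings one above the other in a monospaced font shows the common parts stacked vertically, which is the visual effect the result display module relies on.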
[0060] Optionally, the i...
Embodiment 3
[0074] As shown in Figure 6, this embodiment provides a labeling method that uses the labeling system based on short text clustering described in Embodiment 2, including the following steps:
[0075] S1. In the input module, preprocess the input text to be processed. Convert the text to be processed into at least one single-line subtext, then remove empty lines and duplicates from all single-line subtexts, that is, remove empty text and single-line subtexts whose text is identical. After this step, the content of every single-line subtext is guaranteed to be different.
[0076] S2. In the text clustering algorithm module, run clustering analysis on all single-line subtexts. A hierarchical clustering algorithm places similar single-line subtexts in the same group, and the grouping results and the text similarity of the single-line subtexts within each group are adjusted by modifying the edit distance, so as to reduce the amount of text that has to be read.
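Step S1 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name preprocess is hypothetical, and stripping leading and trailing whitespace before comparing lines is an assumption the text does not state.

```python
def preprocess(raw_text: str) -> list[str]:
    """Split text into single-line subtexts, dropping empty and duplicate lines.

    The order of first occurrence is preserved.
    """
    seen: set[str] = set()
    subtexts: list[str] = []
    for line in raw_text.splitlines():
        line = line.strip()  # assumption: surrounding whitespace is not significant
        if not line or line in seen:
            continue  # skip empty text and exact duplicates
        seen.add(line)
        subtexts.append(line)
    return subtexts


if __name__ == "__main__":
    print(preprocess("reset password\n\nreset password\ncancel order\n"))
```

Every element of the returned list is distinct, which matches the guarantee stated at the end of S1.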
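Step S2 can be sketched as follows, under stated assumptions: the sketch uses the plain Levenshtein edit distance, complete-linkage agglomerative clustering, and a hypothetical max_distance threshold as the adjustable parameter. The excerpt does not specify the linkage criterion or the exact way the edit distance is modified, so these choices are illustrative only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cluster(lines: list[str], max_distance: int) -> list[list[str]]:
    """Bottom-up hierarchical clustering with complete linkage.

    Groups keep merging while the closest pair of groups has a maximum
    pairwise edit distance of at most max_distance (hypothetical parameter).
    """
    groups = [[s] for s in lines]

    def linkage(g1: list[str], g2: list[str]) -> int:
        return max(edit_distance(x, y) for x in g1 for y in g2)

    while len(groups) > 1:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = linkage(groups[i], groups[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break  # no remaining pair of groups is similar enough
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups


if __name__ == "__main__":
    sample = ["order 1001 shipped", "order 1002 shipped",
              "login failed for bob", "login failed for amy"]
    for group in cluster(sample, max_distance=3):
        print(group)
```

Raising max_distance yields fewer, looser groups and lowering it yields more, tighter ones, which is one way to read the adjustment of grouping results described in S2. The sketch recomputes distances on every pass and is meant for small inputs only.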
[0077] S3....