Web page text classification algorithm research based on web page link analysis and support vector machine
The method combines web page link analysis with a support vector machine (SVM) for web page text classification. It addresses problems such as inconsistent classification results, slow classification speed, and reduced classification accuracy, and aims for lower memory requirements, shorter classification time, and faster learning.
Embodiment
[0033] A specific implementation of the SVM topic classification method based on text similarity feedback according to the present invention is as follows. The mmseg4j word segmentation system is used for word segmentation, and training and testing of the SVM model are implemented with the e1071 package for R. The kernel function is the RBF (radial basis function) kernel. Web pages from Changsha Dianwei.com are classified: gourmet food is the top-level category, with nine subcategories: Hunan cuisine, farm cuisine, home cooking, hot pot, Sichuan cuisine, Cantonese cuisine, snacks, seafood, and private kitchen. Of these, 5,000 web pages are used as the training set and 11,500 web page texts as the test set. Preprocessing mainly consists of segmenting the web page text, removing noise that is irrelevant to classification, and removing stop words. For example, the content of the webpage text is "This ...
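The preprocessing steps and the RBF kernel named above can be illustrated with a minimal Python sketch. It is not the patent's implementation, which uses mmseg4j and the e1071 package for R. It assumes a simple regular-expression tokenizer in place of mmseg4j, a hypothetical stop-word list, and an arbitrary `gamma` value; the function names `preprocess` and `rbf_kernel` are illustrative only. The kernel computed is K(x, y) = exp(-gamma * ||x - y||^2) over term-count vectors.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def preprocess(html):
    """Remove markup noise, split the text into tokens, and drop stop words.

    A regex tokenizer stands in for the mmseg4j segmenter (assumption).
    """
    # Drop script/style blocks, which carry no classification signal.
    text = re.sub(r"<script.*?</script>|<style.*?</style>", " ", html,
                  flags=re.S | re.I)
    # Drop the remaining tags.
    text = re.sub(r"<[^>]+>", " ", text)
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel between two sparse term-count vectors (dict-like).

    gamma=0.5 is an arbitrary illustrative value, not taken from the source.
    """
    keys = set(x) | set(y)
    sq_dist = sum((x.get(k, 0) - y.get(k, 0)) ** 2 for k in keys)
    return math.exp(-gamma * sq_dist)


# Usage: compare two short, made-up page snippets.
page_a = Counter(preprocess("<p>The hot pot is spicy</p>"))
page_b = Counter(preprocess("<div>Seafood and snacks</div>"))
similarity = rbf_kernel(page_a, page_b)
```

The kernel value is 1 for identical term-count vectors and falls toward 0 as the vectors diverge, which is the similarity measure an RBF-kernel SVM builds its decision function on.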