A Dynamic Crawling Method Based on Viterbi Algorithm for Web Page Classification and Sorting
The method combines the Viterbi algorithm with web page classification. Applied in the field of network data mining, it addresses the low accuracy and low efficiency of existing crawlers, with the aim of acquiring topic-relevant pages more accurately and more efficiently.
Embodiment 1
[0048] Embodiment 1: As shown in Figures 1-9, a dynamic crawler method that classifies and sorts web pages based on the Viterbi algorithm; the specific steps of the method are:
[0049] Step 1. Obtain the link relationship network. First take any web page related to the topic as the seed URL, then crawl the hyperlinks of the seed page to obtain its child links. This yields a relationship graph between parent links and child links. The link structure flow is shown in Figure 2.
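Step 1 can be illustrated with a minimal sketch, which is not the patent's implementation. It assumes a breadth-first traversal, a caller-supplied `fetch` function that returns a page's HTML (or `None`), and a page limit `max_pages`. The names `LinkExtractor`, `build_link_graph`, and the in-memory `PAGES` table are hypothetical and exist only so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def build_link_graph(seed_url, fetch, max_pages=100):
    """Breadth-first crawl from seed_url; returns {parent: [child, ...]}."""
    graph = {}
    queue = deque([seed_url])
    while queue and len(graph) < max_pages:
        parent = queue.popleft()
        if parent in graph:
            continue  # already crawled
        html = fetch(parent)
        if html is None:
            graph[parent] = []  # page unavailable: no child links
            continue
        parser = LinkExtractor()
        parser.feed(html)
        children = []
        for href in parser.hrefs:
            child = urljoin(parent, href)  # resolve relative links
            if child not in children:
                children.append(child)
        graph[parent] = children
        queue.extend(c for c in children if c not in graph)
    return graph


# Hypothetical in-memory "web" standing in for real HTTP requests.
PAGES = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a>',
    "http://example.com/b": '<a href="/">home</a>',
}

graph = build_link_graph("http://example.com/", PAGES.get)
```

The resulting dictionary maps each parent link to its child links, which is one simple way to hold the parent-child relationship graph that the later steps consume.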
[0050] Step 2. Calculate the link value LV of each web page.
[0051] Step 2.1. Calculate the link value LV of the web page. The formula for calculating LV is:
[0052] Here, LN is the current number of incoming links of the web page. The number of incoming links is a dynamic value: as the crawler goes deeper, the number of incoming links of some web pages increases and gradually approaches the number of incoming links of webpages in the real...
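The dynamic nature of LN can be sketched as follows. This only illustrates how an incoming-link count grows as more parent pages are crawled; it does not compute LV, whose formula is not reproduced here. The class `InLinkCounter` and its methods are hypothetical, and counting each distinct parent at most once is an assumption of this sketch.

```python
from collections import defaultdict


class InLinkCounter:
    """Tracks, for each URL, the set of crawled pages that link to it."""

    def __init__(self):
        self.sources = defaultdict(set)

    def record_page(self, parent, children):
        """Register the child links found on a newly crawled parent page."""
        for child in children:
            # Assumption: a parent contributes at most one incoming link
            # per child, even if it is crawled again.
            self.sources[child].add(parent)

    def ln(self, url):
        """Current number of incoming links (LN) observed for url."""
        return len(self.sources.get(url, ()))


counter = InLinkCounter()
counter.record_page("p1", ["t", "x"])
first = counter.ln("t")           # LN after one parent has been crawled
counter.record_page("p2", ["t"])
counter.record_page("p2", ["t"])  # re-crawling p2 does not double count
second = counter.ln("t")          # LN after a second parent is discovered
```

Because `ln` is recomputed from the links seen so far, any value derived from it would need to be refreshed as the crawl proceeds.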