Method and system for automatically reconstructing site map of website
An automatic site map reconstruction technique, applied in fields such as special data processing applications, instruments, and electronic digital data processing. It addresses the problem that site maps are not sufficiently timely or comprehensive, and improves the SEO friendliness of the website.
Examples
Embodiment 1
[0034] Figure 1 is a flowchart of a method for automatically reconstructing a site map according to the present invention. The method specifically includes the following steps:
[0035] S1. Collection of website pages: starting from the homepage, collect the pages of the website in breadth-first order, to a depth of at most N levels (N=4 for small websites; N=5 for large websites). For large commercial websites, care should be taken to exclude high-volume user communication areas such as BBS forums, so that the crawler does not waste effort collecting large numbers of invalid pages.
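The collection step in S1 can be sketched as a depth-limited breadth-first traversal. This is a minimal illustration, not the patented implementation: the names `crawl_breadth_first`, `fetch_links`, and `blocked_substrings` are hypothetical, the homepage is assumed to count as level 1, each URL is assumed to be collected only once, and link extraction is injected as a function so the sketch runs without network access.

```python
from collections import deque
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Page(NamedTuple):
    url: str      # link of the page
    anchor: str   # anchor text of the link on the referring page
    depth: int    # level of the page (homepage assumed to be level 1)
    referer: str  # URL of the page that linked here


def crawl_breadth_first(
    home_url: str,
    fetch_links: Callable[[str], Iterable[Tuple[str, str]]],
    max_depth: int = 4,
    blocked_substrings: Tuple[str, ...] = ("/bbs",),
) -> List[Page]:
    """Collect pages level by level, up to max_depth levels (N in the text)."""
    seen = {home_url}
    queue = deque([Page(home_url, "", 1, "")])
    collected: List[Page] = []
    while queue:
        page = queue.popleft()
        collected.append(page)
        if page.depth >= max_depth:
            continue  # pages on the last level are kept but not expanded
        for link_url, anchor in fetch_links(page.url):
            if link_url in seen:
                continue  # assumption: collect each URL once
            if any(s in link_url for s in blocked_substrings):
                continue  # skip shielded areas such as forums
            seen.add(link_url)
            queue.append(Page(link_url, anchor, page.depth + 1, page.url))
    return collected
```

In a real crawler, `fetch_links` would download the page and return its (URL, anchor text) pairs; the blocked-substring filter is one simple way to shield forum areas, and a production system would more likely use configurable URL patterns.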
[0036] S2. Extract a digital identifier from each collected webpage to obtain the unique digital identifier DOM_ID of each webpage, and store the pages grouped by category as key-value pairs <DOM_ID, PAGEs> to obtain the website page information collection MAP, where DOM_ID is the unique digital identifier of the page and PAGEs is the list of description information for the page; each item in the list ...
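The grouping in S2 can be sketched as a dictionary from DOM_ID to a list of page records, using the PAGE fields (url, anchor, depth, referer) given for the corresponding module in Embodiment 2. The excerpt does not say how DOM_ID is computed, so `structural_dom_id` below is a purely hypothetical stand-in that hashes the sequence of opening tag names; `build_page_map` and `html_of` are likewise illustrative names.

```python
import hashlib
from collections import defaultdict
from html.parser import HTMLParser
from typing import Callable, Dict, Iterable, List, NamedTuple


class Page(NamedTuple):
    url: str
    anchor: str
    depth: int
    referer: str


class _TagCollector(HTMLParser):
    """Records the name of every opening tag in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def structural_dom_id(html: str) -> str:
    """HYPOTHETICAL DOM_ID: hash of the opening-tag sequence of the page."""
    collector = _TagCollector()
    collector.feed(html)
    collector.close()
    return hashlib.sha256("/".join(collector.tags).encode("utf-8")).hexdigest()


def build_page_map(
    pages: Iterable[Page],
    html_of: Callable[[str], str],
    dom_id_of: Callable[[str], str] = structural_dom_id,
) -> Dict[str, List[Page]]:
    """Group page records under their DOM_ID, giving the MAP of the text."""
    page_map: Dict[str, List[Page]] = defaultdict(list)
    for page in pages:
        page_map[dom_id_of(html_of(page.url))].append(page)
    return dict(page_map)
```

With this stand-in, pages that share the same tag structure fall under one key, so each value of the map is a list of structurally similar pages. Any other identifier function can be passed as `dom_id_of` without changing the grouping logic.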
Embodiment 2
[0052] Figure 2 shows a system for automatically reconstructing a website site map provided by the present invention. The system specifically includes the following modules:
[0053] Website webpage collection module;
[0054] Website webpage information collection generation module: extracts a digital identifier from each collected webpage to obtain the unique digital identifier DOM_ID of each webpage, and stores the pages grouped by category as key-value pairs <DOM_ID, PAGEs> to obtain the website page information collection MAP, where PAGEs is the list of description information for the page. Each item in the list is a PAGE, which describes the page information: PAGE=[url, anchor, depth, referer], where url is the webpage link, referer is the url of the previous webpage that links to the current page, anchor is the anchor text of the current page on the referer page, and depth is the depth of the current webpage;
[0055] Column object list determination module of the website: use the judgment rules t...