Web page cleaning method based on web page content
A web page cleaning method based on web page content. It addresses two problems of existing cleaning approaches, their lack of versatility and the uncertain HTML structure of web pages, and achieves strong versatility.
Embodiment Construction
[0022] In this embodiment, assume there are two web pages, A and B, and that A is the page to be cleaned. The HTML of A is:
[0023] Title of A
[0024] advertising
[0025] Content A
[0026] Link to B
The HTML of B is:
[0027] Title of B
[0028] advertising
[0029] Content B
[0030] The following takes pages A and B as examples to explain the cleaning steps in detail:
[0031] 1. Use the web page download component to download the web page to be cleaned from the Internet through the computer's network adapter. In this step, the non-text content of the page, such as script code and HTML tags, is removed.
[0032] In this embodiment, page A yields the title "Title of A", the content blocks "advertising" and "Content A", and the URL link to B, "Link to B".
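The extraction in step 1 can be sketched in Python with the standard-library `html.parser` module. This is a minimal illustration, not the patent's implementation: the class `PageTextExtractor`, the function `extract`, and the markup in `PAGE_A` are all hypothetical (the original tags of page A are not preserved in this text), and the download itself is left out.

```python
from html.parser import HTMLParser


class PageTextExtractor(HTMLParser):
    """Collects the title, visible text blocks, and hyperlinks of one page."""

    SKIPPED = {"script", "style"}  # non-text content that is discarded

    def __init__(self):
        super().__init__()
        self.title = ""
        self.blocks = []       # visible text fragments, in document order
        self.links = []        # (href, anchor text) pairs
        self._skip = 0         # > 0 while inside <script> or <style>
        self._in_title = False
        self._href = None      # href of the <a> element currently open

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "a":
            self._href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip:
            self._skip -= 1
        elif tag == "title":
            self._in_title = False
        elif tag == "a":
            self._href = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip:
            return
        if self._in_title:
            self.title = text
        elif self._href is not None:
            self.links.append((self._href, text))
        else:
            self.blocks.append(text)


def extract(html):
    """Return (title, text blocks, links) for one HTML document."""
    parser = PageTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.title, parser.blocks, parser.links


# Hypothetical markup for page A; only the text content comes from the example.
PAGE_A = """<html><head><title>Title of A</title>
<script>var x = 1;</script></head>
<body><div>advertising</div><div>Content A</div>
<a href="B.html">Link to B</a></body></html>"""

title, blocks, links = extract(PAGE_A)
print(title, blocks, links)
```

Under these assumptions the title, the two content blocks, and the link to B are returned separately, which matches the items listed for page A in this embodiment.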
[0033] 2. Use the web page download component and the URL list obtained in step 1 to download from the Internet a web page that has a first-level hyperlink relationship with the web page to be cleaned through a com...
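As a rough sketch of the part of step 2 that is stated here, the function below downloads the pages directly linked from the page to be cleaned. The function name `download_linked_pages` and the `fetch` parameter are assumptions made for illustration; `fetch` stands in for the web page download component and can be any callable that maps a URL to its HTML text.

```python
from urllib.parse import urljoin


def download_linked_pages(base_url, hrefs, fetch):
    """Download each page that base_url links to directly (first-level links).

    fetch is a hypothetical stand-in for the download component:
    a callable taking an absolute URL and returning the page's HTML.
    """
    pages = {}
    for href in hrefs:
        url = urljoin(base_url, href)  # resolve relative links against the page
        if url == base_url or url in pages:
            continue                   # skip self-links and duplicates
        pages[url] = fetch(url)
    return pages
```

For example, with the page to be cleaned at the hypothetical address `http://example.com/A.html` and the URL list `["B.html"]` from step 1, the function calls `fetch("http://example.com/B.html")` once and returns the result keyed by that URL.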