Webpage text content extracting method and device
The extraction method and extraction device are applied in the field of webpage text content extraction. They address the low accuracy of existing webpage text content extraction and improve extraction accuracy.
Embodiment Construction
[0021] The main implementation principles, specific implementation modes, and corresponding beneficial effects of the technical solutions in the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
[0022] Figure 2 is a flowchart of a method for extracting webpage text content in an embodiment of the present invention. The specific processing flow is as follows:
[0023] Step 21: obtain two web pages that belong to the same hierarchical directory under the same site;
[0024] The embodiment of the present invention observes that different pages in the same hierarchical directory under the same site are usually generated from the same HyperText Markup Language (HTML) template, so the structure of different web pages in the same hierarchical directory under the same site is the same or similar. For example, different pages of the same hierarch...
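Step 21 can be illustrated with a minimal Python sketch. The source does not state how "the same hierarchical directory under the same site" is determined, so the rule used here (same host name and same parent directory in the URL path) is an assumption, and the helper names directory_key and pick_pair are hypothetical.

```python
from typing import Iterable, Optional, Tuple
from urllib.parse import urlsplit
import posixpath


def directory_key(url: str) -> Tuple[str, str]:
    """Return (host, directory path) for a URL.

    Assumption (not from the source): two pages share a hierarchical
    directory when their host and parent directory path are equal.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    # A path ending in "/" already names a directory; otherwise use its parent.
    directory = path if path.endswith("/") else posixpath.dirname(path)
    if not directory.endswith("/"):
        directory += "/"
    return host, directory


def pick_pair(urls: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Return the first two distinct URLs that share a directory key, or None."""
    seen = {}
    for url in urls:
        key = directory_key(url)
        if key in seen and seen[key] != url:
            return seen[key], url
        seen.setdefault(key, url)
    return None


if __name__ == "__main__":
    # Hypothetical example URLs.
    candidates = [
        "http://example.com/news/a.html",
        "http://example.com/blog/x.html",
        "http://example.com/news/b.html",
    ]
    print(pick_pair(candidates))
```

A pair returned by pick_pair is only a candidate for two pages built from the same HTML template. Comparing the two page structures is a separate step that this sketch does not cover.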