Method for identifying reprint relations between Internet news texts
A technique for recognizing relationships between Internet texts, applied in the field of Internet technology and data mining, that achieves efficient processing, noise resistance, and efficient identification.
Embodiment Construction
[0033] As shown in Figure 1:
[0034] First, an offline HTML page is input. Visually, an HTML page can be divided into several independent blocks (regions), each displaying different information. For example, a common HTML page contains the following blocks: top navigation bar, related links, body section, comments, bottom site links, and so on. Details are shown in Figure 2 of the accompanying drawings.
[0035] For an HTML page, the topic content block is the text area that contains the event described on the page; it can be understood as the "body text" part. A news web page, for example, describes the news itself but often also contains a large amount of navigation information, related news links, advertisements, comments, and so on.
[0036] Web page preprocessing, that is, the extraction of topic content blocks, removes useless structural information and noise content from web pages and extracts the body text that narrates the event...
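The description of the extraction step is cut off above, so the exact procedure is not available here. As a rough illustration only, the idea of separating the body text from navigation bars, related links, and similar noise can be sketched with a generic link-density heuristic: split the page into blocks at block-level tags, then keep blocks that are long enough and mostly not link text. The names `BlockCollector` and `extract_topic_text`, the tag sets, and the thresholds `min_chars` and `max_link_density` are all assumptions made for this sketch and are not taken from the patent.

```python
from html.parser import HTMLParser

# Tags treated as block boundaries (an assumption for this sketch).
BLOCK_TAGS = {"div", "p", "ul", "ol", "table", "nav", "header", "footer",
              "section", "article", "aside", "h1", "h2", "h3"}
# Tags whose content is never visible text.
SKIP_TAGS = {"script", "style"}


class BlockCollector(HTMLParser):
    """Split a page into text blocks and record how much of each is link text."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # list of (text, link_chars) tuples
        self._parts = []       # raw text pieces of the block being built
        self._link_chars = 0   # characters of the current block inside <a>
        self._in_link = 0
        self._skip = 0

    def _flush(self):
        # Close the current block, collapsing runs of whitespace.
        text = " ".join("".join(self._parts).split())
        if text:
            self.blocks.append((text, self._link_chars))
        self._parts = []
        self._link_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if self._skip:
            return
        self._parts.append(data)
        if self._in_link:
            self._link_chars += len(" ".join(data.split()))

    def close(self):
        super().close()
        self._flush()


def extract_topic_text(html, min_chars=40, max_link_density=0.3):
    """Keep blocks that are long enough and contain little link text."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    kept = [text for text, link_chars in parser.blocks
            if len(text) >= min_chars
            and link_chars / len(text) <= max_link_density]
    return "\n".join(kept)


# Hypothetical sample page: navigation, two body paragraphs, related links, footer.
sample = """
<html><body>
<nav><a href="/">Home</a> <a href="/world">World</a> <a href="/sports">Sports</a></nav>
<div class="content">
<p>The city council approved a new transit plan on Tuesday after a lengthy public debate.</p>
<p>Officials said construction would begin next year, according to <a href="/src">a statement</a>.</p>
</div>
<div class="related"><a href="/a">Related story one about transit</a> <a href="/b">Related story two about budgets</a></div>
<div class="footer">Copyright notice</div>
</body></html>
"""

print(extract_topic_text(sample))
```

In this sketch the navigation bar and footer are dropped because they are short, and the related-links block is dropped because nearly all of its characters sit inside links. Real news pages would likely need tuned thresholds or a structure-aware method.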