Article deduplication method, apparatus, device, and storage medium
An article deduplication technique, applied in the field of data processing, that addresses problems such as poor deduplication results.
Detailed Description of Embodiments
[0054] Most existing technical solutions for deduplicating article data rely on the URL of each article (specifically, the URL's character string). However, article data deduplicated this way still has a high repetition rate: many articles with identical content remain in the deduplicated data, so the deduplication effect is poor.
[0055] The inventor found through research that URLs and article content do not correspond one-to-one. Specifically, the same article may exist at multiple locations on the network; for example, it may be published on multiple network platforms, so that one article actually corresponds to multiple different URLs. Consequently, when article data is deduplicated by URL, the URLs differ but the content of the corresponding articles is still the same,...
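The shortcoming described in [0054] and [0055] can be illustrated with a minimal Python sketch. It is not the method claimed in this application, whose details are not reproduced in this excerpt. It only contrasts URL-string deduplication with a simple content-based key. The sample articles, the `example` hostnames, the function names, and the choice of a SHA-256 hash over whitespace-normalized, lowercased text are all assumptions made for illustration.

```python
import hashlib


def dedupe_by_url(articles):
    """Keep the first article seen for each distinct URL string."""
    seen = set()
    kept = []
    for article in articles:
        if article["url"] not in seen:
            seen.add(article["url"])
            kept.append(article)
    return kept


def content_key(text):
    """Hypothetical content key: hash of lowercased, whitespace-collapsed text."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_by_content(articles):
    """Keep the first article seen for each distinct content key."""
    seen = set()
    kept = []
    for article in articles:
        key = content_key(article["content"])
        if key not in seen:
            seen.add(key)
            kept.append(article)
    return kept


# Illustrative data: the first two entries are the same article
# published on two different platforms, hence two different URLs.
articles = [
    {"url": "https://platform-a.example/post/1",
     "content": "Breaking news: the sky is blue."},
    {"url": "https://platform-b.example/news/42",
     "content": "Breaking news:  the sky is blue."},
    {"url": "https://platform-a.example/post/2",
     "content": "A different article."},
]

by_url = dedupe_by_url(articles)
by_content = dedupe_by_content(articles)

print(len(by_url))      # 3: every URL is distinct, so the duplicate survives
print(len(by_content))  # 2: the two copies share one content key
```

An exact hash of this kind only catches copies that are identical after normalization. Reposts with edited headlines or added boilerplate would need a similarity-based comparison, which this sketch does not attempt.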