Method and system for searching mirror-image web page

Web page mirroring technology, applied in the field of mirrored web pages, addresses problems such as the low efficiency of existing search methods, achieving a high likelihood of finding mirrors, a narrower search scope, and improved efficiency.

Active Publication Date: 2010-12-15
SHENZHEN SHI JI GUANG SU INFORMATION TECH

AI Technical Summary

Problems solved by technology

[0009] The technical problem to be solved by the present invention is to provide a search method for mirrored web pages that addresses the inefficiency of search methods in the prior art.
[0010] Another object of the present invention is to provide a search system for mirrored web pages that addresses the inefficiency of search methods in the prior art.

Examples

Embodiment Construction

[0034] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0035] When a web page contains a hyperlink (URL) pointing to another web page, a link relationship is considered to exist between the two web pages. The text of the hyperlink is the anchor text. If web page A uses anchor text S to link to web page B, web page A can be called the parent web page and web page B the child web page. The link is a forward link for web page A and a reverse link for web page B. Each web page may have multiple forward links and multiple reverse links.
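The link relationships described in [0035] can be sketched as a pair of indexes. This is only an illustration: the record layout `(parent, anchor, child)`, the function name `build_link_indexes`, and the example URLs are assumptions and do not come from the patent text.

```python
from collections import defaultdict

# Hypothetical link records: (parent URL, anchor text, child URL).
links = [
    ("http://a.example/", "Peking University", "http://b.example/"),
    ("http://c.example/", "PKU", "http://b.example/"),
    ("http://a.example/", "News", "http://d.example/"),
]

def build_link_indexes(links):
    """Index every link twice: as a forward link of the parent page
    and as a reverse link of the child page."""
    forward = defaultdict(list)  # parent URL -> [(anchor text, child URL)]
    reverse = defaultdict(list)  # child URL  -> [(anchor text, parent URL)]
    for parent, anchor, child in links:
        forward[parent].append((anchor, child))
        reverse[child].append((anchor, parent))
    return forward, reverse

forward, reverse = build_link_indexes(links)

# Anchor texts of the reverse links of one page.
anchors_of_b = [anchor for anchor, _ in reverse["http://b.example/"]]
```

Keeping the reverse index keyed by the child page makes it cheap to collect all anchor texts that other pages use for a given page.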

[0036] When a web page uses a certain anchor text to link to another web page, the anchor text can be regarded as analogous to the way one person names, evaluates, or summarizes another person in real life. For example, web page A uses the anchor text "Peking University" to point to w...

Abstract

The invention relates to a method for searching mirror web pages, comprising: obtaining the anchor texts of a web page's reverse links; computing a weight for each anchor text; extracting, according to the weights, either a set number or a set proportion of the anchor texts; judging whether the extracted anchor texts are legal with respect to the web page; determining the web pages of the illegal anchor texts and extracting the main domains, subdomains, and directory home pages of those web pages; forming a mirror web page search set from the extracted web pages; and searching for mirror web pages within that search set. The invention also provides a system for searching mirror web pages. It addresses the low efficiency of prior-art search methods and can find mirror web pages simply, quickly, and efficiently.
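Two of the steps named in the abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the patented procedure: the abstract does not say how weights are computed, so the sketch assumes a weight equal to the number of reverse links that use the anchor text; the "main domain" is approximated naively as the last two host labels; and the function names `top_anchor_texts` and `candidate_pages` are hypothetical. The legality check is not sketched because the abstract does not describe it.

```python
from collections import Counter
from math import ceil
from urllib.parse import urlsplit

def top_anchor_texts(anchors, count=None, proportion=None):
    """Rank anchor texts by weight, then keep a fixed number or a fixed share.

    Assumption: weight = number of reverse links using the anchor text.
    """
    weights = Counter(anchors)
    ranked = [text for text, _ in weights.most_common()]
    if count is not None:
        return ranked[:count]
    if proportion is not None:
        return ranked[:ceil(len(ranked) * proportion)]
    return ranked

def candidate_pages(url):
    """Derive main-domain, subdomain, and directory home pages from a URL.

    Assumption: the main domain is the last two host labels, which is
    wrong for suffixes such as .com.cn; a real system would need a
    proper public-suffix list.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    main_domain = ".".join(host.split(".")[-2:])
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return {
        "main_domain": f"{parts.scheme}://{main_domain}/",
        "subdomain": f"{parts.scheme}://{host}/",
        "directory_home": f"{parts.scheme}://{host}{directory}",
    }

selected = top_anchor_texts(
    ["PKU", "Peking University", "Peking University", "News"], count=2
)
pages = candidate_pages("http://news.example.com/sports/2010/index.html")
```

Restricting the later mirror comparison to such home pages is what narrows the search scope; how the selected anchor texts decide which pages enter the set depends on the legality check, which is left open here.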

Description

Technical Field

[0001] The invention relates to the field of mirrored web pages, and in particular to a method and system for searching mirrored web pages.

Background Art

[0002] Mirror web pages are web pages with the same substantive content, for example: web pages with exactly the same displayed content; web pages with the same body content but different titles; and web pages with the same body content but different auxiliary content. Searching for mirror web pages on the Internet makes it possible to eliminate duplicate web pages, which makes retrieval and downloading easier for users. In the prior art, mirror web pages are usually found by directly calculating feature values of web pages; web pages with the same or similar feature values are identified as mirror web pages.

[0003] Figure 1 shows an existing search method for mirrored web pages. The specific steps are as follows.

[0004] Step S101: extract each website's main domain...
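The prior-art idea in [0002], comparing feature values computed directly from web pages, can be sketched as below. The patent text does not say which feature is used; this sketch assumes an MD5 digest of whitespace-normalized, lower-cased body text, so it only catches pages with the same value, not similar ones. The names `feature_value` and `group_mirrors` and the sample pages are hypothetical.

```python
import hashlib
from collections import defaultdict

def feature_value(body_text):
    """Assumed feature: MD5 of the body text after collapsing whitespace
    and lower-casing."""
    normalized = " ".join(body_text.split()).lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def group_mirrors(pages):
    """Group URLs whose body text yields the same feature value."""
    groups = defaultdict(list)
    for url, body in pages.items():
        groups[feature_value(body)].append(url)
    # Only groups with more than one page are mirror candidates.
    return [urls for urls in groups.values() if len(urls) > 1]

# Hypothetical pages: URL -> body text.
pages = {
    "http://a.example/x": "Hello   World",
    "http://b.example/y": "hello world",
    "http://c.example/z": "Something else",
}
mirror_groups = group_mirrors(pages)
```

Applied to every page of every site, this kind of direct comparison is the costly step that a smaller search set is meant to reduce.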

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 禹荣凌, 刘云峰, 熊展志
Owner: SHENZHEN SHI JI GUANG SU INFORMATION TECH