Semantic-based information acquisition method and semantic-based information acquisition system

A semantic-based information collection technology, applied in fields such as special data processing applications, instruments, and electronic digital data processing. It addresses problems such as poor topic detection, a single organization mode, and difficulty in presenting multi-dimensional features, and it improves browsing efficiency.

Inactive Publication Date: 2013-12-25
TSINGHUA UNIV
4 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

[0006] In related technologies for organizing network resources, topic detection can effectively gather and organize scattered network resources. However, because the information in network resources is highly similar, topic detection based on traditional vector space models performs poorly. A reasonable organizational model for network resources can better help users understand and analyze their information, but the existing organizational model is single, and it is difficult to present the multi-dimensional characteristics of network resources.
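As a rough illustration of the similarity problem described above, the following minimal sketch builds bag-of-words term vectors and compares them with cosine similarity, the usual measure in a vector space model. The two sample texts, the function names, and the whitespace tokenization are assumptions made for this example and are not taken from the patent.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a bag-of-words term-frequency vector (whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical texts: different subjects, but four of six terms are shared.
doc_a = term_vector("tsinghua university releases new search engine")
doc_b = term_vector("tsinghua university releases new admission policy")
similarity = cosine_similarity(doc_a, doc_b)
print(round(similarity, 3))
```

Here the two texts concern different subjects yet share most of their terms, so their similarity comes out fairly high. This is one plausible way in which highly similar network information can blur topic boundaries under a plain vector space model.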

Examples

Embodiment 1

[0053] This embodiment first provides a semantic-based information collection method. As shown in Figure 1, the method mainly includes the following steps:

[0054] S1. Summarize the model elements according to the typical characteristics of network resources, and establish an abstract data model of network resources;

[0055] S2. Collect network information from the Internet by means of a search engine, and format the collected network information according to the network resource abstract data model;

[0056] S3. Perform cluster analysis on the formatted network information, divide the network information into corresponding topics according to the cluster analysis results, and extract a label for each topic;

[0057] S4. Visually display the processing results of step S3.

[0058] In addition to this, the following steps can be included:

[0059] S5. Packing and downloading of network information: according to the generated hashtag ...
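Steps S1 to S4 can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the `NetworkResource` fields, the sample search hits, the single-pass clustering with a cosine threshold, and the frequency-based labels are all hypothetical stand-ins, since the text here does not specify the model elements or the clustering algorithm. No real search engine is called, and S5 is omitted because its description is truncated.

```python
import math
from collections import Counter
from dataclasses import dataclass

# Small illustrative stopword list (an assumption, not from the patent).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "to", "at"}


@dataclass
class NetworkResource:
    """S1: hypothetical abstract data model for one piece of network information."""
    title: str
    url: str
    snippet: str

    def terms(self):
        words = (self.title + " " + self.snippet).lower().split()
        return Counter(w for w in words if w not in STOPWORDS)


def format_results(raw_results):
    """S2: map raw search hits (plain dicts here) onto the data model."""
    return [NetworkResource(r.get("title", ""), r.get("url", ""), r.get("snippet", ""))
            for r in raw_results]


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster(resources, threshold=0.3):
    """S3: single-pass clustering; an item joins the most similar topic
    at or above the threshold, otherwise it starts a new topic."""
    topics = []  # each topic: {"centroid": Counter, "members": [NetworkResource]}
    for res in resources:
        vec = res.terms()
        best, best_sim = None, threshold
        for topic in topics:
            sim = cosine(vec, topic["centroid"])
            if sim >= best_sim:
                best, best_sim = topic, sim
        if best is None:
            topics.append({"centroid": Counter(vec), "members": [res]})
        else:
            best["centroid"].update(vec)  # accumulate term counts
            best["members"].append(res)
    return topics


def topic_label(topic, size=2):
    """S3: label a topic with its most frequent terms (ties broken alphabetically)."""
    ranked = sorted(topic["centroid"].items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(term for term, _ in ranked[:size])


def render(topics):
    """S4: plain-text stand-in for the visual display."""
    lines = []
    for topic in topics:
        lines.append(f"[{topic_label(topic)}] {len(topic['members'])} item(s)")
        lines.extend(f"  - {m.title} <{m.url}>" for m in topic["members"])
    return "\n".join(lines)


# Hypothetical search hits standing in for real search engine output.
raw = [
    {"title": "Earthquake relief effort begins", "url": "http://example.com/1",
     "snippet": "earthquake relief teams arrive"},
    {"title": "Relief supplies reach earthquake zone", "url": "http://example.com/2",
     "snippet": "earthquake relief continues"},
    {"title": "Football final tonight", "url": "http://example.com/3",
     "snippet": "football fans gather for final"},
]
topics = cluster(format_results(raw))
report = render(topics)
print(report)
```

Single-pass clustering is used only because it is short and needs no external library. Any clustering method that groups the formatted records into topics would fit the same outline.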

Abstract

The invention relates to the technical field of data mining, and in particular to a semantic-based information acquisition method and a semantic-based information acquisition system. The method comprises the following steps: S1, establishing a network resource abstract data model according to typical characteristics of network resources; S2, acquiring network information from the Internet by means of a search engine, and formatting the acquired network information using the network resource abstract data model; S3, performing cluster analysis on the formatted network information, dividing the network information into corresponding topics according to the cluster analysis results, and extracting a label for each topic; S4, visually displaying the results of step S3. With the method and system provided by the invention, the organization, visual display, downloading, and online viewing of network resources are driven by topic. The network information can therefore be presented in a multi-dimensional manner and displayed visually to the user, which improves the user's browsing efficiency.

Description

Technical field

[0001] The invention relates to the technical field of data mining, and in particular to a semantic-based information collection method and system.

Background

[0002] Network data (resources) refers to the sum of the various information resources on the Internet, including electronic documents, databases, digital documents, digital bibliographies, electronic newspapers, network news, etc.

[0003] Data and information on the Internet are characterized by large volume, fast updates, and strong timeliness, and a large amount of network information is generated every day. To help users cope with the "information explosion", search engine companies provide a large number of network resources; that is, they display Internet information on a single page in an all-round, multi-angle manner, introduce the relevant background of the network resources, and analyze their characteristics. Typically, these online materials are organized manually by ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 李涓子, 祁羽, 何巍, 焦程波, 张鹏, 杨瑞兵
Owner: TSINGHUA UNIV