Method and equipment for extracting core keywords based on query sequence cluster

A technology for query sequences and extraction equipment, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the limited number of words in dictionaries, dictionary updates that lag behind the emergence of new words, and the resulting inability to extract core keywords, achieving the effect of a better search experience.

Active Publication Date: 2011-05-04
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
Cites: 3; Cited by: 44
AI Technical Summary

Problems solved by technology

[0002] Most existing word segmentation technologies use dictionaries or semantic analysis to segment sentences or fragments. However, the number of words in a dictionary is limited, while new words emerge on the Internet in an endless stream. Because the dictionary is updated far more slowly than new words appear on the Internet, dictionary-based word segmentation cannot meet actual needs. Segmentation based on semantic analysis involves machine learning, and the diversity of expression and the colloquial language used on the Internet make its results unsatisfactory.
[0003] When a large number of search requests lead users to click the same search result, those requests often reflect the same theme. Because users phrase their searches differently, however, semantic analysis often fails to identify this shared search need. In addition, a search need behind a result that many users click is often a hot topic at that moment, so the corresponding core keyword may be a new word that does not yet exist in the dictionary. As a result, the core keyword cannot be extracted from these search requests by dictionary lookup either.


Examples

Embodiment Construction

[0019] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0020] Figure 1 is a schematic diagram of equipment according to one aspect of the present invention, showing equipment for extracting core keywords based on query sequence clusters. The extraction equipment 1 includes an acquisition means 11 and an extraction means 12. Specifically, the acquisition means 11 acquires a query sequence cluster that includes a plurality of query sequences, each of which corresponds to at least one same search result clicked by users; the extraction means 12 then extracts, from the query sequence cluster, the core keywords corresponding to that cluster. Here, the extraction equipment 1 includes, but is not limited to, a search engine server or a dedicated server connected to it. Those skilled in the art should understand that ...
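The two stages in [0020] can be illustrated with a minimal sketch. The text here does not disclose how the extraction means 12 actually works, so everything below is an assumption made for illustration: the click log format, the function names `build_clusters` and `extract_core_keyword`, and the heuristic of picking the longest substring shared by most queries in a cluster. The heuristic is chosen only because it needs no dictionary and therefore can surface a word the dictionary lacks.

```python
from collections import Counter, defaultdict


def build_clusters(click_log):
    """Acquisition step (hypothetical): group queries by the result clicked.

    click_log is an assumed iterable of (query, clicked_url) pairs.
    """
    clusters = defaultdict(set)
    for query, clicked_url in click_log:
        clusters[clicked_url].add(query)
    return clusters


def extract_core_keyword(queries, min_len=2, min_support=0.6):
    """Extraction step (hypothetical heuristic, not the patented method).

    Returns the longest substring found in at least `min_support` of the
    queries, or None if no substring qualifies. Ties are broken by how
    many queries contain the substring, then lexicographically.
    """
    queries = list(queries)
    counts = Counter()
    for q in queries:
        # Count each distinct substring once per query.
        subs = {
            q[i:j]
            for i in range(len(q))
            for j in range(i + min_len, len(q) + 1)
        }
        counts.update(subs)
    threshold = min_support * len(queries)
    candidates = [
        (len(s), c, s)
        for s, c in counts.items()
        # Skip candidates that start or end with whitespace.
        if c >= threshold and s == s.strip()
    ]
    return max(candidates)[2] if candidates else None


click_log = [
    ("avatar movie tickets", "http://example.com/a"),
    ("watch avatar online", "http://example.com/a"),
    ("avatar 3d review", "http://example.com/a"),
    ("weather beijing", "http://example.com/w"),
]
clusters = build_clusters(click_log)
core = extract_core_keyword(clusters["http://example.com/a"])
```

Because the sketch works on raw characters, it applies equally to unsegmented Chinese queries; a production system would likely need frequency weighting and noise filtering that this example leaves out.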


Abstract

The invention aims to provide a method and equipment for extracting core keywords based on a query sequence cluster. The method comprises the following steps: the extraction equipment acquires a query sequence cluster, wherein the query sequence cluster comprises a plurality of query sequences and each query sequence corresponds to at least one same user-clicked search result; and the equipment extracts, from the query sequence cluster, the core keywords corresponding to that cluster. Compared with the prior art, the invention captures the search needs of the users who entered the query sequences in the cluster, so that more appropriate search suggestions or more relevant search results can be supplied to users according to the core keywords, giving them a better search experience. Furthermore, when a lexicon does not contain a core keyword, the core keyword can be added to the lexicon as a new word for use by other applications.
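The last point of the abstract, adding a core keyword to the lexicon as a new word, can be sketched as follows. The helper names `add_if_new` and `segment` are hypothetical, and the forward maximum matching segmenter is a standard dictionary-based technique used here only to show the effect of the update; the source does not say which segmenter consumes the lexicon.

```python
def add_if_new(lexicon, core_keyword):
    """Add core_keyword to the lexicon set if absent; return True if added."""
    if core_keyword is None or core_keyword in lexicon:
        return False
    lexicon.add(core_keyword)
    return True


def segment(text, lexicon, max_len=8):
    """Forward maximum matching over a set-based lexicon.

    At each position take the longest lexicon word of at most max_len
    characters; fall back to a single character when nothing matches.
    """
    out = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                out.append(piece)
                i += size
                break
    return out


lexicon = {"看", "电影"}
before = segment("看阿凡达电影", lexicon)  # unknown name falls apart
added = add_if_new(lexicon, "阿凡达")     # assumed output of the extraction step
after = segment("看阿凡达电影", lexicon)   # name now kept as one token
```

Before the update the unknown name is split into single characters; after it, the segmenter returns it as one token, which is the benefit the abstract attributes to feeding extracted core keywords back into the lexicon.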

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a technology for extracting core keywords based on query sequence clusters.

Background art

[0002] Most existing word segmentation technologies use dictionaries or semantic analysis to segment sentences or fragments. However, the number of words in a dictionary is limited, while new words emerge on the Internet in an endless stream. Because the dictionary is updated far more slowly than new words appear on the Internet, dictionary-based word segmentation cannot meet actual needs. Segmentation based on semantic analysis involves machine learning, and the diversity of expression and the colloquial language used on the Internet make its results unsatisfactory.

[0003] When there are a la...

Application Information

IPC(8): G06F17/30
Inventor: 张超, 忻舟, 王强
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD