Improved lexical semantic similarity solution algorithm

A lexical semantic similarity technique, applied in the field of semantic networks, that addresses problems such as obvious errors, overly complex methods, and a large amount of calculation.

Status: Inactive
Publication Date: 2017-05-03
SICHUAN YONGLIAN INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

The statistics-based method can measure the semantic similarity between words accurately and effectively, but it depends too heavily on the corpus used for training, requires a large amount of calculation, is relatively complicated, and is strongly disturbed by data sparseness and data noise, so it sometimes makes obvious mistakes. To address these problems, the present invention provides an improved algorithm for solving lexical semantic similarity.



Embodiment Construction

[0016] To solve the semantic similarity problem between the words (c_1, c_2), the present invention is described in detail below with reference to Figure 1. Its specific implementation steps are as follows:

[0017] Step 1: Initialize the statistical method module. Its corpus may be, for example, "Word Dictionary", "Ci Lin", "HowNet", or "Baidu Encyclopedia".

[0018] Step 2: Input the words to be compared (c_1, c_2) into the initialized statistical method module.

[0019] Step 3: In the statistical method module, find the context words (c_sx1, c_sx2) that have the greatest weight in the adjacent context of the words to be compared (c_1, c_2).

[0020] The context words (c_sx1, c_sx2) with the largest weight in the corpus for the words to be compared (c_1, c_2) are found by the following calculation process:
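
Step 3 can be illustrated with a minimal Python sketch. It is not the patented calculation: the source text does not give the weight formula, so the sketch assumes that a context word's weight is simply its co-occurrence count within a fixed window around the target word. The function name `max_weight_context_word`, the `window` parameter, and the toy corpus are all hypothetical.

```python
from collections import Counter

def max_weight_context_word(sentences, target, window=2):
    """Return the context word that co-occurs most often with `target`.

    Assumption: weight = raw co-occurrence count within `window` tokens
    of the target. The source does not specify the actual weight formula.
    """
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    if not counts:
        return None  # target never appears in the corpus
    # Highest count wins; ties are broken alphabetically for determinism.
    return min(counts, key=lambda w: (-counts[w], w))

# Toy tokenized corpus (hypothetical data, for illustration only).
corpus = [
    ["doctor", "treats", "patient"],
    ["nurse", "treats", "patient"],
    ["doctor", "examines", "patient"],
]
c_sx1 = max_weight_context_word(corpus, "doctor")
c_sx2 = max_weight_context_word(corpus, "nurse")
```

A real implementation would replace the raw count with whatever weighting the statistical method module defines and would restrict candidates using the part-of-speech constraints described next.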

[0021] Context words are searched according to constraints. For example, in Chinese, part-of-speech pairs with relatively strong context constraints include: ...


Abstract

The invention provides an improved lexical semantic similarity solution algorithm. Starting from a successfully initialized statistical method module, the algorithm extracts a maximum keyword by comparing the maximum-weight context words of the words to be compared with the similarity of the words to be compared, and finally calculates the similarity between the extracted maximum keyword and the words to be compared to obtain the result. The calculated semantic similarity is basically consistent with semantic similarity judged manually, reflects objective reality better, and better satisfies user demands.

Description

Technical field

[0001] The invention relates to the technical field of semantic networks, and in particular to an improved algorithm for solving lexical semantic similarity.

Background technique

[0002] Since the start of the 21st century, the global Internet industry has entered a new period of rapid development, and new technologies are constantly emerging. Natural language processing, an important technology connecting computers and people, has also developed rapidly. Traditional semantic relatedness calculation methods can be roughly divided into two categories: semantic similarity calculation methods based on a semantic dictionary and semantic similarity calculation methods based on a corpus. Statistics-based studies of word semantic similarity build on observable linguistic facts rather than relying solely on linguists' intuitions. They are based on the assumption that two words are semantically similar if and only if they occur in similar contexts, and use a large-...
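
The context-similarity assumption from the background can be illustrated with a short Python sketch. It shows the generic corpus-based baseline (cosine similarity between co-occurrence count vectors), not the improved algorithm of the invention. The helper names `context_vector` and `cosine`, the window size, and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter

def context_vector(sentences, target, window=2):
    """Count the words that appear within `window` tokens of `target`."""
    vec = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok == target:
                lo = max(0, i - window)
                hi = min(len(tokens), i + window + 1)
                vec.update(tokens[j] for j in range(lo, hi) if j != i)
    return vec

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a word with no observed context has no similarity
    return dot / (norm_u * norm_v)

# Toy tokenized corpus (hypothetical data, for illustration only).
corpus = [
    ["cat", "drinks", "milk"],
    ["dog", "drinks", "water"],
    ["car", "needs", "fuel"],
]
sim_cat_dog = cosine(context_vector(corpus, "cat"), context_vector(corpus, "dog"))
sim_cat_car = cosine(context_vector(corpus, "cat"), context_vector(corpus, "car"))
```

In this toy corpus "cat" and "dog" share the context word "drinks" and so score higher than "cat" and "car", which share none. With so few sentences the vectors are very sparse, which is the kind of data sparseness the background identifies as a weakness of purely statistical methods.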


Application Information

IPC(8): G06F17/27
CPC: G06F40/284
Inventor: 金平艳
Owner: SICHUAN YONGLIAN INFORMATION TECH CO LTD