
Method and equipment for determining text similarity

A method and equipment for determining text similarity, applied in the field of text similarity determination, which addresses the low accuracy of existing similarity calculations and their inability to reflect the actual similarity between texts.

Active Publication Date: 2018-05-29
SOUTH CHINA NORMAL UNIVERSITY

AI Technical Summary

Problems solved by technology

[0003] However, comparing texts by their repeated words ignores the comprehensive information in the text. For example, text 1 "I chased a dog today" and text 2 "a dog chased me today" have opposite meanings, yet under most current similarity algorithms the segmented words of the two texts are almost the same, so the two texts are judged to be highly similar, or even identical, which is obviously inaccurate.
[0004] It can be seen that the similarity obtained by current text similarity calculation methods has low accuracy and cannot reflect the actual similarity of the texts.
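
The problem described in [0003] can be reproduced with a minimal bag-of-words sketch. The whitespace tokenizer and the cosine measure over term counts are assumptions chosen for illustration; the source does not name a specific prior-art algorithm.

```python
from collections import Counter
import math


def tokenize(text):
    # Naive whitespace tokenization (an assumption); a real system would
    # use a proper word segmenter.
    return text.lower().split()


def cosine_bow(text_a, text_b):
    # Cosine similarity between raw term-count vectors.
    # Word order is discarded entirely, which is the weakness at issue.
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


text_1 = "I chased a dog today"
text_2 = "a dog chased me today"
score = cosine_bow(text_1, text_2)
print(round(score, 2))  # 0.8
```

Four of the five words in each sentence are shared, so the score is high even though the two sentences describe opposite events.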



Examples


Embodiment Construction

[0086] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, the present invention can be implemented in many other ways different from those described here, and those skilled in the art can make similar extensions without departing from the spirit of the present invention, so the present invention is not limited by the specific embodiments disclosed below.

[0087] The similarity obtained by current text similarity calculation methods has low accuracy and cannot reflect the actual similarity of the texts.

[0088] In view of this, an embodiment of the present invention provides a new method for determining text similarity. The method comprehensively considers both the grammatical similarity and the topic similarity between two texts to determine the similarity between them. Compared with the prior art, the similarity between two texts is determined on...



Abstract

The invention discloses a method and equipment for determining text similarity. The method includes: acquiring a first text and a second text whose similarity needs to be determined; determining the grammatical similarity and the topic similarity between the first text and the second text; and determining the similarity between the first text and the second text according to the determined grammatical similarity and topic similarity. With this method and equipment, text similarity can be reflected accurately.
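
The steps in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented procedure: the source does not say how the grammatical or topic similarity is computed or how the two are combined. Here the hypothetical `grammar_similarity` uses overlap of ordered word bigrams as a crude stand-in for grammatical structure, the hypothetical `topic_similarity` uses overlap of word sets, and `combined_similarity` takes a weighted average with an assumed `weight` parameter.

```python
def tokens(text):
    # Naive whitespace tokenization (an assumption).
    return text.lower().split()


def jaccard(items_a, items_b):
    # Jaccard overlap between two collections, treated as sets.
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def topic_similarity(text_a, text_b):
    # Stand-in for topic similarity: shared vocabulary, order ignored.
    return jaccard(tokens(text_a), tokens(text_b))


def grammar_similarity(text_a, text_b):
    # Stand-in for grammatical similarity: shared ordered word bigrams,
    # so swapping subject and object lowers the score.
    def bigrams(words):
        return list(zip(words, words[1:]))
    return jaccard(bigrams(tokens(text_a)), bigrams(tokens(text_b)))


def combined_similarity(text_a, text_b, weight=0.5):
    # Weighted average of the two scores; the weighting is an assumption.
    g = grammar_similarity(text_a, text_b)
    t = topic_similarity(text_a, text_b)
    return weight * g + (1 - weight) * t


first_text = "I chased a dog today"
second_text = "a dog chased me today"
print(topic_similarity(first_text, second_text))
print(grammar_similarity(first_text, second_text))
print(combined_similarity(first_text, second_text))
```

For this sentence pair the order-sensitive score is much lower than the vocabulary score, so the combined value falls below what word overlap alone would report.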

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a method and device for determining text similarity.

Background technique

[0002] In the prior art, the similarity between two texts is judged by segmenting the two texts into words and then identifying the words repeated in both texts.

[0003] However, this ignores the comprehensive information in the text. For example, text 1 "I chased a dog today" and text 2 "a dog chased me today" have opposite meanings, yet under most current similarity algorithms the segmented words of the two texts are almost the same, so the two texts are judged to be highly similar, or even identical, which is obviously inaccurate.

[0004] It can be seen that the similarity obtained by current text similarity calculation methods has low accuracy and cannot reflect the actual similarity of the texts.

Contents of the invention

[0005] In...


Application Information

IPC(8): G06F17/27
CPC: G06F40/211, G06F40/289, G06F40/30
Inventor: 周春郑百成黄妍明方永毅瞿荣蒋运承
Owner: SOUTH CHINA NORMAL UNIVERSITY