Method and device for calculating sentence similarity and method and device for machine translation
A technique for sentence similarity calculation and machine translation, applied in the computer field
Examples
Embodiment 1
[0095] Figure 1 is a flow chart of the method for calculating sentence similarity provided by Embodiment 1 of the present invention. As shown in Figure 1, the method may include the following steps:
[0096] Step 101: Compare the sentence E1 and the sentence E2, and determine the difference word pairs.
[0097] The embodiment of the present invention builds on basic text processing of the sentences, such as word segmentation and alignment. Since these steps are prior art, they are not described here.
[0098] Compare the words in sentence E1 with the words in sentence E2; the words that differ form difference word pairs. For example:
[0099] Sentence E1 is: Can I take a picture of the painting?
[0100] Sentence E2 is: Can we take a photo of the painting?
[0101] The difference word pairs are then: the pair formed by "I" and "we", and the pair formed by "picture" and "photo".
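Step 101 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the tokenizer is a crude stand-in for real word segmentation, and the alignment uses the standard-library `difflib.SequenceMatcher` as an assumed substitute for the alignment step that the description treats as prior art. The function names `tokenize` and `difference_word_pairs` are hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokenize(sentence):
    # Assumption: a simple regex stands in for proper word segmentation.
    return re.findall(r"[A-Za-z']+", sentence)


def difference_word_pairs(e1, e2):
    """Return the (word in E1, word in E2) pairs where the aligned sentences differ."""
    a, b = tokenize(e1), tokenize(e2)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only one-for-one substitutions are paired in this sketch; insertions,
        # deletions and unequal-length replacements are ignored.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            pairs.extend(zip(a[i1:i2], b[j1:j2]))
    return pairs


E1 = "Can I take a picture of the painting?"
E2 = "Can we take a photo of the painting?"
print(difference_word_pairs(E1, E2))
# [('I', 'we'), ('picture', 'photo')]
```

For the example sentences E1 and E2 this reproduces the two difference word pairs named in paragraph [0101].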
[0102] Step 102: Use the collocation pr...
Embodiment 2
[0117] Figure 2 is a flow chart of the method for calculating sentence similarity provided by Embodiment 2 of the present invention. As shown in Figure 2, the method may include the following steps:
[0118] Step 201 is the same as Step 101 in Embodiment 1.
[0119] Step 202 is the same as Step 102 in Embodiment 1.
[0120] Step 203: Determine the feature vectors of the two different words in the difference word pair, and use the feature vectors of the two different words to calculate the similarity distance between the two different words.
[0121] Embodiment 2 additionally takes into account how similar the difference words are within a specific corpus. This similarity is reflected by the distance between the feature vectors of the two difference words in the difference word pair.
[0122] The feature vector of the difference word can be composed of words that have a higher collocation probability with the difference word....
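The idea in Step 203 can be sketched as follows. The excerpt does not show the patent's distance formula, so this sketch makes two labeled assumptions: a feature vector is a sparse mapping from the highest-probability collocates to their collocation probabilities, and the distance is the cosine distance. The collocation probabilities below are invented for illustration only and do not come from any corpus.

```python
import math


def top_collocates(colloc_probs, k):
    """Keep the k collocates with the highest collocation probability."""
    best = sorted(colloc_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(best)


def cosine_distance(u, v):
    """Assumed distance: 1 - cosine similarity of two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # no shared evidence: treat as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical collocation probabilities, made up for this example.
picture = {"take": 0.30, "frame": 0.10, "draw": 0.08, "hang": 0.02}
photo = {"take": 0.35, "frame": 0.05, "album": 0.07, "print": 0.03}

picture_vec = top_collocates(picture, k=3)
photo_vec = top_collocates(photo, k=3)
print(cosine_distance(picture_vec, photo_vec))
```

Under these assumptions, words that share many high-probability collocates get a small distance, and words with no collocates in common get a distance of 1.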
Embodiment 3
[0140] Figure 3 is a flow chart of the machine translation method provided by Embodiment 3 of the present invention. As shown in Figure 3, the method may include the following steps:
[0141] Step 301: Calculate the similarity between the sentence to be translated and the sentence in the preset example sentence database.
[0142] In this step, the method described in Embodiment 1 or Embodiment 2 can be used to calculate the similarity between the sentence to be translated and the sentence in the example sentence database, so as to prepare for further selection of similar example sentences.
[0143] Because the example sentence database contains a very large number of example sentences, calculating the similarity between the sentence to be translated and every example sentence one by one, using the method of Embodiment 1 or Embodiment 2, would be inefficient. To improve efficiency, you can first calculate the edit distance...
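The paragraph above is truncated in this excerpt, so how the edit distance is used is not shown. One plausible reading is a cheap pre-filter that discards distant example sentences before the more expensive similarity calculation. The sketch below follows that reading under stated assumptions: the edit distance is the word-level Levenshtein distance, tokens come from whitespace splitting, and the threshold `max_distance` and the function `prefilter` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (here: lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute x with y, or match
            ))
        prev = cur
    return prev[-1]


def prefilter(query, examples, max_distance):
    """Keep only the example sentences within max_distance word edits of the query."""
    q = query.split()
    return [ex for ex in examples if edit_distance(q, ex.split()) <= max_distance]


examples = [
    "Can we take a photo of the painting?",
    "Where is the station?",
]
candidates = prefilter("Can I take a picture of the painting?", examples, max_distance=2)
print(candidates)
# ['Can we take a photo of the painting?']
```

Only the surviving candidates would then be scored with the method of Embodiment 1 or Embodiment 2.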