Chinese notional word extraction algorithm based on semantic comprehension

A semantic-understanding technique for Chinese text, applied to Chinese content word extraction. It addresses three problems in existing products: comparatively backward technology, reduced segmentation accuracy, and increased algorithm time complexity. Its stated effects are improved handling of ambiguous fields during word segmentation, together with better time complexity and segmentation accuracy.

Inactive Publication Date: 2017-10-20
成都布林特信息技术有限公司

AI Technical Summary

Problems solved by technology

[0002] As network technology and the Internet have matured, the traditional single-keyword search method can no longer meet the need to retrieve content from massive volumes of information. Designing a good question answering system has become an important problem to solve in network search.
In existing question answering systems, the complexity of Chinese word segmentation and the limitations of semantic recognition leave the technology of finished products relatively backward. For example, existing word segmentation methods must first set an initial matching word length. If that length is too long, the time complexity of the algorithm increases; if it is too short, segmentation accuracy decreases.
In addition, the handling of ambiguous fields does not meet the needs of actual users.
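
The trade-off around the initial matching word length can be illustrated with a minimal sketch of classic forward maximum matching. This is not the patented method; it is the kind of existing approach the background criticizes. The function name `forward_max_match`, the toy dictionary, and the use of dictionary lookups as a stand-in for time cost are all assumptions made for illustration.

```python
def forward_max_match(text, dictionary, max_len):
    """Greedy forward maximum matching.

    Returns (tokens, lookups), where lookups counts the candidate strings
    tried and serves here as a rough proxy for running time.
    """
    tokens, lookups, i = [], 0, 0
    while i < len(text):
        # Try the longest candidate first, then shrink by one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            lookups += 1
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens, lookups


# Hypothetical toy dictionary, for illustration only.
DICTIONARY = {"自然", "语言", "处理", "自然语言处理"}

# Word length too short: the six-character dictionary word is never tried,
# so it is split into three pieces.
print(forward_max_match("自然语言处理", DICTIONARY, max_len=2))

# Word length long: the same text is matched as one word, but text made of
# short words now costs extra failed lookups per token.
print(forward_max_match("自然语言处理", DICTIONARY, max_len=6))
print(forward_max_match("语言处理", DICTIONARY, max_len=2))
print(forward_max_match("语言处理", DICTIONARY, max_len=6))
```

With `max_len=2` the long word is lost, while with `max_len=6` the text `语言处理` needs four lookups instead of two, which is the accuracy versus time-complexity tension described above.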


Embodiment Construction

[0032] A detailed description of one or more embodiments of the invention is provided below, together with the accompanying Figure 1, which illustrates the principles of the invention. The invention is described in connection with these embodiments but is not limited to any embodiment. The scope of the invention is limited only by the claims, and the invention encompasses numerous alternatives, modifications and equivalents. In the following description, numerous specific details are set forth to provide a thorough understanding of the invention. These details are provided as examples, and the invention may be practiced according to the claims without some or all of them.

[0033] One aspect of the present invention provides a Chinese content word extraction algorithm based on semantic understanding. Figure 1 is a flowchart of an algorithm for extracting Chinese content words based on semantic understa...

Abstract

The invention provides a Chinese notional word extraction algorithm based on semantic comprehension. The method comprises the following steps: building a data retrieval structure from a hash tree dictionary; splitting a Chinese sentence into short sentences according to a segmentation table, and storing character string matching information during the matching process of word segmentation; identifying existing ambiguous fields from the stored character string matching information together with word-by-word scanning; and processing the pre-segmented intermediate results during segmentation. The algorithm improves how word segmentation handles ambiguous fields and achieves better time complexity and segmentation accuracy.
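
The steps named in the abstract can be sketched roughly as follows. The source text gives no implementation details, so everything here is an assumption: the hash tree is modeled as a trie of nested Python dicts, the segmentation table is taken to be a set of Chinese punctuation marks, and an ambiguous field is taken to mean two dictionary words that overlap without one containing the other. The names `HashTreeDictionary`, `split_short_sentences` and `find_overlaps` are hypothetical.

```python
import re


class HashTreeDictionary:
    """Trie whose nodes are hash maps (an assumed reading of 'hash tree')."""

    END = "$end"  # Multi-character marker, so it cannot collide with a single character key.

    def __init__(self, words):
        self.root = {}
        for word in words:
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[self.END] = True

    def matches_from(self, text, start):
        """Return the end index of every dictionary word starting at `start`."""
        node, ends = self.root, []
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break
            if self.END in node:
                ends.append(i + 1)
        return ends


def split_short_sentences(sentence, delimiters=",。!?;、"):
    """Split on an assumed segmentation table of punctuation marks."""
    parts = re.split("[" + re.escape(delimiters) + "]", sentence)
    return [part for part in parts if part]


def find_overlaps(clause, dictionary):
    """Scan character by character, keep all match spans, report crossing pairs."""
    spans = [(s, e) for s in range(len(clause))
             for e in dictionary.matches_from(clause, s)]
    overlaps = []
    for s1, e1 in spans:
        for s2, e2 in spans:
            # Two words cross when the second starts inside the first
            # and ends after it.
            if s1 < s2 < e1 < e2:
                overlaps.append((clause[s1:e1], clause[s2:e2]))
    return overlaps


# Hypothetical toy dictionary and input, for illustration only.
words = HashTreeDictionary(["结合", "合成", "成分", "分子"])
for clause in split_short_sentences("今天天气好,结合成分子。"):
    print(clause, find_overlaps(clause, words))
```

Storing every match span during the scan, rather than only the longest one, is what lets the overlap check run without a preset maximum word length; how the patent actually resolves the detected ambiguities is not stated in the text available here.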

Description

Technical field

[0001] The invention relates to natural language processing, and in particular to a Chinese content word extraction algorithm based on semantic understanding.

Background

[0002] As network technology and the Internet have matured, the traditional single-keyword search method can no longer meet the need to retrieve content from massive volumes of information. Designing a question answering system has become an important problem to solve in network search. In existing question answering systems, the complexity of Chinese word segmentation and the limitations of semantic recognition leave the technology of finished products relatively backward. For example, existing word segmentation methods must first set an initial matching word length. If that length is too long, the time complexity of the algorithm increases; if it is too short, the accuracy of segmentatio...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/30
CPC: G06F16/90332, G06F40/289, G06F40/30
Inventor: 张鹏
Owner: 成都布林特信息技术有限公司