Chinese domain term recognition method based on mutual information and conditional random field model
This recognition method, based on a conditional random field model and applied in the information field, addresses the problems of low automation in term recognition, low recognition accuracy, and the difficulty of accurately segmenting words in specialized-domain corpora.
Embodiment Construction
[0053] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.
[0054] In this embodiment, term recognition in the botanical domain of bamboo is taken as an example to illustrate the present invention; it is not intended to limit the scope of the invention.
[0055] Referring to Figure 1, the Chinese domain term recognition method of the present invention, based on mutual information and a conditional random field model, comprises the following steps:
[0056] (1) Collect a domain text corpus, and mark all punctuation marks, spaces, digits, ASCII characters, and any other non-Chinese characters in the corpus.
[0057] For example, this embodiment uses the electronic manuscript of Volume 9, Bamboo subfamily, of "Flora of China" as the domain text corpus.
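The marking in step (1) can be sketched as follows. This is a minimal illustration, not the marking scheme of the invention: the text does not say what form the marks take, so the function name `mark_non_chinese`, the `#` marker, the choice to collapse each run of non-Chinese characters into one marker, and the use of the CJK Unified Ideographs range U+4E00 to U+9FFF as the definition of "Chinese character" are all assumptions.

```python
import re

# Assumption: a "Chinese character" is any code point in the basic
# CJK Unified Ideographs block (U+4E00..U+9FFF). Everything else
# (punctuation, spaces, digits, ASCII, other scripts) is "non-Chinese".
NON_CHINESE_RUN = re.compile(r"[^\u4e00-\u9fff]+")


def mark_non_chinese(text, marker="#"):
    """Replace each maximal run of non-Chinese characters with one marker.

    The marker and the run-collapsing behaviour are hypothetical choices
    made for illustration only.
    """
    return NON_CHINESE_RUN.sub(marker, text)


if __name__ == "__main__":
    sample = "竹秆高3-5m，直立。"
    print(mark_non_chinese(sample))
```

Collapsing a whole run into a single marker keeps the marked positions usable as boundaries between stretches of Chinese text; a per-character mark would work equally well if downstream steps need the original length preserved.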
[0058] First, the corpus is randomly split at a ratio of 4:1 into two parts, a training corpus and a test corpus;
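A random 4:1 split of this kind might look like the sketch below. The unit being split (sentences, paragraphs, or documents) is not stated in the text, so the function simply takes a sequence of units; the name `split_corpus`, its parameters, and the fixed seed are assumptions added for illustration and reproducibility.

```python
import random


def split_corpus(units, train_parts=4, test_parts=1, seed=0):
    """Randomly split corpus units into (training, test) at train:test ratio.

    `units` is any sequence of corpus items; what an item is (sentence,
    paragraph, ...) is an assumption left to the caller.
    """
    shuffled = list(units)
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    sentences = [f"sentence {i}" for i in range(10)]
    train, test = split_corpus(sentences)
    print(len(train), len(test))
```

With integer division the training part is rounded down, so a corpus whose size is not a multiple of five leaves the remainder in the test part.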
[00...