Method for constructing a Vietnamese dependency treebank based on a Chinese-Vietnamese word-aligned corpus
A technique involving word alignment and corpora, applied in special data processing applications, instruments, electronic digital data processing, etc. It can address the scarcity of Vietnamese dependency treebanks and the resulting difficulty of dependency parsing, with the effect of saving manpower and treebank construction time while improving accuracy.
Examples
Embodiment 1
[0025] Embodiment 1: As shown in Figures 1-3, a method for constructing a Vietnamese dependency treebank based on a Chinese-Vietnamese word-aligned corpus. The specific steps of the method are as follows:
[0026] Step1. First, build a library of Chinese-Vietnamese word-aligned parallel sentence pairs;
[0027] Step1.1. Collect Chinese-Vietnamese parallel sentence pairs;
[0028] Step1.2. Construct the word-aligned library: run GIZA++ word alignment training on the Chinese-Vietnamese parallel sentence pairs, then manually adjust the result to obtain the library of Chinese-Vietnamese word-aligned parallel sentence pairs;
[0029] Step2. Build a Chinese dependency tree corpus;
[0030] Step2.1. Perform segmentation on the Chinese sentences in the Chinese-Vietnamese word-aligned parallel sentence pair library; ...
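As a rough illustration of Step1.2, the sketch below shows one way a word-aligned sentence pair could be stored and manually corrected after automatic alignment. It assumes the aligner output has already been converted to space-separated `i-j` index pairs (Chinese index, then Vietnamese index); that format, the `AlignedPair` class, and the `parse_links` and `adjust` helpers are all hypothetical and are not taken from the patent or from GIZA++ itself.

```python
from dataclasses import dataclass, field


@dataclass
class AlignedPair:
    """One Chinese-Vietnamese sentence pair with word-alignment links (hypothetical)."""
    zh: list[str]                      # Chinese tokens
    vi: list[str]                      # Vietnamese tokens
    links: set[tuple[int, int]] = field(default_factory=set)


def parse_links(text: str) -> set[tuple[int, int]]:
    """Parse 'i-j' pairs (an assumed format) into (zh_index, vi_index) tuples."""
    links = set()
    for token in text.split():
        i, j = token.split("-")
        links.add((int(i), int(j)))
    return links


def adjust(pair: AlignedPair, add=(), remove=()) -> AlignedPair:
    """Apply manual corrections: drop wrong links, add missing ones, validate indices."""
    links = (pair.links - set(remove)) | set(add)
    for i, j in links:
        if not (0 <= i < len(pair.zh) and 0 <= j < len(pair.vi)):
            raise ValueError(f"link {i}-{j} is outside the sentence pair")
    return AlignedPair(pair.zh, pair.vi, links)


# Toy example: the automatic alignment wrongly links the third Chinese token
# to the second Vietnamese token; an annotator corrects it.
auto = AlignedPair(
    zh=["我", "爱", "你"],
    vi=["Tôi", "yêu", "bạn"],
    links=parse_links("0-0 1-1 2-1"),
)
fixed = adjust(auto, add=[(2, 2)], remove=[(2, 1)])
```

Keeping the links as a set of index pairs makes a manual adjustment a simple add or remove operation, and the index check catches annotation slips before the pair enters the library.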
Embodiment 2
[0036] Embodiment 2: As shown in Figures 1-3, a method for constructing a Vietnamese dependency treebank based on a Chinese-Vietnamese word-aligned corpus. The specific steps of the method are as follows:
[0037] Step1. First, build a library of Chinese-Vietnamese word-aligned parallel sentence pairs;
[0038] Step1.1. Collect Chinese-Vietnamese parallel sentence pairs;
[0039] Step1.2. Construct the word-aligned library: run GIZA++ word alignment training on the Chinese-Vietnamese parallel sentence pairs, then manually adjust the result to obtain the library of Chinese-Vietnamese word-aligned parallel sentence pairs;
[0040] Step2. Build a Chinese dependency tree corpus;
[0041] Step2.1. Perform segmentation on the Chinese sentences in the Chinese-Vietnamese word-aligned parallel sentence pair library; ...