Lao-Chinese bilingual corpus construction method and device with Thai as the pivot language
A bilingual corpus construction method, applied in the field of natural language processing, that addresses the scarcity of Lao-Chinese parallel resources and the difficulty of obtaining them.
Embodiment 1
[0060] Embodiment 1: As shown in Figures 1-6, a Lao-Chinese bilingual corpus construction method with Thai as the pivot includes the following steps:
[0061] Step1. Extract Thai sentences from the existing Chinese-Thai parallel corpus and perform Thai word segmentation on them;
[0062] As a preferred solution of the present invention, the specific sub-steps of Step1 are as follows:
[0063] Step1.1. Select Thai sentences with 20-50 characters from the existing Chinese-Thai bilingual parallel corpus;
[0064] Step1.2. Segment the selected Thai sentences into words using the language information processing platform for less widely used Southeast Asian languages developed by Kunming University of Science and Technology, available at http://222.197.219.24:8099/.
[0065] The present invention takes into account that Thai is written as a continuous script with no delimiters between words, so word-based translation and use in the model cannot be per...
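Steps 1.1 and 1.2 can be illustrated with a minimal Python sketch. It makes several assumptions that are not in the source: the corpus is held as (Chinese, Thai) string pairs, "characters" are counted as Unicode code points, and the function names are hypothetical. The segmenter is a greedy longest-match stand-in over a word list. It is not the Kunming University of Science and Technology platform, whose interface is not described here.

```python
from typing import Iterable, List, Set, Tuple

MIN_CHARS = 20  # lower bound from Step1.1
MAX_CHARS = 50  # upper bound from Step1.1


def select_thai_sentences(
    pairs: Iterable[Tuple[str, str]],
    min_chars: int = MIN_CHARS,
    max_chars: int = MAX_CHARS,
) -> List[Tuple[str, str]]:
    """Keep (Chinese, Thai) pairs whose Thai side has min_chars..max_chars characters.

    Assumption: length is measured in Unicode code points via len().
    """
    selected = []
    for zh, th in pairs:
        th = th.strip()
        if min_chars <= len(th) <= max_chars:
            selected.append((zh, th))
    return selected


def longest_match_segment(sentence: str, lexicon: Set[str]) -> List[str]:
    """Greedy longest-match segmentation (a stand-in, not the platform's method).

    At each position, take the longest lexicon entry that matches;
    fall back to a single character when nothing matches.
    """
    max_len = max((len(w) for w in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens
```

In practice the stand-in segmenter would be replaced by a call to whichever Thai word segmentation tool is available; the length filter is independent of that choice.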