Chinese text error detection method and system based on word order and semantic conjoint analysis
This joint-analysis text technology, applied in semantic analysis, natural language data processing, instruments, etc., addresses problems such as unsatisfactory weight distribution and poor error-detection performance, and achieves the effects of anti-interference ability, increased quantity, and deeper understanding.
Embodiment Construction
[0071] The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings. The specific process is shown in Figure 1, where:
[0072] Step 1: Preprocess the input data obtained by the model.
[0073] Preprocessing is divided into the following four steps:
[0074] 1-1 Create a dictionary. Perform word segmentation on all text sentences to construct a set of candidate Chinese characters. Count the frequency of occurrence of each word in this set, filter out the words whose frequency is lower than 3, and deduplicate the filtered set to form the Chinese character set D(w). Insert special symbols into D(w), such as the "START" start marker, the "END" terminator, the "CLS" spacer, the "UNKNOW" unknown-character marker, and the "PAD" filler. These symbols help the computer fit the text better. Then use the index to mark each word in the Chinese wo...
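The dictionary step above can be sketched in Python as follows. This is only an illustrative sketch, not the patent's implementation: the function names `segment`, `build_dictionary`, and `encode` are hypothetical, the character-level split stands in for whatever word segmentation tool is actually used, and placing the special symbols at the lowest indices is an assumption, since the source text is cut off before it describes the indexing in full.

```python
from collections import Counter

# Special symbols named in the source; their order and indices are assumed.
SPECIAL_TOKENS = ["START", "END", "CLS", "UNKNOW", "PAD"]
MIN_FREQ = 3  # words with frequency lower than 3 are filtered out


def segment(sentence):
    """Stand-in segmenter (assumption): split into non-space characters."""
    return [ch for ch in sentence if not ch.isspace()]


def build_dictionary(sentences, min_freq=MIN_FREQ):
    """Count tokens, drop rare ones, deduplicate, and assign indices."""
    counts = Counter(tok for s in sentences for tok in segment(s))
    # Counter keys are already unique; sorting makes the indices reproducible.
    kept = sorted(tok for tok, c in counts.items() if c >= min_freq)
    vocab = SPECIAL_TOKENS + kept
    return {tok: idx for idx, tok in enumerate(vocab)}


def encode(sentence, word_to_index):
    """Map each token to its index, falling back to "UNKNOW" if absent."""
    unk = word_to_index["UNKNOW"]
    return [word_to_index.get(tok, unk) for tok in segment(sentence)]


if __name__ == "__main__":
    corpus = ["我爱你", "我爱他", "我爱她"]
    d_w = build_dictionary(corpus)
    print(d_w)
    print(encode("我爱你", d_w))
```

In this toy corpus only "我" and "爱" occur three times, so the remaining characters are filtered out of the dictionary and are encoded with the index of "UNKNOW".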