Construction method of power grid equipment word segmentation dictionary and fault case database
A technology for power grid equipment and word segmentation dictionaries, applied in fields such as neural learning methods, neural architectures, and semantic tool creation. It addresses problems such as insufficient mining of related information, insufficient support for maintenance decision-making, and inefficient retrieval and browsing, with the effects of facilitating intuitive understanding, increasing application value, and improving the accuracy of word segmentation.
Examples
Embodiment 1
[0054] The text of power grid equipment fault and defect cases contains many specialized terms that are usually absent from the lexicons of existing general-purpose word segmentation tools. If such a tool is used to segment text from the power grid field, many professional terms will be segmented incorrectly, which undermines the reliability of subsequent word vector training and text classification. Therefore, before word segmentation, the general-domain dictionary of a mature word segmentation tool is expanded with domain-specific words. Constructing this power grid word segmentation dictionary is crucial to the accuracy of the subsequent steps.
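The effect of expanding the dictionary can be illustrated with a minimal sketch. It assumes a forward maximum matching segmenter, which the source does not name as its tool; the lexicon contents, the domain terms, and the function name `segment` are all hypothetical.

```python
def segment(text, dictionary, max_len=6):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical general-purpose lexicon (assumption, not from the source).
general_dict = {"变压器", "变压", "温度", "过高"}
# Hypothetical domain terms: "main transformer", "oil temperature".
domain_terms = {"主变压器", "油温"}

text = "主变压器油温过高"  # "main transformer oil temperature too high"
baseline = segment(text, general_dict)
extended = segment(text, general_dict | domain_terms)

print(baseline)  # the domain terms are split into fragments
print(extended)  # the domain terms survive as single tokens
```

With the general lexicon alone, "主变压器" and "油温" are broken into fragments. Once the domain terms are added, both survive as single tokens. Mature tools typically offer a user-dictionary hook for the same purpose, for example `jieba.load_userdict` in jieba (a tool the source does not name).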
[0055] A semi-supervised method that combines automatic labeling by a named entity recognition model with manual screening is used to construct the power grid domain dictionary. The process is shown in Figure 2 of the accompanying drawings. Solving the professional compliance of the id...
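One way to picture the semi-supervised loop described above is the following sketch. It assumes the named entity recognition model has already produced `(term, label)` pairs. The label names, the frequency threshold `min_count`, and the `approve` callback standing in for the human reviewer are hypothetical and not taken from the source.

```python
from collections import Counter


def propose_candidates(tagged_entities, known_words, min_count=2):
    """Rank automatically labeled terms that the existing dictionary lacks."""
    counts = Counter(term for term, _label in tagged_entities)
    return [
        (term, n)
        for term, n in counts.most_common()
        if term not in known_words and n >= min_count
    ]


def screen(candidates, approve):
    """Keep only the candidates accepted by the manual screening step."""
    return {term for term, _count in candidates if approve(term)}


# Hypothetical output of an NER model over a batch of fault case texts.
tagged = [
    ("主变压器", "EQUIP"), ("油温", "STATE"), ("主变压器", "EQUIP"),
    ("变压器", "EQUIP"), ("油温", "STATE"), ("套管", "PART"),
    ("主变压器", "EQUIP"),
]
known = {"变压器"}  # already in the general dictionary

candidates = propose_candidates(tagged, known)
# A lambda stands in for the human reviewer here; in practice this is manual.
new_terms = screen(candidates, approve=lambda term: term != "油温")
domain_dict = known | new_terms
```

The frequency threshold only reduces the reviewer's workload. The decision to admit a term into the dictionary stays with the manual screening step.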
Embodiment 2
[0067] Further, in step b), before text information is extracted, the pictures, file names, and author information are extracted and stored, and noise such as labels and typos is filtered out. The extracted and filtered text is then segmented by the word segmentation tool into which the power grid word segmentation dictionary has been imported, which completes text preprocessing. The fault and defect cases actually handled are usually written by hand and stored as rich-text files, in formats such as PDF and Word, that contain tables, pictures, text, and labels.
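The noise-filtering part of this preprocessing might look like the sketch below. It assumes that the "labels" are angle-bracket markup tags and that typos are corrected from a hand-maintained table. Both are assumptions, and the names `TAG_RE`, `TYPO_FIXES`, and `clean_case_text` are hypothetical. Extraction of pictures, file names, and author information is not shown.

```python
import re

# Assumption: labels appear as angle-bracket markup left over from rich text.
TAG_RE = re.compile(r"<[^>]+>")
# Hypothetical typo table: a misspelling of "transformer" mapped to its fix.
TYPO_FIXES = {"变压气": "变压器"}


def clean_case_text(raw, typo_fixes=TYPO_FIXES):
    """Strip markup labels, fix known typos, and normalize whitespace."""
    text = TAG_RE.sub("", raw)
    for wrong, right in typo_fixes.items():
        text = text.replace(wrong, right)
    # Collapse runs of spaces and newlines left behind by removed markup.
    return re.sub(r"\s+", " ", text).strip()


raw = "<p>主变压气  油温过高</p>\n<img src='a.png'>"
cleaned = clean_case_text(raw)
```

The cleaned string is what would then be passed to the dictionary-backed word segmentation tool.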
Embodiment 3
[0069] Further, step c) comprises the steps:
[0070] c-1) The purpose of extracting information from power grid equipment fault text data is to analyze and process the unstructured text, extract the information that is meaningful for describing equipment faults and defects, and form structured data, which facilitates accurate retrieval of specific target content later. Because power grid fault texts are described in diverse ways, a unified attribute template is used to extract attributes from the text data. The attributes are divided into three types: numeric, phrase-type, and sentence-type state quantity attributes. Numeric state quantity attributes are extracted by a rule-based method, phrase-type state quantity attributes are extracted by an entity matching method based on grammar rules, and sentence-type state quantity attributes are classified using distributed text representation and a neural network model. I...