
Chinese electronic medical record entity extraction method based on character and word information fusion

An electronic medical record entity extraction technology, applied in neural learning methods, electrical digital data processing, instruments, etc. It addresses the insufficient use of Chinese word information, reduces manual feature extraction, improves recognition results, and extracts entities more accurately.

Pending Publication Date: 2020-06-05
CENT SOUTH UNIV

AI Technical Summary

Problems solved by technology

[0006] The technical problem to be solved by the present invention is to provide, in view of the deficiencies of the prior art, a Chinese electronic medical record entity extraction method based on character and word information fusion. The method addresses two shortcomings of existing methods, namely manual feature extraction and insufficient use of Chinese word information, so that entities in Chinese electronic medical records are extracted more accurately.



Examples


Embodiment Construction

[0043] This embodiment targets entity extraction from Chinese electronic medical records and extracts two types of entities: body parts and medical discoveries. The overall implementation process is shown in Figure 1. The sentence "upper abdominal pain, no radiation" serves as the running example. The specific implementation steps are as follows, and the intermediate results of each step for this example are listed in Table 1.
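To illustrate what extracting the two entity types means at the output stage, here is a minimal sketch of decoding BIO labels into entities. The tag names (`BODY`, `FIND`), the English tokens standing in for Chinese characters, the tag sequence itself, and the function name `decode_bio` are all illustrative assumptions; they are not the values from Table 1.

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) pairs."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Close the entity being built, if any, and reset the buffer.
        nonlocal current_type, current_tokens
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag ends the open entity.
            flush()
    flush()
    return entities


# Hypothetical labels for the running example sentence.
tokens = ["upper", "abdominal", "pain", ",", "no", "radiation"]
tags = ["B-BODY", "I-BODY", "B-FIND", "O", "O", "O"]
entities = decode_bio(tokens, tags)
```

With these assumed tags, `entities` holds one body-part span and one medical-discovery span. An `I-` tag that does not continue an open entity of the same type is discarded here, which is one of several reasonable conventions.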

[0044] Step 1: Use the character-word joint training model (CWE) to train Chinese word vectors and character vectors, yielding a word vector table and a character vector table. In this example, the word vectors and the character vectors both have length 100; a schematic diagram is shown in Figure 2.
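The two lookup tables from Step 1 can be pictured with the following sketch. It does not train anything: random placeholder values stand in for learned vectors, and the toy segmented corpus is hypothetical. The helper `cwe_compose` reflects the commonly described CWE formulation, in which a word is represented by averaging its word vector with the mean of its character vectors; that detail is an assumption about CWE and is not stated in the text above.

```python
import random

DIM = 100  # vector length used in this example


def build_tables(segmented_sentences, dim=DIM, seed=0):
    """Create word and character vector tables with placeholder values."""
    rng = random.Random(seed)
    word_table, char_table = {}, {}
    for words in segmented_sentences:
        for word in words:
            if word not in word_table:
                word_table[word] = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
            for ch in word:
                if ch not in char_table:
                    char_table[ch] = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    return word_table, char_table


def cwe_compose(word, word_table, char_table):
    """Average the word vector with the mean of its character vectors."""
    char_vecs = [char_table[ch] for ch in word]
    char_mean = [sum(column) / len(char_vecs) for column in zip(*char_vecs)]
    return [0.5 * (w + c) for w, c in zip(word_table[word], char_mean)]


# Hypothetical pre-segmented corpus (one list of words per sentence).
corpus = [["上腹", "疼痛"], ["无", "放射"]]
word_table, char_table = build_tables(corpus)
composed = cwe_compose("疼痛", word_table, char_table)
```

In a real pipeline the tables would come from training on a large medical corpus; only the table shapes and the lookup pattern carry over from this sketch.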

[0045] Step 2: Preprocess the text corpus, divide the document into sentences and characters, and use the character vector table from step 1 to retrieve the corresponding character vectors; in this example, divide the sentence "upper abdomina...
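The preprocessing in Step 2 might look roughly like the sketch below, which splits a document into sentences, splits each sentence into characters, and looks each character up in a character vector table. The punctuation set, the zero vector used for unknown characters, the toy table, and the sample text are assumptions made for illustration.

```python
import re

DIM = 100  # vector length used in this example


def split_sentences(document):
    """Split after sentence-ending punctuation, keeping the punctuation."""
    parts = re.split(r"(?<=[。！？；!?;])", document)
    return [part.strip() for part in parts if part.strip()]


def lookup_characters(sentence, char_table, dim=DIM):
    """Return one vector per character; unknown characters map to zeros."""
    unknown = [0.0] * dim
    return [char_table.get(ch, unknown) for ch in sentence]


# Toy character table: each known character gets a constant placeholder vector.
char_table = {ch: [float(i)] * DIM for i, ch in enumerate("上腹疼痛无放射", start=1)}

document = "上腹疼痛。无放射。"  # hypothetical sample text
sentences = split_sentences(document)
vectors = lookup_characters(sentences[0], char_table)
```

Each sentence becomes a sequence of 100-dimensional vectors, one per character, which is the input shape the later steps operate on.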



Abstract

The invention discloses a Chinese electronic medical record entity extraction method based on character and word information fusion, and the method comprises the steps: obtaining a Chinese word vector and a character vector in a Chinese electronic medical record, and obtaining a word vector table and a character vector table; segmenting a document into sentences and characters, and looking up the character vector table for corresponding character vectors; obtaining a context word of each character and carrying out feature extraction on a word vector of the word to obtain context word information of each character; obtaining attention weight, and combining the weight with the character vector of each character and the context word information of the corresponding character to obtain new character information representation; extracting sequence information of each character; and labeling and identifying the sequence information by using a conditional random field to obtain a BIO category of each character in the sentence, and decoding an identification result to obtain a specific medical entity. According to the method, manual extraction is reduced, Chinese character and word information can be automatically learned and combined, character and word features are extracted, and the Chinese medical named entity identification result is effectively improved.
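The abstract does not spell out how the attention weight is computed or combined, so the following is only one plausible reading: a scalar gate, computed from the character vector and its context word information, blends the two into a new character representation. The parameter names `w_char`, `w_word`, and `bias`, the sigmoid gate, and the convex-combination form are all assumptions rather than the claimed method.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse(char_vec, word_info, w_char, w_word, bias=0.0):
    """Blend a character vector with its context word information.

    Returns the attention weight and the fused representation
    weight * char_vec + (1 - weight) * word_info.
    """
    score = dot(w_char, char_vec) + dot(w_word, word_info) + bias
    weight = 1.0 / (1.0 + math.exp(-score))  # sigmoid keeps the weight in (0, 1)
    fused = [weight * c + (1.0 - weight) * w for c, w in zip(char_vec, word_info)]
    return weight, fused


# Tiny 2-dimensional example; real vectors would have length 100.
char_vec = [1.0, 0.0]
word_info = [0.0, 1.0]
weight, fused = fuse(char_vec, word_info, w_char=[0.0, 0.0], w_word=[0.0, 0.0])
```

With all-zero gate parameters the weight is 0.5 and the fused vector is the plain average of the two inputs; learned parameters would let the model lean toward character or word information per position.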

Description

Technical field

[0001] The invention belongs to the field of natural language processing, and in particular relates to a Chinese electronic medical record entity extraction method based on character and word information fusion.

Background technique

[0002] Electronic medical records (EMR) are text records of medical activities. The development of information technology has promoted the adoption of electronic medical records in China, and the value of medical information is becoming more and more important. Extracting information from clinical notes is very difficult because they are unstructured. How to effectively extract a patient's disease information is therefore the primary requirement for studying the cause and development of patients' diseases. EMR entity extraction is the basis for studying patient diseases. It has a wide range of application scenarios, such as medical information retrieval, question answering systems, clinical decision support, etc....


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16H10/60, G06F40/289, G06F40/284, G06N3/04, G06N3/08
CPC: G16H10/60, G06N3/08, G06N3/044, G06N3/045
Inventor: 高琰, 王艳东, 唐琎
Owner: CENT SOUTH UNIV