Medical data similarity detection system and method based on bit string hash

A bit string hashing technique for medical data, applied in the field of text similarity detection. It addresses the low precision, missing semantic information, and unsatisfactory results of existing methods, and improves the quality of both text discrimination and similarity detection. The method is described as scientific and reasonable.

Pending Publication Date: 2020-11-06
NORTHEAST DIANLI UNIVERSITY

AI Technical Summary

Problems solved by technology

[0004] At present, text similarity retrieval methods fall mainly into two categories. The first is the traditional approach based on keyword matching. It judges similarity from the parts that two texts share, using the co-occurrence and degree of repetition of strings as the similarity measure. Such methods compare texts only at the literal level and ignore their semantic information, so the results are not ideal. The second uses a vector space model to convert text features into vectors. Such methods work well for text similarity calculation in general domains, but their accuracy is low in vertical, specialized domains. Existing similarity calculation methods for Chinese medical text generally lose semantic information, so the calculated similarity is inaccurate and does not truly reflect how similar the medical texts are.
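The two categories described above can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the whitespace tokenization, and the sample sentences are assumptions chosen only for illustration. Jaccard overlap stands in for literal keyword matching, and cosine similarity over raw term counts stands in for a simple vector space model.

```python
import math
from collections import Counter


def jaccard_similarity(tokens_a, tokens_b):
    """Literal overlap: shared distinct tokens divided by all distinct tokens."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(tokens_a, tokens_b):
    """Vector space similarity over raw term-frequency vectors."""
    vec_a, vec_b = Counter(tokens_a), Counter(tokens_b)
    # Only tokens present in both texts contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample texts: similar meaning, partly different wording.
doc_a = "patient reports chest pain and shortness of breath".split()
doc_b = "patient reports chest discomfort and difficulty breathing".split()

keyword_score = jaccard_similarity(doc_a, doc_b)
vector_score = cosine_similarity(doc_a, doc_b)
```

Both scores depend only on tokens that match exactly, so "pain" and "discomfort" contribute nothing to either one. That is the kind of missing semantic information the paragraph above refers to.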



Embodiment Construction

[0038] The following detailed description refers to the accompanying drawings, which form part of the specification. In the drawings, the same or similar symbols generally denote the same or similar components, unless the specification states otherwise. The illustrative embodiments described in the detailed description, drawings, and claims should not be considered limitations on the application. Other embodiments of the application may be employed, and other changes may be made, without departing from the spirit or scope of the subject matter presented herein. It should be readily understood that the aspects of the application generally described in this specification and illustrated in the accompanying drawings can be arranged, substituted, combined, and designed in a wide variety of configurations, all of which are expressly contemplated and form part of this application.

[0039] The present inven...


Abstract

The invention relates to the field of text similarity detection, in particular to a medical data similarity detection system and method based on bit string hash. The system comprises a data storage module, a data preprocessing module, a text feature extraction module, a hash processing module, a text similarity calculation module and a similarity visualization module. The data storage module stores medical text data. The data preprocessing module performs dimension reduction on the text and removes private information. The text feature extraction module forms a document-feature matrix from the documents, their features and the feature weights. The hash processing module hashes the texts. The text similarity calculation module maps the documents to digital fingerprints, calculates Hamming distances and divides the documents into similarity groups. The similarity visualization module displays the medical texts visually. Because the text is hashed, a set of texts highly similar to a target text can be found within massive medical text data, which improves the efficiency of medical question retrieval.
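The fingerprint, Hamming distance, and grouping steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses a SimHash-style weighted bit vote, which is one common way to build such fingerprints, and the source does not confirm which hash the patent uses. The 64-bit width, the MD5-based token hash, the greedy grouping rule, and all function names are hypothetical.

```python
import hashlib
from collections import Counter

FINGERPRINT_BITS = 64  # assumed fingerprint width, not stated in the source


def token_hash(token):
    """Map a token to a 64-bit integer (MD5 is an arbitrary, assumed choice)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fingerprint(weighted_features):
    """SimHash-style fingerprint from a {feature: weight} mapping."""
    totals = [0.0] * FINGERPRINT_BITS
    for token, weight in weighted_features.items():
        h = token_hash(token)
        for bit in range(FINGERPRINT_BITS):
            # Each feature votes +weight or -weight on every bit position.
            totals[bit] += weight if (h >> bit) & 1 else -weight
    value = 0
    for bit, total in enumerate(totals):
        if total > 0:
            value |= 1 << bit
    return value


def hamming_distance(a, b):
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def group_similar(fingerprints, max_distance):
    """Greedy grouping: a document joins the first group whose
    representative fingerprint is within max_distance bits."""
    groups = []  # list of (representative fingerprint, member ids)
    for doc_id, fp in fingerprints.items():
        for representative, members in groups:
            if hamming_distance(representative, fp) <= max_distance:
                members.append(doc_id)
                break
        else:
            groups.append((fp, [doc_id]))
    return [members for _, members in groups]


# Hypothetical usage: raw term counts stand in for the feature weights.
weights = Counter("chest pain chest tightness".split())
example_fp = fingerprint(weights)
```

A small Hamming distance between two fingerprints suggests that the underlying weighted feature sets are similar, so candidate matches for a target text can be found by comparing compact integers instead of full documents. The grouping threshold is a tuning parameter and would need to be chosen empirically.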

Description

Technical field

[0001] The invention relates to the field of text similarity detection, in particular to a medical data similarity detection system and method based on bit string hashing.

Background technique

[0002] With the rapid development of online medical care, text data in the medical field is accumulating day by day. The potential value it contains can help reduce communication costs between doctors and patients, help the medical community refine its operations, and provide more targeted services. Medical texts have indistinct categories, are markedly unstructured, give high discriminative weight to low-frequency words, and widely suffer from missing and inconsistent information. How to accurately calculate the similarity between medical texts, and how to retrieve relevant medical information quickly and accurately, are urgent problems to be solved. In order to solve the above problems, this paper proposes a me...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16H50/70, G06F40/194, G06F40/284, G06K9/62
CPC: G16H50/70, G06F40/194, G06F40/284, G06F18/22
Inventor: 周铁华, 王玲, 李建, 刘文强
Owner: NORTHEAST DIANLI UNIVERSITY