Heuristic breadth-first searching method for cancer-related genes

A heuristic breadth-first search technique for tumor-related genes, applied in special-purpose data processing applications, instruments, and electrical digital data processing. It addresses problems such as algorithm infeasibility, the curse of dimensionality, and overfitting, with the aim of supporting accurate diagnosis and individualized chemotherapy.

Inactive Publication Date: 2013-07-03
HEFEI INSTITUTES OF PHYSICAL SCIENCE - CHINESE ACAD OF SCI
Cites: 0 · Cited by: 8

AI Technical Summary

Problems solved by technology

However, because of the curse of dimensionality inherent in gene expression profile data sets, selecting the smallest gene subset from thousands of genes raises two problems: overfitting and selection bias.
Although their method can obtain highly unbiased results, when the initial number of selected genes is very large (for example, more than 300 genes), its computational cost is too high for the algorithm to be feasible.
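To give a rough sense of why large initial gene sets become costly, the following sketch counts how many candidate gene subsets exist up to a given size. It assumes the cost is driven by enumerating subsets, which the text does not state explicitly; the function name `subset_count` is hypothetical and used only for illustration.

```python
from math import comb


def subset_count(n_genes: int, max_size: int) -> int:
    """Count the non-empty gene subsets containing at most max_size genes."""
    return sum(comb(n_genes, k) for k in range(1, max_size + 1))


if __name__ == "__main__":
    # Assumption: 300 initial genes, matching the example figure in the text.
    for size in (1, 2, 3, 4):
        print(f"subsets of up to {size} genes: {subset_count(300, size)}")
```

The count grows combinatorially with subset size, which is one plausible reason an unrestricted search stops being practical at this scale.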


Examples


Embodiment Construction

[0028] To explain the technical solution more clearly, the present invention first formally describes the classification problem to be solved, then introduces the search strategy of the HBSA algorithm, and then gives the implementation process of the HBSA algorithm. On this basis, an HBSA-based ensemble classifier construction method and an HBSA-based gene ranking method are designed to obtain unbiased prediction accuracy and to discover important tumor-related genes. Experimental results show the feasibility and effectiveness of the technical solution of the present invention, and comparison with other related methods shows its advantages. The selected genes can be further validated by biomedical analysis in three respects: the function of individual genes, pathway analysis, and protein networks.
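As a rough illustration of what a breadth-first search over gene subsets can look like, the sketch below grows subsets one gene at a time and keeps only the best-scoring subsets at each level. This is a generic width-limited breadth-first search, not the HBSA procedure itself, whose details are not given in this excerpt. The names `breadth_first_subset_search`, `score`, `width`, and `max_size` are hypothetical, and the toy scoring function stands in for whatever classification accuracy estimate would be used in practice.

```python
from typing import Callable, FrozenSet, List, Sequence


def breadth_first_subset_search(
    genes: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    width: int,
    max_size: int,
) -> List[List[FrozenSet[str]]]:
    """Level-wise search: extend each kept subset by one gene, keep the top `width`."""
    frontier: List[FrozenSet[str]] = [frozenset()]
    levels: List[List[FrozenSet[str]]] = []
    for _ in range(max_size):
        # Expand every subset in the current level by each gene it lacks.
        candidates = {s | {g} for s in frontier for g in genes if g not in s}
        if not candidates:
            break
        # Heuristic pruning: keep the best-scoring subsets; ties broken by gene names.
        frontier = sorted(candidates, key=lambda s: (-score(s), sorted(s)))[:width]
        levels.append(frontier)
    return levels


if __name__ == "__main__":
    # Toy weights (assumed); a real score would come from a classifier evaluation.
    weights = {"a": 9, "b": 8, "c": 1, "d": 2}
    toy_score = lambda subset: sum(weights[g] for g in subset)
    for level in breadth_first_subset_search(list(weights), toy_score, width=2, max_size=2):
        print([sorted(s) for s in level])
```

Limiting the number of subsets kept per level is the design choice that keeps the search tractable; the trade-off is that a subset discarded early can never be recovered later.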

[0029] 1. Problem description

[0030] Let G = {g_1, ..., g_n} represent a set of genes and S = {s_1, ..., s_m} represent a set of samples, where |G|...



Abstract

The invention relates to a heuristic breadth-first search method for cancer-related genes. The method measures each gene by how often it appears in the selected gene subsets, and genes with higher appearance frequency are considered the most important cancer-related genes. On this basis, a classifier is designed and an HBSA-based gene ranking method is established. Studies have shown that informative gene selection plays an important role in improving classification performance, and that such genes may serve as important clinical diagnostic markers for tumors, so finding the smallest gene subset with the highest classification performance is a very important research objective. Experimental results indicate that the heuristic breadth-first search method not only achieves good generalization performance but also discovers important tumor genes. The relationship between the appearance frequency of the selected genes and the number of genes follows a power-law distribution. The genes in subsets with extremely high classification accuracy are closely related to specific tumor subtypes, and some are important genes directly related to the tumor.
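The frequency-based measure described in the abstract can be sketched in a few lines. The example assumes the selected gene subsets are already available as collections of gene identifiers; the function name `rank_genes_by_frequency` and the sample identifiers are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_genes_by_frequency(subsets: Iterable[Iterable[str]]) -> List[Tuple[str, int]]:
    """Rank genes by the number of selected subsets in which they appear."""
    # set(subset) ensures a gene is counted at most once per subset.
    counts = Counter(gene for subset in subsets for gene in set(subset))
    # Most frequent first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Made-up subsets for illustration only, not data from the patent.
    selected_subsets = [("g1", "g7"), ("g1", "g3"), ("g1", "g7", "g9"), ("g3", "g7")]
    for gene, frequency in rank_genes_by_frequency(selected_subsets):
        print(gene, frequency)
```

Genes at the top of this ranking are the ones the abstract treats as the most important candidates.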

Description

[0001] Field

[0002] The present invention relates to technologies and theories such as tumor gene expression profile data collection, importance-based selection of tumor-related genes, and machine learning. In particular, the system uses a heuristic breadth-first search method tailored to the characteristics of tumor gene expression profile sample sets to discover important tumor-related genes and to classify tumor subtypes according to these genes. The system therefore belongs to the application field of pattern recognition in biomedicine.

Background technique

[0003] From the perspective of molecular biology, tumors are a class of complex genetic diseases in which DNA damage on certain chromosomes causes abnormal gene expression in cells, resulting in uncontrolled cell growth, lack of differentiation, and abnormal proliferation. Tumors are therefore also a systemic biological disease, and so far the mechanism of tumor development is still not fully understood. ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F19/24
Inventor: 黄上峰, 王树林, 李雪玲, 赵俊, 邱萍, 王耀雄, 葛运建, 双丰, 朱旻
Owner: HEFEI INSTITUTES OF PHYSICAL SCIENCE - CHINESE ACAD OF SCI