Machine learning methods and systems for identifying patterns in data using a plurality of learning machines wherein the learning machine that optimizes a performance function is selected

A learning machine and learning system technology, applied in the field of machine learning methods and systems. It addresses the limited number of methods available for searching across diverse experiments, the difficulty of implementing search methods, and the frequent inability of researchers to adequately query data, with the aims of reducing dimensionality and maximizing a performance function.

Active Publication Date: 2013-02-26
DIGITAL INFUZION
7 Cites, 74 Cited by

AI Technical Summary

Benefits of technology

This patent describes methods that use machine learning to identify patterns in data. Multiple learning machines are trained to identify which patterns are present in the data samples. A learning machine is then selected using a performance function that maximizes the difference between data groups; training is validated with cross-validation, and the most important variables are selected. The trained learning machines can then be used to query databases to identify patterns in data. The technical effect is to provide efficient and accurate methods for identifying patterns in data using machine learning techniques.
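The selection step summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented method: the "learning machines" here are hypothetical nearest-centroid classifiers that differ in which feature columns they use, and the performance function is assumed to be k-fold cross-validated accuracy. The names `fit_centroids`, `cv_accuracy`, `select_machine`, and the toy data are invented for the example.

```python
from statistics import mean

def fit_centroids(X, y):
    """Train a nearest-centroid classifier: one mean vector per class."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in by_class.items()}

def predict(centroids, row):
    """Return the class whose centroid is closest to `row`."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=sq_dist)

def cv_accuracy(features, X, y, k=5):
    """Assumed performance function: k-fold cross-validated accuracy of a
    nearest-centroid machine restricted to the given feature columns."""
    Xf = [[row[j] for j in features] for row in X]
    n, correct = len(Xf), 0
    for fold in range(k):
        held_out = set(range(fold, n, k))          # every k-th sample
        train = [i for i in range(n) if i not in held_out]
        model = fit_centroids([Xf[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, Xf[i]) == y[i] for i in held_out)
    return correct / n

def select_machine(candidates, X, y):
    """Score every candidate machine and return the best one with all scores."""
    scores = {name: cv_accuracy(feats, X, y) for name, feats in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Toy data (hypothetical): column 0 separates the classes, column 1 does not.
X = [[0.1 * i, 3.0 * (i % 5)] for i in range(20)] + \
    [[4.0 + 0.1 * i, 3.0 * ((i + 2) % 5)] for i in range(20)]
y = ["A"] * 20 + ["B"] * 20

candidates = {"x0_only": [0], "x1_only": [1], "both": [0, 1]}
best, scores = select_machine(candidates, X, y)
print(best, scores)
```

Scoring candidates on held-out folds rather than on the training samples is what lets the selection favor machines that are more likely to generalize to data of unknown category.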

Problems solved by technology

Researchers are often unable to adequately query this data because they lack proper tools.
Currently, only a limited number of methods are available for searching across diverse gene expression experiments.
These experiments may include, for example, a study of diseased vs. normal tissue, the exposure of tissue culture cells to chemical compounds, or relationships between expression patterns in diseased cells and normal expression in other types of cells. Search methods may be difficult to implement, even where these experiments were conducted using the same type of array format.
While SVMs and other machine learning tools have gained popularity in recent years, the methods remain primarily used for microarray analysis and classification, and have not been fully developed to optimize trained machines for searching and querying.
Current searching methods still rely largely on annotations, descriptive information, or values ranges for specific fields, and large amounts of data are therefore often not being utilized.
In particular, the choice of which trained machine is most likely to generalize well to data of unknown category is often a difficult one.
Also, a determination of which features in the data are the most important, and how many and which should be used in training is normally a complex problem that is challenging even to those well skilled in the art.
These systems and methods provide complex data processing, but rely upon inefficient techniques, such as expansion of training data and selection of optimum machines based on test data sets.
The ability of a learning machine to discover knowledge from data is limited by the type of algorithm selected.
Methods are also lacking that allow for the creation of hypothetical patterns and searching data to see if such patterns, or similar patterns, exist in actuality.

Examples

[0211] Exemplary implementations in accordance with aspects of the invention will now be further described with respect to the following non-limiting examples.

[0212] The following data have been obtained by working with a prototype feature reduction method, and two published data sets. These experiments demonstrate that the feature reduction method matches or exceeds any other method available for analysis.

[0213] 1. Feature Selection for AML / ALL

[0214] A wrapper method that uses the gradient to assess feature importance, and that takes into account joint information to identify the best groups of genes, has been developed in accordance with the methods of the invention and has been incorporated into a query engine. In order to compare this method to current technology, such as the Weka package, the AML / ALL data set was obtained. This data was processed and the results compared to other published methods for the same set.
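As a rough illustration of what a gradient-based wrapper can look like, the sketch below repeatedly retrains a model on the surviving features and drops the one with the smallest gradient magnitude. The excerpt does not say which model the patent uses, so the logistic regression here is an assumption; for its linear decision function z = w·x + b, the gradient with respect to feature j is simply w_j. The function names and the toy data are hypothetical.

```python
import math

def train_logistic(X, y, lr=0.5, steps=200):
    """Fit w, b for p = sigmoid(w.x + b) by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, row)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - label   # prediction error
            for j in range(d):
                grad_w[j] += err * row[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def select_features(X, y, n_keep):
    """Wrapper loop: retrain on the surviving features, then drop the one
    whose decision-function gradient |dz/dx_j| = |w_j| is smallest."""
    kept = list(range(len(X[0])))
    while len(kept) > n_keep:
        w, _ = train_logistic([[row[j] for j in kept] for row in X], y)
        weakest = min(range(len(kept)), key=lambda j: abs(w[j]))
        kept.pop(weakest)
    return kept

# Toy data (hypothetical): column 0 carries the class signal; columns 1 and 2
# are uninformative by construction (identical in both classes).
patterns = [(2.0, 1, -1), (1.5, -1, 1), (2.5, 1, 1), (1.0, -1, -1)]
X = [list(p) for p in patterns] + [[-p[0], p[1], p[2]] for p in patterns]
y = [1] * 4 + [0] * 4

selected = select_features(X, y, n_keep=1)
print(selected)
```

Because the model is retrained on all surviving features together at each round, each feature is judged in the context of the others, which is the sense in which a wrapper takes joint information into account. Weight magnitudes depend on feature scale, so real data would normally be standardized first.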

[0215] Training data had 38 samples, 11 AML and 27 ALL, obtained from bone marrow. ...

Abstract

Methods for training machines to categorize data and/or recognize patterns in data, and machines and systems so trained. More specifically, variations of the invention relate to methods for training machines that include providing one or more training data samples encompassing one or more data classes, identifying patterns in the one or more training data samples, providing one or more data samples representing one or more unknown classes of data, identifying patterns in one or more of the data samples of unknown class(es), and predicting one or more classes to which the data samples of unknown class(es) belong by comparing patterns identified in said one or more data samples of unknown class with patterns identified in said one or more training data samples. Also provided are tools, systems, and devices, such as support vector machines (SVMs) and other methods and features, software implementing the methods and features, and computers or other processing devices incorporating and/or running the software, where the methods and features, software, and processors utilize specialized methods to analyze data.
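The "predict by comparing patterns" step in the abstract can be illustrated with a very small matcher. This is a sketch under assumptions, not the claimed method: patterns are compared with Pearson correlation, and the unknown sample receives the class of its best-matching training pattern. The function names and sample values are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length numeric patterns."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def predict_class(training, unknown):
    """Return (label, similarity) of the training pattern most similar to `unknown`.
    `training` is a list of (pattern, label) pairs."""
    pattern, label = max(training, key=lambda item: pearson(item[0], unknown))
    return label, pearson(pattern, unknown)

# Hypothetical labelled training patterns.
training = [
    ([1.0, 2.0, 3.0, 4.0], "rising"),
    ([4.0, 3.0, 2.0, 1.0], "falling"),
    ([1.0, 3.0, 3.0, 1.0], "peaked"),
]

# A sample of unknown class, on a different scale from the training patterns.
label, score = predict_class(training, [10.0, 21.0, 29.0, 41.0])
print(label, round(score, 3))
```

Correlation ignores differences in offset and scale, so a pattern can match even when its absolute values differ from the training samples.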

Description

RELATED APPLICATION DATA

[0001] This application claims priority under 35 U.S.C. §119 to U.S. Provisional Application No. 61/095,731 filed on Sep. 10, 2008, the contents of which are incorporated herein by reference in their entirety.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] Aspects of the present invention generally relate to methods and systems for training machines to categorize data and/or recognize patterns in data, and to machines and systems relating thereto. More specifically, exemplary aspects of the invention relate to methods and systems for training machines that include providing one or more training data samples encompassing one or more data classes, identifying patterns in the one or more training data samples, providing one or more data samples representing one or more unknown classes of data, identifying patterns in one or more of the data samples of unknown class(es), and predicting one or more classes to which the data samples of unknown clas...

Application Information

Patent Type & Authority: Patents (United States)
IPC(8): G06F15/18, G06N20/10
CPC: G06N99/005, G06N20/00, G06N20/10, G06F16/953, G06N3/02
Inventor: VIRKAR, HEMANT; STARK, KAREN; BORGMAN, JACOB
Owner: DIGITAL INFUZION