34 results about "Categorical variable" patented technology

In statistics, a categorical variable is a variable that can take on one of a limited, and usually fixed, number of possible values, assigning each individual or other unit of observation to a particular group or nominal category on the basis of some qualitative property. In computer science and some branches of mathematics, categorical variables are referred to as enumerations or enumerated types. Commonly, each of the possible values of a categorical variable is referred to as a level. The probability distribution associated with a random categorical variable is called a categorical distribution.

Method and system for analyzing data and creating predictive models

Inactive | US20060161403A1 | Robust and accurate data model | Facilitate model interpretation model | Mathematical models | Analogue computers for electric apparatus | Data set | Analysis data
A method and system are provided for automatically analyzing data, cleansing and normalizing the data, identifying categorical variables within the data set, eliminating co-linearities among the variables, and automatically building a statistical model.
Owner:JIANG ERIC P +6

Method, system and computer program product for visually approximating scattered data using color to represent values of a categorical variable

A method, system, and computer program product for a new data visualization tool for determining distribution weights that represent values of a categorical variable and then mapping a distinct color to each of the weights so as to visually represent the different values of the categorical variable (or data attribute) in a scatter plot. The distinct colors of a splat are based on the distribution of categorical variable values in a corresponding bin, the distribution of which is represented by a vector. The vector contains as many locations as the number of different values for the categorical variable. The value stored in each location is typically a weight or percentage for that particular value of the categorical variable. Each location in the vector is also associated with a distinct color. The coloring of a single splat with multiple colors involves the rendering of each vector by looping through each vector location, and then based on the weight stored in that location, randomly selecting the same percentage of triangles in the splat for the color associated with that vector location. A threshold is used to help reduce confusion and decrease processing time by summing all weights below the threshold and assigning to it a single neutral color. A slider or other controller can be used to vary the value of the threshold.
Owner:MORGAN STANLEY +1
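
The thresholding and coloring steps in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the "gray" neutral color, and the rounding of triangle counts are all assumptions made for the example.

```python
import random


def collapse_small_weights(weights, colors, threshold, neutral="gray"):
    """Keep (color, weight) pairs at or above the threshold and merge the
    rest into a single neutral-colored entry (hypothetical helper)."""
    kept = [(c, w) for c, w in zip(colors, weights) if w >= threshold]
    merged = sum(w for w in weights if w < threshold)
    if merged > 0:
        kept.append((neutral, merged))
    return kept


def color_splat(pairs, n_triangles, seed=0):
    """Randomly assign triangles of one splat to colors, in proportion
    to each weight. Returns {triangle_index: color}."""
    rng = random.Random(seed)
    order = list(range(n_triangles))
    rng.shuffle(order)  # random selection of triangles
    assignment = {}
    start = 0
    for color, weight in pairs:
        count = round(weight * n_triangles)  # assumed rounding rule
        for tri in order[start:start + count]:
            assignment[tri] = color
        start += count
    return assignment
```

Raising the threshold merges more categories into the neutral color, which is the effect the slider described in the abstract would have.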

Valuation of properties bordering specified geographic features

Modeling comparable properties and rendering map images with automatic valuation of properties bordering specified geographic features. A valuation model identifies and accounts for the proximity of properties to geographic features. For example, estimating property value includes accessing property data corresponding to a geographic area and performing a regression based upon the property data. The regression models the relationship between price and explanatory variables, with the explanatory variables including proximity to geographic features. Proximity may be a categorical variable wherein properties bordering the geographic feature are determined to possess the proximity characteristic. Alternative explanatory variables may incorporate different degrees of proximity.
Owner:FANNIE MAE
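
To illustrate how a categorical proximity variable enters a regression, here is a deliberately reduced sketch with a single 0/1 "borders the feature" dummy and no other explanatory variables. In that special case the ordinary least squares coefficient on the dummy equals the difference between the two group means. The function name and the sample prices are hypothetical; the model described above uses many more explanatory variables.

```python
from statistics import mean


def border_premium(prices, borders_feature):
    """Fit price = intercept + premium * dummy for a 0/1 dummy.

    With only an intercept and one dummy, the OLS solution is the mean of
    the non-bordering group (intercept) and the difference in group means
    (premium).
    """
    with_feature = [p for p, b in zip(prices, borders_feature) if b]
    without = [p for p, b in zip(prices, borders_feature) if not b]
    intercept = mean(without)
    premium = mean(with_feature) - intercept
    return intercept, premium


# Hypothetical data: two properties away from the feature, two bordering it.
intercept, premium = border_premium([300, 320, 400, 420], [0, 0, 1, 1])
```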

Method for detecting urea-doped milk based on synchronous-asynchronous two-dimensional near-infrared related spectra

The invention relates to a method for detecting urea-doped milk based on synchronous-asynchronous two-dimensional near-infrared correlation spectra. The method comprises the following steps: 1, preparing experimental pure milk and urea-doped milk; 2, scanning the near-infrared spectra of the pure milk and of the urea-doped milk; 3, calculating the normalized synchronous-asynchronous two-dimensional near-infrared correlation spectrum matrices of the pure milk and of the urea-doped milk; 4, building a discrimination model with a categorical variable matrix by multi-dimensional partial least squares; 5, scanning an unknown milk sample, calculating its synchronous-asynchronous two-dimensional near-infrared correlation spectrum matrix, and substituting the matrix into the discrimination model to determine whether the sample is doped with urea. The similarity and difference information of the analyzed system as it changes with external perturbation is fully utilized, and the influence on the model of the redundant information that arises when a synchronous or an asynchronous spectrum matrix is used alone is overcome. The method is simple and scientific, and its analysis efficiency and discrimination accuracy are high.
Owner:天津市浓昇农业科技有限公司

Mobile terminal and user interface management method thereof

The embodiment of the invention provides a mobile terminal which comprises a variable judgment module, a user class recognition module, an interface style management module and a display module, wherein the variable judgment module is used for judging whether a user class variable is valid; the user class recognition module is used for providing a user class setting interface through which a user can input user class information, and judging the class of the user in accordance with the user class information; the interface style management module is used for providing interface parameters corresponding to the class of the user in accordance with the class of the user; and the display module is used for displaying a user interface in accordance with the interface parameters provided by the interface style management module. The embodiment of the invention also provides a mobile terminal user interface management method. By using the mobile terminal and the user interface management method thereof, the user can be automatically switched to the style and settings matched with the user group in accordance with the user class information, thereby achieving better user experience.
Owner:SUZHOU XINENG ENVIRONMENTAL SCI & TECH CO LTD +1

Power transformer defect information data mining method

Active | CN105843210A | Reasonable and effective maintenance strategy | Eliminate omissions | Electric testing/monitoring | Data dredging | Data set
The invention discloses a power transformer defect data mining method. The method includes the following steps that: defect attribute screening is performed on the historical defect data set D0 of a power transformer, so that a defect data set D1 can be formed; filling or deletion is performed on defect attributes in the D1, so that noise data can be decreased; new attributes are constructed based on existing attributes of the D1, discretization is performed on continuously-valued attributes, reasonable stratification is performed on categorical attributes, and therefore, a defect data set D2 can be formed; the correlation between input attributes and target attributes is calculated, uncorrelated attributes are deleted, the remaining attributes form a defect data set D3; the association relationships between the attributes of the defect data set are calculated by using an Apriori algorithm; and effective association rules are extracted, the defect factors of the power transformer are analyzed, an association rule knowledge base can be formed. With the power transformer defect data mining method of the invention adopted, the defects of the power transformer can be mined in a multi-dimensional and multi-level manner, the association relationships between the attributes can be extracted conveniently and fast, a basis can be provided for power transformer condition evaluation, and the accuracy of condition evaluation can be improved.
Owner:TSINGHUA UNIV +1

Small sample PolSAR image classification method based on fuzzy label semantic prior

Active | CN110096994A | Ensure consistency | Avoid the problem of increased calculation | Scene recognition | Neural architectures | Small sample | Algorithm
The invention discloses a small sample PolSAR image classification method based on fuzzy label semantic prior. The method comprises the steps of preparing a PolSAR image to be classified; obtaining real polarization characteristics as input data of the network; obtaining a sampling matrix for recording the position of the training sample and a sampling label matrix for recording pixel label information at the corresponding position; utilizing the sampling label matrix to initialize and classify to build a full convolutional network FCN; sending the real-number input data, the sampling matrix, the sampling label matrix and the classification matrix to a built full convolutional network FCN for training; updating the classification matrix by utilizing the prediction result of the FCN, the sampling matrix, the sampling label matrix and the current state of the classification matrix; repeating the operation until the maximum number of iterations is met; outputting the final classification matrix; and calculating classification accuracy and a classification result graph to complete image classification. According to the invention, alternate iteration training is carried out on the deep full convolution network parameters and the label category variables, and the problem of low PolSAR classification precision under a small sample problem is solved.
Owner:XIDIAN UNIV

Data mining using variable rankings and enhanced visualization methods

Dimensional data with attributed categorical variables is mined against a continuous target with any data mining method by ranking variables. The ranked variables are used to generate a tree. A population and a target value, obtained from a top node of the tree, are stored. The top node is removed from the tree to create a new tree with a next top node. Obtaining and storing a next population and a next target value for the next top node, and removing the top node or top field to create a new tree, are repeated. The listing of sequential top node parameters is plotted on a tree cusp curve that provides a graphical user interface enabling identification of a field which affects a greatest or a least number of records, based upon a magnitude of departure of the field from a norm.
Owner:WRP IP MANAGEMENT LLC

Property value estimation with categorical location variable providing neighborhood proxy

Model-based property value estimation that implements a categorical location variable providing a neighborhood proxy. A regression models the relationship between sale price and a set of explanatory variables. These explanatory variables include a location variable that is defined at a level of granularity such that the location variable acts as a proxy for location within a neighborhood. An infill process remedies the effects of insufficient amounts of location variable data.
Owner:FANNIE MAE

Nominal attribute-based continuous type feature construction method

Inactive | CN106897776A | Strong scalability | Suitable for parallelization | Machine learning | Feature extraction | Data set
The invention discloses a nominal attribute-based continuous type feature construction method. The method comprises the following steps: 1) performing data preprocessing; 2) setting a feature construction frame according to business background knowledge; 3) generating a concrete feature construction path; 4) constructing corresponding features according to the feature construction path and generating a training set; 5) performing feature selection on the training set and constructing a prediction model; 6) saving the relevant data set and the prediction model and terminating the off-line training process; 7) performing preprocessing and feature extraction on sample data required to be subjected to on-line prediction; and 8) using the prediction model obtained through the off-line training to predict a sample. The nominal attribute-based continuous type feature construction method of the invention can be applied not only to a user-item scene but also to more general classification and regression prediction problems with nominal attributes or categorical variable features. Compared with traditional One-Hot and Dummy coding, the features generated by the method of the invention make the differences between samples more obvious and have strong interpretability.
Owner:SOUTH CHINA UNIV OF TECH

Method of pooling samples for performing a biological assay

The present invention relates to a method of pooling samples to be analyzed for a categorical variable, wherein the analysis involves a quantitative measurement of an analyte, said method of pooling samples comprising providing a pool of n samples wherein the amount of individual samples in the pool is such that the analytes in the samples are present in a molar ratio of x^0 : x^1 : x^2 : x^(n-1), and wherein x is an integer of 2 or higher representing the number of classes of the categorical variable.
Owner:HENDRIX GENETICS
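
A small sketch of why pooling in a ratio of successive powers of x can work, reading the ratio in the abstract above as x^0 : x^1 : ... : x^(n-1). It assumes, purely for illustration, that each sample's class can be expressed as an integer 0..x-1 and that the pooled quantitative measurement is the weighted sum of those integers; neither assumption comes from the abstract, and all function names are hypothetical.

```python
def pooling_ratio(n_samples, n_classes):
    """Relative amounts x^0, x^1, ..., x^(n-1) for a pool of n samples."""
    return [n_classes ** i for i in range(n_samples)]


def pooled_signal(classes, n_classes):
    """Idealized pooled measurement: each class index weighted by its ratio."""
    ratio = pooling_ratio(len(classes), n_classes)
    return sum(c * w for c, w in zip(classes, ratio))


def decode(signal, n_samples, n_classes):
    """Recover each sample's class by reading the signal as base-x digits."""
    digits = []
    for _ in range(n_samples):
        signal, digit = divmod(signal, n_classes)
        digits.append(digit)
    return digits
```

Under these assumptions every combination of classes maps to a distinct pooled value, so one measurement of the pool identifies the class of each individual sample.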

Sparse higher-order markov random field

Systems and methods are provided for identifying combinatorial feature interactions, including capturing statistical dependencies between categorical variables, with the statistical dependencies being stored in a computer readable storage medium. A model is selected based on the statistical dependencies using a neighborhood estimation strategy, with the neighborhood estimation strategy including generating sets of arbitrarily high-order feature interactions using at least one rule forest and optimizing one or more likelihood functions. A damped mean-field approach is applied to the model to obtain parameters of a Markov random field (MRF); a sparse high-order semi-restricted MRF is produced by adding a hidden layer to the MRF; indirect long-range dependencies between feature groups are modeled using the sparse high-order semi-restricted MRF; and a combinatorial dependency structure between variables is output.
Owner:NEC CORP

A default user probability prediction method based on sparse feature embedding

The invention discloses a default user probability prediction method based on sparse feature embedding. The method comprises the following steps: firstly, converting original data of a user into variable features, and then mapping multi-class variables in the variable features into a sparse matrix (similar to one-hot processing); and on the basis, mapping the sparse matrix to a probability through a basic decision tree model, and adding the probability as a feature to the model to predict a default user. Compared with the prior art, the default user probability prediction method based on sparse feature embedding has the advantages that the processing capacity of category coding is effectively improved, meanwhile, the dimension of feature space is effectively reduced in the subsequent machine learning process, and learning and processing of a machine learning model are facilitated.
Owner:华融融通(北京)科技有限公司
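
The first step in the abstract above, mapping a multi-class variable to a sparse matrix in a one-hot style, might look like the sketch below. The coordinate-list representation and the function name are assumptions for illustration; the decision-tree stage of the method is not shown.

```python
def one_hot_sparse(values):
    """Encode a multi-class variable as sparse one-hot coordinates.

    Returns the sorted category list and a list of (row, column) positions
    that hold a 1; every other cell of the matrix is implicitly 0.
    """
    categories = sorted(set(values))
    column = {c: j for j, c in enumerate(categories)}
    coords = [(i, column[v]) for i, v in enumerate(values)]
    return categories, coords
```

Storing only the non-zero coordinates keeps memory proportional to the number of rows rather than rows times categories.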

Unordered categorical variable processing method and device

The invention provides an unordered categorical variable processing method and device. The method comprises the following steps: obtaining an unordered categorical variable set, wherein the unordered categorical variable set comprises at least two categories of unordered categorical variables and the corresponding dependent variables are binary variables; for each category of unordered categorical variables in the set, computing the categorical proportion, that is, the proportion of unordered categorical variables in that category whose dependent variable value is the target value of the binary variable; and clustering the unordered categorical variable set on the basis of the categorical proportion of each category so as to obtain a plurality of unordered categorical variable subsets, wherein each subset comprises at least one category of unordered categorical variables and corresponds to an ordered categorical variable. With the method and device, grouping can be realized without relying on human experience, so that grouping is relatively efficient and the objectivity and correctness of the grouping results are further enhanced.
Owner:GUOXIN YOUE DATA CO LTD
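
The two stages described above (a per-category target proportion, then clustering on that proportion) can be sketched as follows. The abstract does not name a clustering algorithm, so the gap-based grouping here is a simple stand-in chosen for the example, and `max_gap` is a hypothetical parameter.

```python
from collections import defaultdict


def target_proportions(categories, targets, target_value=1):
    """Share of records in each category whose binary target equals target_value."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for c, t in zip(categories, targets):
        totals[c] += 1
        if t == target_value:
            hits[c] += 1
    return {c: hits[c] / totals[c] for c in totals}


def group_by_gap(proportions, max_gap=0.1):
    """Stand-in 1-D clustering: sort categories by proportion and start a
    new group whenever the jump to the next proportion exceeds max_gap."""
    ordered = sorted(proportions.items(), key=lambda kv: kv[1])
    groups = [[ordered[0][0]]]
    for (_, prev), (cat, cur) in zip(ordered, ordered[1:]):
        if cur - prev > max_gap:
            groups.append([])
        groups[-1].append(cat)
    return groups
```

Because the groups come out sorted by target proportion, their positions can serve as the levels of an ordered categorical variable.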

Method of performing a biological assay

The present invention relates to a method of pooling samples to be analyzed for a categorical variable, wherein the analysis involves a quantitative measurement of an analyte, said method of pooling samples comprising providing a pool of n samples wherein the amount of individual samples in the pool is such that the analytes in the samples are present in a molar ratio of x0:x1:x2:x(n−1), and wherein x is equal to a positive value other than 1 representing the pooling factor.
Owner:HENDRIX GENETICS

Method and apparatus for nondestructive grouping of unordered categorical variable information

Inactive | CN106096224A | Preserve the ability to distinguish | The grouping process is simple | Informatics | Special data processing applications | Computer science | Nondestructive testing
The invention discloses a method and an apparatus for nondestructive grouping of unordered categorical variable information. The method comprises the steps of calculating a weight-of-evidence value for each category value of an unordered categorical variable under the supervision of a binary target variable; and performing equal-depth grouping on the weight-of-evidence values, dividing them into M regions, and taking the M regions as the groups of the unordered categorical variable. With the disclosed method and apparatus, the grouping process is simple and easy to understand, the calculation is fast, and the ability of the unordered categorical variable to distinguish the target variable is well preserved.
Owner:深圳前海信息技术有限公司
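
A sketch of the two steps in the abstract above, under two labeled assumptions: that the "evidence weight" is the standard weight of evidence, ln(share of positives in the category / share of negatives in the category), and that the grouping is equal-depth binning over categories sorted by that value. The additive smoothing term is an extra assumption added only to avoid taking the logarithm of zero.

```python
import math
from collections import Counter


def weight_of_evidence(categories, targets, smoothing=0.5):
    """Weight of evidence per category for a 0/1 target (smoothed)."""
    pos = Counter(c for c, t in zip(categories, targets) if t == 1)
    neg = Counter(c for c, t in zip(categories, targets) if t == 0)
    cats = sorted(set(categories))
    pos_total = sum(pos.values()) + smoothing * len(cats)
    neg_total = sum(neg.values()) + smoothing * len(cats)
    return {
        c: math.log(((pos[c] + smoothing) / pos_total)
                    / ((neg[c] + smoothing) / neg_total))
        for c in cats
    }


def equal_depth_groups(woe, m):
    """Sort categories by WoE and cut them into at most m equally sized groups."""
    ordered = sorted(woe, key=woe.get)
    size = math.ceil(len(ordered) / m)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]
```

Categories with similar weight of evidence land in the same group, which is one way to reduce the number of levels while keeping most of the variable's discriminating power.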

Evaluation method for flavor quality of meat flavor

Inactive | CN105842396A | Evaluation results are stable | The evaluation results are consistent | Material analysis | Evaluation result | Qualitative analysis
The invention provides an evaluation method for the flavor quality of meat flavor. The method evaluates, on line, the flavor quality of meat flavor simulating the flavor of mutton, beef, pork, chicken, duck or the like by using partial least squares discriminant analysis. The method comprises the following steps: screening the independent variables; qualitative and quantitative analysis of the independent variables; establishing independent variable pattern information; assigning categorical variable values to the samples of a training set; etc. The evaluation method can carry out on-line detection to obtain results. Conventional sensory evaluation of meat flavor yields unstable results because evaluators differ in physiological and psychological perception and because the standards for the selected sensory attributes and the evaluation bases are not unified. The evaluation method of the invention, by contrast, makes the evaluation results of meat flavor consistent and stable.
Owner:BEIJING TECHNOLOGY AND BUSINESS UNIVERSITY +2

Information processing apparatus, information processing method, and program

Provided is an information processing apparatus that tests independence between multiple variables, the information processing apparatus including: a discretization section that discretizes at least one numerical variable on the basis of at least one categorical variable, when the categorical variable and the numerical variable are included in at least two dependent variables in a graphical model and a set of conditional variables serving as conditions of independence between the two variables; and a test execution section that executes a test for conditional independence between the two variables by using the categorical variable and a discrete variable which is obtained by discretizing the numerical variable.
Owner:SONY CORP

System and method for data analysis and presentation of data

A system and method for data analysis and presentation of data are provided. The system for data analysis and presentation of data includes a memory configured to receive a plurality of data sets. The system also includes a processing subsystem operatively coupled to the memory and configured to determine a plurality of properties of the plurality of data sets, to analyse a categorical variable of the plurality of data sets based on the plurality of properties of the plurality of data sets, to identify one or more custom rules based on the analysed categorical variable, to interpret the identified one or more custom rules, to identify a graph based on the one or more custom rules, to identify one or more textual insights based on the one or more custom rules and the identified graph, and to present the identified graph and the one or more textual insights.
Owner:MARLABS

Black box optimization over categorical variables

Pending | US20220284305A1 | Substantial beneficial technical effect | Improve sampling efficiency | Biostatistics | Inference methods | Algorithm | Engineering
A black box evaluator is accessed and a surrogate machine learning model that provides estimates for the optimization of categorical values for the black box evaluator is generated, the surrogate machine learning model being based upon observations from previous executions of the black box evaluator. The black box evaluator is optimized by selecting, by an acquisition function executing on a computing device, a new candidate point for the categorical values. The black box evaluator is executed with the new candidate point for the categorical values.
Owner:IBM CORP

Method and system for optimizing operator's mobile service resources

The invention discloses a method and a system for optimizing the mobile service resources of an operator. The method includes: compiling statistics on the historical dialing data of the operator's customers, the dialing data being a continuous variable; converting the continuous variable into a discrete characteristic variable through chi-square analysis; taking whether the customer has opened a mobile service as a binary classification variable, and establishing a C4.5 decision tree model of the characteristic variable and the classification variable, wherein, in the decision tree model, the information gain ratio corresponding to each segmentation is calculated and the segmentation threshold with the largest information gain ratio is selected as the optimal segmentation threshold for the attribute; calculating the value of the classification variable according to the decision tree model to obtain a prediction of whether the customer subscribes to the mobile service; and optimizing the operator's mobile service resources according to the prediction result. With the technical solution provided by the invention, the customer's demand for mobile service can be obtained efficiently from the customer's current dialing behavior, so as to realize optimized deployment of the operator's mobile service resources.
Owner:CHINA TELECOM CORP LTD
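
The threshold selection step mentioned above, choosing the split with the largest information gain ratio, can be sketched for a single numeric attribute and a binary label. This is a simplified illustration rather than a full C4.5 implementation: it tries each observed value as a candidate threshold, needs at least two distinct values, and omits the chi-square discretization stage. The function names are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(
        (labels.count(v) / n) * math.log2(labels.count(v) / n)
        for v in set(labels)
    )


def gain_ratio(values, labels, threshold):
    """Information gain of the split value <= threshold, divided by split info."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    gain = entropy(labels) - w_left * entropy(left) - w_right * entropy(right)
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return gain / split_info


def best_threshold(values, labels):
    """Candidate thresholds are all observed values except the largest."""
    candidates = sorted(set(values))[:-1]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))
```

Dividing by the split information penalizes very unbalanced splits, which is the main difference between the gain ratio and plain information gain.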

Standardizing and abstraction system of records measured by a plurality of physical quantities' measuring devices

Processing system for standardization and abstraction of registers measured by measuring devices (1), which comprise processing means of measured registers (5) received for generating processed registers (5a) storable in a storage database (6) of processed registers, characteristics schemes (7a) of separate models of measuring device (1), which comprise at least one module (9) and at least one submodule (10) assigned to the said module (9) on the basis of its functioning mode (17a), each module (9) being allotted to at least one memory position of a measuring device model (1), associated with a single measuring point and assigned to at least one category map (8a) in which a submodule (10) is related to a category variable, with assignment tables (16) of category variables, mapping means (11) with assignment means (12) and transformation means (13) provided for transforming values of each measured register (1a) into values of processed registers (5a), expressed in the pre-established equivalent unit of measurement assigned to the corresponding submodule (10). The assignment means (12) are designed for assigning to each measured register (1a) read in the memory position (18a) corresponding to a submodule (10), the category variable with which the submodule is related in the assignment table (16).
Owner:IBERIA TECH INTEGRATED SOLUTIONS S L U

Method and system for facilitating combining categorical and numerical variables in machine learning

One embodiment of the subject matter combines categorical and numerical variables in machine learning based on a difference table for categorical variables. During operation, the system performs the following steps. First, the system receives an input value of a categorical variable. Next, the system determines a prediction based on the input value of the categorical variable, a most likely value of the categorical variable, and a difference table for the categorical variable, where the most likely value of the categorical variable is based on a plurality of values of the categorical variable and where the difference table for the categorical variable comprises a number for each pair of values of the categorical variable. Subsequently, the system produces a result that indicates the prediction.
Owner:PRIEDITIS ARMAND

Data Mining Method for Power Transformer Defect Information

The invention discloses a power transformer defect data mining method. The method includes the following steps that: defect attribute screening is performed on the historical defect data set D0 of a power transformer, so that a defect data set D1 can be formed; filling or deletion is performed on defect attributes in the D1, so that noise data can be decreased; new attributes are constructed based on existing attributes of the D1, discretization is performed on continuously-valued attributes, reasonable stratification is performed on categorical attributes, and therefore, a defect data set D2 can be formed; the correlation between input attributes and target attributes is calculated, uncorrelated attributes are deleted, the remaining attributes form a defect data set D3; the association relationships between the attributes of the defect data set are calculated by using an Apriori algorithm; and effective association rules are extracted, the defect factors of the power transformer are analyzed, an association rule knowledge base can be formed. With the power transformer defect data mining method of the invention adopted, the defects of the power transformer can be mined in a multi-dimensional and multi-level manner, the association relationships between the attributes can be extracted conveniently and fast, a basis can be provided for power transformer condition evaluation, and the accuracy of condition evaluation can be improved.
Owner:TSINGHUA UNIV +1

A real-time prediction method for urban road traffic accident risk

Inactive | CN104732075B | In line with traffic characteristics | Improve accuracy | Forecasting | Traffic accident | Traffic flow
The invention provides a real-time prediction method for urban road traffic accident risk. For each observation object in the observation set, the geometric linear data, the basic historical traffic flow data for the n minutes before a traffic accident, and the historical weather condition data are extracted and used to calculate the traffic flow characteristic parameters for the n minutes before the accident. The traffic flow characteristic parameters and the weather condition data are converted into categorical variables, giving each a level and the distribution probability of that level. A real-time prediction model of urban road traffic accidents based on the Poisson distribution is then established and calibrated using the levels of the traffic flow characteristic parameters and weather condition data and the distribution probabilities of those levels. To predict the traffic accident risk of a target object, it is only necessary to calculate, in real time, the object's traffic flow characteristic parameters and weather condition data, convert them into categorical variable levels with their distribution probabilities, and apply the calibrated formula.
Owner:SUN YAT SEN UNIV

Method for detection of milk mixed with urea based on synchronous-asynchronous two-dimensional near-infrared correlation spectroscopy

The invention relates to a method for detecting urea-doped milk based on synchronous-asynchronous two-dimensional near-infrared correlation spectra. The method comprises the following steps: 1, preparing experimental pure milk and urea-doped milk; 2, scanning the near-infrared spectra of the pure milk and of the urea-doped milk; 3, calculating the normalized synchronous-asynchronous two-dimensional near-infrared correlation spectrum matrices of the pure milk and of the urea-doped milk; 4, building a discrimination model with a categorical variable matrix by multi-dimensional partial least squares; 5, scanning an unknown milk sample, calculating its synchronous-asynchronous two-dimensional near-infrared correlation spectrum matrix, and substituting the matrix into the discrimination model to determine whether the sample is doped with urea. The similarity and difference information of the analyzed system as it changes with external perturbation is fully utilized, and the influence on the model of the redundant information that arises when a synchronous or an asynchronous spectrum matrix is used alone is overcome. The method is simple and scientific, and its analysis efficiency and discrimination accuracy are high.
Owner:天津市浓昇农业科技有限公司