Data mining techniques are provided which are effective and efficient for discovering useful information from an amorphous collection or
data set of records. For example, the present invention provides for mining data from several or many records to discover interesting associations between qualitative text entries, and covariances between quantitative
numerical data, in those records. Although not limited thereto, the invention has particular application and
advantage when the data is of a type such as clinical, pharmacogenomic, forensic, police and financial records, which are characterized by many varied entries, since the problem is then said to be one of
“high dimensionality,” which has posed mathematical and technical difficulties for researchers. This is especially true when considering strong negative associations and negative
covariance, i.e., between items of data which so rarely come together that their concurrence is never seen in any
record, yet the fact that this absence is unexpected is potentially of great interest.
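
By way of illustration only, the notion of a negative association that is never observed can be sketched as follows. This is a minimal sketch that assumes a simple independence baseline (expected co-occurrences = count of A × count of B / number of records); it is not the measure used by the invention. The function names, the `min_expected` threshold, and the sample records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_table(records):
    """Map each item pair to (observed, expected) co-occurrence counts.

    Assumption: 'expected' is the count predicted if the two items were
    statistically independent. This is an illustrative baseline only.
    """
    n = len(records)
    # Number of records in which each item appears.
    item_counts = Counter(item for rec in records for item in set(rec))
    # Number of records in which each pair of items appears together.
    pair_counts = Counter(
        pair for rec in records for pair in combinations(sorted(set(rec)), 2)
    )
    table = {}
    for a, b in combinations(sorted(item_counts), 2):
        expected = item_counts[a] * item_counts[b] / n
        table[(a, b)] = (pair_counts.get((a, b), 0), expected)
    return table


def unseen_but_expected(records, min_expected=1.0):
    """Return pairs never seen together although chance predicts they would be."""
    return {
        pair: expected
        for pair, (observed, expected) in cooccurrence_table(records).items()
        if observed == 0 and expected >= min_expected
    }


# Hypothetical records: two items that are each common but never co-occur.
records = [{"drug_A"}] * 5 + [{"effect_X"}] * 4 + [{"other"}]
flagged = unseen_but_expected(records)
```

Here the pair of `drug_A` and `effect_X` is flagged because independence would predict about two joint records while none is observed. With many distinct entries per record, the number of candidate pairs (and higher-order combinations) grows rapidly, which is the high-dimensionality difficulty noted above.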