N sets of feature vectors are generated from a set of observation vectors that are indicative of a pattern to be recognized. At least one of the sets of feature vectors differs from at least one other of the sets and is preselected to contain at least some complementary information with regard to that other set. The N sets of feature vectors are combined to obtain an optimized set of feature vectors that best represents the pattern. The combination is performed via either a weighted likelihood combination scheme or a rank-based state-selection scheme, preferably in accordance with an equation set forth herein. An apparatus suitable for performing the method is described, and implementation in a computer program product is also contemplated. The invention is applicable to any type of pattern recognition problem where robustness is important, such as recognition of speech, handwriting, or optical characters under challenging conditions.
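
The two combination schemes named above can be illustrated with a minimal sketch. It is not the equation referred to in the text, which is not reproduced in this passage. It assumes that each of the N feature sets has already been scored against a common list of candidate states, giving one log-likelihood per state per feature set. The function names, the requirement that the weights sum to 1, and the choice to keep each state's best rank across feature sets are all assumptions made for illustration.

```python
def weighted_likelihood(log_likes, weights):
    """Combine per-state log-likelihoods from N feature sets.

    log_likes[n][s] is the log-likelihood of state s under feature set n.
    Assumption: the weights are non-negative and sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    num_states = len(log_likes[0])
    return [
        sum(w * ll[s] for w, ll in zip(weights, log_likes))
        for s in range(num_states)
    ]


def ranks(scores):
    """Return the rank of each state; rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda s: -scores[s])
    result = [0] * len(scores)
    for position, state in enumerate(order):
        result[state] = position
    return result


def rank_based_selection(log_likes):
    """For each state, keep its best (lowest) rank over the N feature sets.

    Assumption: 'best rank wins' is one plausible reading of rank-based
    state selection; the source may define the rule differently.
    """
    per_set_ranks = [ranks(ll) for ll in log_likes]
    num_states = len(log_likes[0])
    return [min(r[s] for r in per_set_ranks) for s in range(num_states)]


# Hypothetical scores: N = 2 feature sets, 3 candidate states.
example_log_likes = [
    [-1.0, -2.0, -3.0],  # feature set 0 prefers state 0
    [-2.5, -0.5, -3.0],  # feature set 1 prefers state 1
]
combined = weighted_likelihood(example_log_likes, [0.5, 0.5])
best_state = max(range(len(combined)), key=lambda s: combined[s])
selected_ranks = rank_based_selection(example_log_likes)
```

With equal weights, the combined scores favor state 1, because feature set 1 supports it more strongly than feature set 0 supports state 0. Under the rank-based rule, states 0 and 1 both receive rank 0, since each is the top choice of one feature set.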