479 results about "Vector quantization" patented technology

Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. It was originally used for data compression. It works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms.
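
As a rough illustration of the description above, the following sketch trains a small codebook with a k-means style loop and then encodes each vector as the index of its nearest centroid. The function names, the toy data, and the choice of evenly spaced training vectors as the initial codebook are assumptions made for this example; they are not taken from any of the patents listed here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, codebook):
    """Index of the codebook entry closest to the vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def train_codebook(vectors, k, iterations=20):
    """k-means style training: assign vectors to centroids, then re-average."""
    # Assumption: start from k evenly spaced training vectors.
    codebook = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest_index(v, codebook)].append(v)
        for i, group in enumerate(groups):
            if group:  # an empty group keeps its previous centroid
                dim = len(group[0])
                codebook[i] = [sum(v[d] for v in group) / len(group)
                               for d in range(dim)]
    return codebook


def encode(vectors, codebook):
    """Replace each vector by the index of its nearest centroid."""
    return [nearest_index(v, codebook) for v in vectors]


def decode(indices, codebook):
    """Recover the (lossy) approximation by looking the indices up."""
    return [codebook[i] for i in indices]


points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
codebook = train_codebook(points, k=2)
indices = encode(points, codebook)
print(indices)  # [0, 0, 0, 1, 1, 1]
```

Only the indices and the codebook need to be stored or transmitted, which is where the compression comes from.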

Face feature analysis for automatic lipreading and character animation

A face feature analysis which begins by generating multiple face feature candidates, e.g., eyes and nose positions, using an isolated frame face analysis. Then, a nostril tracking window is defined around a nose candidate and tests are applied to the pixels therein based on percentages of skin color area pixels and nostril area pixels to determine whether the nose candidate represents an actual nose. Once actual nostrils are identified, the size, separation and contiguity of the actual nostrils are determined by projecting the nostril pixels within the nostril tracking window. A mouth window is defined around the mouth region and mouth detail analysis is then applied to the pixels within the mouth window to identify inner mouth and teeth pixels and therefrom generate an inner mouth contour. The nostril position and inner mouth contour are used to generate a synthetic model head. A direct comparison is made between the inner mouth contour generated and that of a synthetic model head and the synthetic model head is adjusted accordingly. Vector quantization algorithms may be used to develop a codebook of face model parameters to improve processing efficiency. The face feature analysis is suitable regardless of noise, illumination variations, head tilt, scale variations and nostril shape.
Owner:ALCATEL-LUCENT USA INC

Method and apparatus for automatic speech segmentation into phoneme-like units for use in speech processing applications, and based on segmentation into broad phonetic classes, sequence-constrained vector quantization and hidden-markov-models

Inactive | US6208967B1 | Straightforward and inexpensive | Digital computer details | Biological models | Automatic speech segmentation | Spoken language
For machine segmenting of speech, first utterances from a database of known spoken words are classified and segmented into three broad phonetic classes (BPC): voiced, unvoiced, and silence. Next, using preliminary segmentation positions as anchor points, sequence-constrained vector quantization is used for further segmentation into phoneme-like units. Finally, exact tuning to the segmented phonemes is done through hidden Markov modelling, and after training a diphone set is composed for further usage.
Owner:U S PHILIPS CORP

Speaker verification system

A text-independent speaker verification system utilizes mel frequency cepstral coefficients analysis in the feature extraction blocks, template modeling with vector quantization in the pattern matching blocks, an adaptive threshold and an adaptive decision verdict and is implemented in a stand-alone device using less powerful microprocessors and smaller data storage devices than used by comparable systems of the prior art.
Owner:SAUDI ARABIAN OIL CO

Recursive and trellis-based feedback reduction for MIMO-OFDM with rate-limited feedback

Techniques are provided for reducing feedback while maintaining performance in a MIMO-OFDM system. The disclosed techniques employ finite-rate feedback methods that use vector quantization compression. The disclosed methods / techniques generally involve: receiving a plurality of symbols from a plurality of sub-carriers at a receiver; selecting a plurality of indices of codewords corresponding to a codebook of pre-coding weighting matrices for the sub-carriers based on vector quantization compression of the codewords; and transmitting the selected indices over a wireless channel to a transmitter. Finite state vector quantization feedback makes use of a finite state vector quantizer (FSVQ), which is a recursive vector quantizer (VQ) with a finite number of states. In finite state vector quantization feedback, optimal precoding matrices (beamforming vectors) are selected sequentially across subcarriers. In a trellis-based feedback method, the optimal precoding matrices are selected at the same time for all subcarriers by searching for the optimum choice of matrices along a trellis using the Viterbi algorithm (dynamic programming).
Owner:UNIV OF CONNECTICUT
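
The trellis search mentioned in the abstract above can be pictured with a generic Viterbi sketch. The per-subcarrier cost table and the transition table (which codewords may follow which) are invented for illustration; the abstract does not specify them, and a real system would derive the costs from channel estimates.

```python
def viterbi_select(costs, allowed_next):
    """Pick one codeword index per subcarrier with minimum total cost.

    costs[n][k]     -- hypothetical distortion of codeword k on subcarrier n
    allowed_next[k] -- codeword indices that may follow k on the next subcarrier
    """
    num_codewords = len(costs[0])
    best = list(costs[0])  # best accumulated cost ending in each state
    back = []              # back-pointers, one list per trellis stage
    for n in range(1, len(costs)):
        new_best = [float("inf")] * num_codewords
        pointers = [None] * num_codewords
        for prev in range(num_codewords):
            for nxt in allowed_next[prev]:
                candidate = best[prev] + costs[n][nxt]
                if candidate < new_best[nxt]:
                    new_best[nxt] = candidate
                    pointers[nxt] = prev
        best = new_best
        back.append(pointers)
    # Trace back from the cheapest final state.
    state = min(range(num_codewords), key=lambda k: best[k])
    total = best[state]
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, total


# Four codewords; each state allows two successors (itself or the next one),
# so one feedback bit per subcarrier would suffice after the first index.
allowed_next = {k: [k, (k + 1) % 4] for k in range(4)}
costs = [
    [0.1, 0.9, 0.8, 0.7],
    [0.9, 0.2, 0.8, 0.7],
    [0.9, 0.8, 0.1, 0.3],
]
path, total = viterbi_select(costs, allowed_next)
print(path)  # [0, 1, 2]
```

Because every stage is evaluated jointly, the search returns the cheapest sequence that the transition table permits, rather than committing to one subcarrier at a time.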

Object recognizer and detector for two-dimensional images using bayesian network based classifier

A system and method for determining a classifier to discriminate between two classes—object or non-object. The classifier may be used by an object detection program to detect presence of a 3D object in a 2D image (e.g., a photograph or an X-ray image). The overall classifier is constructed of a sequence of classifiers (or “sub-classifiers”), where each such classifier is based on a ratio of two graphical probability models (e.g., Bayesian networks). A discrete-valued variable representation at each node in a Bayesian network by a two-stage process of tree-structured vector quantization is discussed. The overall classifier may be part of an object detector program that is trained to automatically detect many different types of 3D objects (e.g., human faces, airplanes, cars, etc.). Computationally efficient statistical methods to evaluate overall classifiers are disclosed. The Bayesian network-based classifier may also be used to determine if two observations (e.g., two images) belong to the same category. For example, in case of face recognition, the classifier may determine whether two photographs are of the same person. A method to provide lighting correction or adjustment to compensate for differences in various lighting conditions of input images is disclosed as well. As per the rules governing abstracts, the content of this abstract should not be used to construe the claims in this application.
Owner:CARNEGIE MELLON UNIV

Systems and methods for image pattern recognition

Systems and methods for image pattern recognition comprise digital image capture and encoding using vector quantization (“VQ”) of the image. A vocabulary of vectors is built by segmenting images into kernels and creating vectors corresponding to each kernel. Images are encoded by creating a vector index file having indices that point to the vectors stored in the vocabulary. The vector index file can be used to reconstruct an image by looking up vectors stored in the vocabulary. Pattern recognition of candidate regions of images can be accomplished by correlating image vectors to a pre-trained vocabulary of vector sets comprising vectors that correlate with particular image characteristics. In virtual microscopy, the systems and methods are suitable for rare-event finding, such as detection of micrometastasis clusters, tissue identification, such as locating regions of analysis for immunohistochemical assays, and rapid screening of tissue samples, such as histology sections arranged as tissue microarrays (TMAs).
Owner:LEICA BIOSYST IMAGING
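
A toy version of the vocabulary and vector index file described above might look like the following. The kernel size, the way the vocabulary is built (distinct kernels of a training image), and the nearest-match rule are simplifying assumptions for this sketch, not details from the patent.

```python
def split_into_kernels(image, size):
    """Cut a 2-D list of pixels into flattened size x size kernels, row-major."""
    kernels = []
    for top in range(0, len(image), size):
        for left in range(0, len(image[0]), size):
            kernels.append(tuple(image[top + r][left + c]
                                 for r in range(size) for c in range(size)))
    return kernels


def build_vocabulary(training_image, size):
    """Assumption: the vocabulary is the list of distinct training kernels."""
    vocabulary = []
    for kernel in split_into_kernels(training_image, size):
        if kernel not in vocabulary:
            vocabulary.append(kernel)
    return vocabulary


def encode_image(image, size, vocabulary):
    """Vector index file: one vocabulary index per kernel (nearest match)."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(vocabulary)),
                key=lambda i: distance(kernel, vocabulary[i]))
            for kernel in split_into_kernels(image, size)]


def reconstruct(index_file, vocabulary, width, height, size):
    """Rebuild an image by looking each index up in the vocabulary."""
    image = [[0] * width for _ in range(height)]
    blocks_per_row = width // size
    for n, index in enumerate(index_file):
        top = (n // blocks_per_row) * size
        left = (n % blocks_per_row) * size
        for r in range(size):
            for c in range(size):
                image[top + r][left + c] = vocabulary[index][r * size + c]
    return image


clean = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [9, 9, 0, 0],
         [9, 9, 0, 0]]
noisy = [[1, 0, 8, 9],
         [0, 0, 9, 9],
         [9, 9, 0, 1],
         [9, 8, 0, 0]]
vocabulary = build_vocabulary(clean, size=2)
index_file = encode_image(noisy, 2, vocabulary)
print(index_file)  # [0, 1, 1, 0]
restored = reconstruct(index_file, vocabulary, width=4, height=4, size=2)
```

In this toy case the noisy image decodes back to the clean one, because each noisy kernel is closest to the vocabulary vector it was derived from.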

Prototype waveform phase modeling for a frequency domain interpolative speech codec system

A system and method are provided that employ a frequency domain interpolative CODEC system for low bit rate coding of speech. The system comprises a linear prediction (LP) front end adapted to process an input signal and provide LP parameters, which are quantized and encoded over predetermined intervals and used to compute an LP residual signal. Also provided are an open loop pitch estimator adapted to process the LP residual signal, a pitch quantizer, and a pitch interpolator, which together provide a pitch contour within the predetermined intervals. Also provided is a signal processor responsive to the LP residual signal and the pitch contour and adapted to perform the following: provide a voicing measure, where the voicing measure characterizes a degree of voicing of the input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals; extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals; normalize the PW by a gain value of the PW; encode a magnitude of the PW; and separate stationary and nonstationary components of the PW using a low complexity alignment process and a filtering process that introduce no delay. The ratio of the energy of the nonstationary component of the PW to that of the stationary component of the PW is averaged across 5 subbands to compute the nonstationarity measure as a frequency dependent vector entity. A measure of the degree of voicing of the residual is also computed using open loop pitch gain, pitch variance, relative signal power, PW correlation and PW nonstationarity in low frequency subbands. The nonstationarity measure and voicing measure are encoded using a 6-bit spectrally weighted vector quantization scheme using a codebook partitioned based on a voiced / unvoiced decision. At the decoder, a stationary component of PW is reconstructed as a weighted combination of the previous PW phase vector, a random phase perturbation and a fixed phase vector obtained from a voiced pitch pulse.
Owner:HUGHES NETWORK SYST

Inverse scan, coefficient, inverse quantization and inverse transform system and method

Presented herein are inverse scan, coefficient prediction, inverse quantization and inverse transform system(s) and method(s). In one embodiment, there is presented a system for converting scanned quantized frequency coefficients to pixel domain data. The system comprises a first circuit and a second circuit. The first circuit converts scanned quantized frequency coefficients to pixel domain data, wherein the scanned quantized frequency coefficients encode video data in accordance with a first encoding standard. The second circuit converts scanned quantized frequency coefficients to pixel domain data, wherein the scanned quantized frequency coefficients encode video data in accordance with a second encoding standard.
Owner:AVAGO TECH WIRELESS IP SINGAPORE PTE

Method And System For Semantically Segmenting Scenes Of A Video Sequence

A shot-based video content analysis method and system is described for providing automatic recognition of logical story units (LSUs). The method employs vector quantization (VQ) to represent the visual content of a shot, following which a shot clustering algorithm is employed together with automatic determination of merging and splitting events. The method provides an automated way of performing the time-consuming and laborious process of organising and indexing increasingly large video databases such that they can be easily browsed and searched using natural query structures.
Owner:BRITISH TELECOMM PLC

Vector quantization with a non-structured codebook for audio compression

According to one embodiment of the invention, a multistage vector list quantizer comprises a first stage quantizer to select candidate first stage codewords from a plurality of first stage codewords, a reference table memory storing a set of second stage codewords for each first stage codeword, and a second stage codebook constructor to generate a reduced complexity second stage codebook that is the union of sets corresponding to the candidate first stage codewords selected by the first stage quantizer.
Owner:XVD TECH HLDG LTD IRELAND
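
The construction of the reduced second stage codebook described above can be sketched as follows. The codebooks, the reference table, and the final joint search over (first stage, second stage) pairs are hypothetical; the abstract only states that the reduced codebook is the union of the sets belonging to the selected first stage candidates.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize_two_stage(x, first_stage, second_stage, reference_table,
                       num_candidates):
    # First stage: keep the num_candidates closest first stage codewords.
    ranked = sorted(range(len(first_stage)),
                    key=lambda i: squared_distance(x, first_stage[i]))
    candidates = ranked[:num_candidates]
    # Reduced second stage codebook: union of the per-candidate index sets.
    reduced = sorted(set().union(*(reference_table[i] for i in candidates)))
    # Assumption: pick the pair whose sum best approximates x.
    best_pair, best_error = None, float("inf")
    for i in candidates:
        for j in reduced:
            approx = tuple(a + b
                           for a, b in zip(first_stage[i], second_stage[j]))
            error = squared_distance(x, approx)
            if error < best_error:
                best_pair, best_error = (i, j), error
    return candidates, reduced, best_pair


first_stage = [(0, 0), (10, 0), (0, 10), (10, 10)]
second_stage = [(-1, -1), (-1, 1), (1, -1), (1, 1), (0, 0)]
# Hypothetical reference table: second stage indices stored per first stage codeword.
reference_table = {0: [3, 4], 1: [1, 4], 2: [2, 4], 3: [0, 4]}

candidates, reduced, best_pair = quantize_two_stage(
    (9, 2), first_stage, second_stage, reference_table, num_candidates=2)
print(candidates, reduced, best_pair)  # [1, 3] [0, 1, 4] (1, 1)
```

Here the second stage search covers three codewords instead of five; with realistic codebook sizes the saving would be correspondingly larger.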

Method and device of multi-resolution vector quantization for audio encoding and decoding

The present invention provides a method and device of multi-resolution vector quantization (VQ) for audio encoding and decoding, used to analyze the audio signal at multiple resolutions and quantize the resulting vectors. Said method for encoding audio comprises the steps of: adaptively filtering an input audio signal so as to obtain time-frequency filter coefficients and output a filtered signal; dividing vectors of the filtered signal in a time-frequency plane so as to obtain a vector combination; selecting the vectors to be quantized; quantizing the selected vectors and calculating a quantization residual error; and transmitting quantized coding information as side information of an encoder to an audio decoder to quantize and encode the quantization residual error. The invention can adaptively filter the audio signal and adjust the resolutions of time and frequency. The result of the multi-resolution time-frequency analysis can be utilized effectively by reorganizing the filter coefficients according to different organizing policies. VQ may improve encoding efficiency as well as allow quantizing precision to be controlled simply and optimized.
Owner:BEIJING E WORLD TECH

Image processing device and image processing method

An image processing device including an acquiring section configured to acquire quantization matrix parameters from an encoded stream in which the quantization matrix parameters defining a quantization matrix are set within a parameter set which is different from a sequence parameter set and a picture parameter set, a setting section configured to set, based on the quantization matrix parameters acquired by the acquiring section, a quantization matrix which is used when inversely quantizing data decoded from the encoded stream, and an inverse quantization section configured to inversely quantize the data decoded from the encoded stream using the quantization matrix set by the setting section.
Owner:SONY CORP

Character recognition system and method

A system and method for translating a written document into a computer readable document by recognizing the characters written on the document aim at recognizing typed or printed, especially hand-printed or handwritten, characters in the various fields of a form. Provided with a pixel representation of the written document, the method allows translating a written document into a computer readable document by i) identifying at least one field in the pixel representation of the document; ii) segmenting each field so as to yield at least one segmented symbol; iii) applying a character recognition method on each segmented symbol; and iv) assigning a computer-readable code to each recognized character resulting from the character recognition method. The character recognition method includes doing a vector quantization on each segmented symbol, and doing a vector classification using a vector base. A learning base is also created based on the optimal elliptic separation method. The system and method according to the present invention make it possible to achieve a substitution rate of near zero.
Owner:IMDS SOFTWARE

Fast optimal linear approximation of the images of variably illuminated solid objects for recognition

An efficient computation of low-dimensional linear subspaces that optimally contain the set of images that are generated by varying the illumination impinging on the surface of a three-dimensional object for many different relative positions of that object and the viewing camera. The matrix elements of the spatial covariance matrix for an object are calculated for an arbitrary pre-determined distribution of illumination conditions. The maximum complexity is reduced for the model by approximating any pair of normal-vector and albedo from the set of all such pairs of albedo and normals with the centers of the clusters that are the result of the vector quantization of this set. For an object, a viewpoint-independent covariance matrix whose complexity is large, but practical, is constructed and diagonalized off-line. A viewpoint-dependent covariance matrix is computed from the viewpoint-independent diagonalization results and is diagonalized online in real time.
Owner:NEC CORP

System and method for reduced codebook vector quantization

The present invention extends the generalized Lloyd algorithm (GLA) for vector quantizer (VQ) codebook improvement and codebook design to a new linearly-constrained generalized Lloyd algorithm (LCGLA). The LCGLA improves the quality of VQ codebooks, by forming the codebooks from linear combinations of a reduced set of base codevectors. The present invention enables a principled approach for compressing texture images in formats compatible with various industry standards. New, more flexible compressed texture image formats are also made possible with the present invention. The present invention enhances signal compression by improving traditional VQ approaches through the integrated application of linear constraints on the multiple pattern and signal prototypes that represent a single pattern or block of signal samples.
Owner:CISCO SYST CANADA

Method and system for voiceprint recognition based on vector quantization

The invention discloses a method and a system for voiceprint recognition based on vector quantization, which have high recognition performance and noise immunity, are effective in recognition, require little modeling data, and are quick in judgment and low in complexity. The method includes the steps of: acquiring audio signals; preprocessing the audio signals; extracting audio signal characteristic parameters by using MFCC (mel-frequency cepstrum coefficient) parameters, wherein the order of the MFCC ranges from 12 to 16; template training, namely using the LBG (Linde, Buzo and Gray) clustering algorithm to set up a codebook for each speaker and storing the codebooks in an audio database to be used as the audio templates of the speakers; and voiceprint recognizing, namely comparing the acquired characteristic parameters of the audio signals to be recognized with the speaker audio templates set up in the audio database and judging according to a weighted Euclidean distance measure, such that the speaker whose template gives the audio characteristic vector X of the speaker to be recognized the minimum average distance measure is taken as the recognized speaker.
Owner:LIAONING UNIVERSITY OF TECHNOLOGY
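
The recognition step described above (minimum average distance between the test features and each speaker's codebook) can be sketched as below. The codebooks are assumed to be already trained (for instance with LBG), the two-dimensional features stand in for the 12 to 16 MFCC coefficients, and the squared form of the weighted distance and the weight values are assumptions of this example.

```python
def weighted_distance(x, c, weights):
    """Weighted squared Euclidean distance (squared form is an assumption)."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, c))


def average_distortion(features, codebook, weights):
    """Mean distance from each feature vector to its nearest codeword."""
    return sum(min(weighted_distance(x, c, weights) for c in codebook)
               for x in features) / len(features)


def identify(features, codebooks, weights):
    """Return the speaker whose codebook gives the smallest average distortion."""
    return min(codebooks,
               key=lambda name: average_distortion(features, codebooks[name],
                                                   weights))


# Hypothetical pre-trained codebooks and toy 2-D feature vectors.
codebooks = {
    "speaker_a": [(0.0, 0.0), (1.0, 1.0)],
    "speaker_b": [(5.0, 5.0), (6.0, 6.0)],
}
weights = (1.0, 0.5)
test_features = [(0.1, 0.2), (0.9, 1.1)]
print(identify(test_features, codebooks, weights))  # speaker_a
```

A verification system, as opposed to identification, would additionally compare the winning distortion against a threshold before accepting the speaker.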

Method and apparatus for speech encoding and decoding by sinusoidal analysis and waveform encoding with phase reproducibility

A speech encoding method and apparatus in which an input speech signal is divided in terms of blocks or frames as encoding units and encoded in terms of the encoding units, whereby explosive and fricative consonants can be impeccably reproduced, while there is an attenuation of the occurrence of foreign sounds being generated at a transient portion between voiced (V) and unvoiced (UV) portions, so that the speech with high clarity devoid of “stuffed” feeling may be produced. The encoding apparatus includes a first encoding unit for finding residuals of linear predictive coding (LPC) of an input speech signal for performing harmonic coding and a second encoding unit for encoding the input speech signal by waveform coding. The first encoding unit and the second encoding unit are used for encoding a voiced (V) portion and an unvoiced (UV) portion of the input signal, respectively. Code excited linear prediction (CELP) encoding employing vector quantization by a closed loop search of an optimum vector using an analysis-by-synthesis method is used for the second encoding unit. A corresponding decoding method and apparatus is also provided.
Owner:SONY CORP

Generation method of vector quantization code book

The invention provides a generation method of a vector quantization code book. In the method, a global optimization method based on a random relaxation technology is introduced; and while iteratively updating the code book every time, random disturbance is generated and added to a corresponding code word, thus local convergence is effectively avoided during the process of updating the code book. The method can further rationally optimize a code book structure and bit positions of the code word according to the inherent characteristic of channel statistical distribution of a wireless communication system and the requirement on orthogonality of user scheduling in a base station in an MIMO system on the selected users. In addition, a method for expanding the code book is also introduced in order to improve the robustness of the code book in a multi-channel statistical distribution environment, and the size of the code book can be flexibly adjusted according to specific conditions. As the antenna number of the base station and downlink user equipment is greatly increased based on the demand of the development of the future communication system, the method can generate reserved interfaces for the future code book, and can achieve higher quantization and system performance with lower complexity even though the method is used in high-dimensional vector quantization.
Owner:SHANGHAI JIAO TONG UNIV +1

Image processing device and image processing method

Provided is an image processing device including an acquiring section configured to acquire quantization matrix parameters from an encoded stream in which the quantization matrix parameters defining a quantization matrix are set within a parameter set which is different from a sequence parameter set and a picture parameter set, a setting section configured to set, based on the quantization matrix parameters acquired by the acquiring section, a quantization matrix which is used when inversely quantizing data decoded from the encoded stream, and an inverse quantization section configured to inversely quantize the data decoded from the encoded stream using the quantization matrix set by the setting section.
Owner:SONY CORP

Object Recognizer and Detector for Two-Dimensional Images Using Bayesian Network Based Classifier

A system and method for determining a classifier to discriminate between two classes: object or non-object. The classifier may be used by an object detection program to detect presence of a 3D object in a 2D image (e.g., a photograph or an X-ray image). The overall classifier is constructed of a sequence of classifiers (or “sub-classifiers”), where each such classifier is based on a ratio of two graphical probability models (e.g., Bayesian networks). A discrete-valued variable representation at each node in a Bayesian network by a two-stage process of tree-structured vector quantization is discussed. The overall classifier may be part of an object detector program that is trained to automatically detect many different types of 3D objects (e.g., human faces, airplanes, cars, etc.). Computationally efficient statistical methods to evaluate overall classifiers are disclosed. The Bayesian network-based classifier may also be used to determine if two observations (e.g., two images) belong to the same category. For example, in case of face recognition, the classifier may determine whether two photographs are of the same person. A method to provide lighting correction or adjustment to compensate for differences in various lighting conditions of input images is disclosed as well. As per the rules governing abstracts, the content of this abstract should not be used to construe the claims in this application.
Owner:GOOGLE LLC

Apparatus and method for video sensor-based human activity and facial expression modeling and recognition

An apparatus and method for human activity and facial expression modeling and recognition are based on feature extraction techniques from time sequential images. The human activity modeling includes determining principal components of depth and / or binary shape images of human activities extracted from video clips. Independent Component Analysis (ICA) representations are determined based on the principal components. Features are determined through Linear Discriminant Analysis (LDA) based on the ICA representations. A codebook is determined using vector quantization. Observation symbol sequences in the video clips are determined, and human activities are learned using the Hidden Markov Model (HMM) based on status transition and an observation matrix.
Owner:SAMSUNG ELECTRONICS CO LTD +1

Dynamic hand gesture recognition method based on self incremental learning of hidden Markov model

The invention discloses a dynamic hand gesture recognition method based on self incremental learning of the hidden Markov model. The method includes the following steps that firstly, a hand gesture is detected and tracked; secondly, feature extraction and vector quantization are carried out; thirdly, model training and hand gesture recognition are performed; fourthly, incremental learning is performed. According to the dynamic hand gesture recognition method based on self incremental learning of the hidden Markov model, dynamic hand gesture operation by a hand gesture operator in front of a camera can be accurately recognized, the recognized hand gesture data can be applied to incremental learning of an original model to adjust model parameters. Thus, the original model can dynamically adapt to novel variation generated in future hand gesture data and high adaptability to adjustment and alternation of the hand gesture data can be achieved. Thus, the model can be adjusted continuously along with the hand gesture data and better robustness on the unknown hand gesture recognition in the future is achieved.
Owner:NANJING UNIV

Reference database and method for determining spectra using measurements from an LED color sensor, and method of generating a reference database

To determine spectra, integrated multiple illuminant measurements from a non-fully illuminant populated color sensor may be converted into a fully populated spectral curve using a reference database. The reference database is partitioned into a plurality of clusters, and an appropriate centroid is determined for each cluster by, for example, vector quantization. Training samples that form the reference database may be assigned to the clusters by comparing the Euclidean distance between the centroids and the sample under consideration, and assigning each sample to the cluster having the centroid with the shortest Euclidean distance. When all training samples have been assigned, the resulting structure is stored as the reference database. When reconstructing the spectra for new measurements from the sensor, the Euclidean distances between actual color samples under measurement and each cluster centroid are measured. The spectra are then reconstructed using only the training samples from the cluster corresponding to the shortest Euclidean distance, resulting in improved speed and accuracy.
Owner:XEROX CORP

Energy based split vector quantizer employing signal representation in multiple transform domains

The invention relates to representation of one and multidimensional signal vectors in multiple nonorthogonal domains and design of Vector Quantizers that can be chosen among these representations. There is presented a Vector Quantization technique in multiple nonorthogonal domains for both waveform and model based signal characterization. An iterative codebook accuracy enhancement algorithm, applicable to both waveform and model based Vector Quantization in multiple nonorthogonal domains, which yields further improvement in signal coding performance, is disclosed. Further, Vector Quantization in multiple nonorthogonal domains is applied to speech and exhibits clear performance improvements of reconstruction quality for the same bit rate compared to existing single domain Vector Quantization techniques. The technique disclosed herein can be easily extended to several other one and multidimensional signal classes.
Owner:UNIV OF CENT FLORIDA RES FOUND INC +1

Split-vector quantization for speech signal involving out-of-sequence regrouping of sub-vectors

A method and apparatus for compressing and decompressing an audio signal. The apparatus comprises an input for receiving an audio signal derived from a spoken utterance, the audio signal being contained in a plurality of successive data frames. A data frame holding a certain portion of the audio signal is processed to generate a feature vector including a plurality of discrete elements characterizing at least in part the portion of the audio signal encompassed by the frame, the elements being organized in a certain sequence. The apparatus makes use of a compressor unit having a grouping processor for grouping elements of the feature vector into a plurality of sub-vectors on the basis of a certain grouping scheme, at least one of the sub-vectors including a plurality of elements from the feature vector, the plurality of elements being out of sequence relative to the certain sequence. The plurality of sub-vectors are then quantized by applying a vector quantization method.
Owner:RPX CLEARINGHOUSE
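
The grouping idea described above can be illustrated with a small split-vector quantizer. The grouping scheme, the per-sub-vector codebooks, and the feature values are hypothetical; the abstract does not say which elements are grouped together.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_encode(feature, grouping, codebooks):
    """Quantize each sub-vector with its own codebook; return one index each."""
    indices = []
    for positions, codebook in zip(grouping, codebooks):
        sub_vector = tuple(feature[p] for p in positions)
        indices.append(min(range(len(codebook)),
                           key=lambda i: squared_distance(sub_vector,
                                                          codebook[i])))
    return indices


def split_decode(indices, grouping, codebooks, length):
    """Scatter the chosen codewords back to their original positions."""
    restored = [0.0] * length
    for index, positions, codebook in zip(indices, grouping, codebooks):
        for position, value in zip(positions, codebook[index]):
            restored[position] = value
    return restored


# Hypothetical grouping: elements 0 and 3 share a sub-vector, and the second
# sub-vector takes elements 2 and 1 out of their original order.
grouping = [(0, 3), (2, 1), (4, 5)]
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (2.0, 2.0)],
    [(0.0, 0.0), (3.0, 3.0)],
]
feature = (0.9, 2.1, 1.8, 1.2, 0.2, -0.1)

indices = split_encode(feature, grouping, codebooks)
print(indices)  # [1, 1, 0]
restored = split_decode(indices, grouping, codebooks, len(feature))
print(restored)  # [1.0, 2.0, 2.0, 1.0, 0.0, 0.0]
```

Grouping correlated elements together, regardless of their position in the feature vector, lets each small codebook capture more of the signal structure than a purely sequential split would.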

Run Length Encoding in VLIW Architecture

A computer implemented method of video data encoding generates a mask having one bit corresponding to each spatial frequency coefficient of a block during quantization. The bit state of the mask depends upon whether the corresponding quantized spatial frequency coefficient is zero or non-zero. Runs of zero-valued quantized spatial frequency coefficients are determined from the mask using a left most bit detect instruction and are run length encoded. The mask is generated using a look up table to map the scan order of quantization to the zig-zag order of run length encoding. Variable length coding and inverse quantization optionally take place within the run length encoding loop.
Owner:TEXAS INSTR INC
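
The mask-based run detection described above can be mimicked in a few lines. This sketch assumes the coefficients are already in zig-zag order (the look up table remapping is omitted) and uses Python's `int.bit_length` as a stand-in for a hardware left most bit detect instruction; the (run, value) output format is also an assumption.

```python
def build_mask(coefficients):
    """One bit per coefficient, most significant bit first: 1 means non-zero."""
    mask = 0
    for value in coefficients:
        mask = (mask << 1) | (1 if value != 0 else 0)
    return mask


def run_length_encode(coefficients):
    """Return (zero_run, value) pairs found by scanning the mask, not the data."""
    n = len(coefficients)
    mask = build_mask(coefficients)
    pairs = []
    position = 0  # index of the first coefficient not yet consumed
    while mask:
        # Left most set bit gives the index of the next non-zero coefficient.
        leftmost = n - mask.bit_length()
        pairs.append((leftmost - position, coefficients[leftmost]))
        position = leftmost + 1
        mask &= ~(1 << (n - 1 - leftmost))  # clear the bit just handled
    return pairs  # trailing zeros are implied by the end of the block


block = [5, 0, 0, -2, 0, 1, 0, 0]
print(bin(build_mask(block)))    # 0b10010100
print(run_length_encode(block))  # [(0, 5), (2, -2), (1, 1)]
```

The loop runs once per non-zero coefficient rather than once per coefficient, which is the benefit of locating runs through the mask.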

Video key frame extraction method based on color quantization and clusters

Inactive | CN103065153A | Low type dependency | Avoid redundant selection | Character and pattern recognition | Television systems | Canonical quantization | Frame difference
The invention discloses a video key frame extraction method based on color quantization and clusters. The method comprises the steps of loading video data flow; conducting single frame scanning on video flow; conducting the color quantization on obtained frame images, and extracting main color features of the frame images going through quantization; calculating similarity of adjacent frames so as to obtain adjacent frame difference; conducting shot boundary detection according to the adjacent frame difference; conducting shot classification on intersected shots and extracting a representative frame of each shot; and conducting compression clustering on the sequence of the representative frames so as to obtain a key frame sequence. According to the method, the color quantization is conducted on the single frame images so that main color of the images is extracted, frame difference calculation is conducted through the cluster feature similarity calculation method based on color features of the clusters so that the shot boundary detection is realized, and finally clustering according to the compression ratio is conducted on the extracted representative frames. Because the whole process has low dependency on video formats and types, the method has good universality and adaptability, is simple in calculation and low in space consumption, and can effectively avoid the phenomenon of key frame selection redundancy, control the number and quality of the key frames, and realize control of the video compression ratio.
Owner:SOUTHWEST UNIV OF SCI & TECH

Sound source separation using convolutional mixing and a priori sound source knowledge

Sound source separation, without permutation, using convolutional mixing independent component analysis based on a priori knowledge of the target sound source is disclosed. The target sound source can be a human speaker. The reconstruction filters used in the sound source separation take into account the a priori knowledge of the target sound source, such as an estimate of the spectra of the target sound source. The filters may be generally constructed based on a speech recognition system. Matching the words of the dictionary of the speech recognition system to a reconstructed signal indicates whether proper separation has occurred. More specifically, the filters may be constructed based on a vector quantization codebook of vectors representing typical sound source patterns. Matching the vectors of the codebook to a reconstructed signal indicates whether proper separation has occurred. The vectors may be linear prediction vectors, among others.
Owner:MICROSOFT TECH LICENSING LLC