680 results about "Matrix multiplication" patented technology

In mathematics, matrix multiplication or matrix product is a binary operation that produces a matrix from two matrices with entries in a field, or, more generally, in a ring or even a semiring. The matrix product is designed for representing the composition of linear maps that are represented by matrices. Matrix multiplication is thus a basic tool of linear algebra, and as such has numerous applications in many areas of mathematics, as well as in applied mathematics, statistics, physics, economics, and engineering.
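The binary operation described above can be sketched directly from its definition — each entry of the product is a sum of products over the shared dimension:

```python
def matmul(A, B):
    """Multiply an m×n matrix A by an n×p matrix B (lists of lists)."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must match"
    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for k in range(n):        # k runs over the shared dimension
            for j in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```

The ordering of the loops does not change the result, only the memory-access pattern — a point several of the patents below exploit.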

Quantum-key-distribution-network-based mobile encryption system and communication method thereof

Active · CN102196425A · Reduce computation · Guaranteed distribution security · Security arrangement · Plaintext · Telecommunications
The invention discloses a quantum-key-distribution-network-based mobile encryption system and a communication method thereof. The method comprises the following steps: a mobile terminal registers in the network; the registered mobile terminal connects to any quantum terminal through a key-updating interface and requests the download of a certain volume of shared keys from the quantum terminal; after the mobile terminal downloads the keys, the quantum terminal transmits a quantum centralized control station address to the mobile terminal for updating, and the mobile terminal takes the centralized control station at that address as its calling centralized control station; once the calling centralized control station is determined, the mobile terminal submits a ciphertext to it; the calling centralized control station re-encrypts the ciphertext and transmits it to a called centralized control station; the called centralized control station re-encrypts the ciphertext again and transmits it to the called user; and the communication finishes after the called user decrypts the re-encrypted ciphertext to obtain the plaintext. Because the encryption does not require multiple matrix multiplication operations, its computational load is greatly reduced; at the same time, the highest level of key distribution security is ensured by the key distribution of the quantum key distribution network.
Owner:QUANTUMCTEK +1
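The hop-by-hop re-encryption chain in the abstract can be illustrated with a one-time-pad sketch; the key names and the use of XOR for each hop are assumptions for illustration, not details from the patent:

```python
import secrets

def xor_bytes(a, b):
    """One-time-pad style combination of equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def relay_reencrypt(ciphertext, key_in, key_out):
    """Hop-by-hop re-encryption: strip the incoming pad key and apply the
    outgoing one (both hypothetically drawn from QKD-shared key pools)."""
    return xor_bytes(xor_bytes(ciphertext, key_in), key_out)

plaintext = b"hello"
k_mobile = secrets.token_bytes(len(plaintext))  # mobile <-> calling station
k_trunk  = secrets.token_bytes(len(plaintext))  # calling <-> called station
k_called = secrets.token_bytes(len(plaintext))  # called station <-> called user

c1 = xor_bytes(plaintext, k_mobile)             # mobile terminal encrypts
c2 = relay_reencrypt(c1, k_mobile, k_trunk)     # calling station re-encrypts
c3 = relay_reencrypt(c2, k_trunk, k_called)     # called station re-encrypts
assert xor_bytes(c3, k_called) == plaintext     # called user decrypts
```

Each hop uses only XOR operations on key material, which is why no matrix multiplication is needed along the path.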

Matrix multiplication in a vector processing system

The present invention is directed to a system and method for multiplication of matrices in a vector processing system. Partial products are obtained by dot multiplication of vector registers containing multiple copies of elements of a first matrix and vector registers containing values from rows of a second matrix. The dot products obtained from this dot multiplication are subsequently added to vector registers which make up a product matrix. In an embodiment of the present invention, each matrix may be divided into submatrices to facilitate the rapid and efficient multiplication of large matrices, which is done in parts by computing partial products of each submatrix. The matrix multiplication performed by the present invention avoids rounding errors as it is bit-by-bit compatible with conventional matrix multiplication methods.
Owner:APPLE INC
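The partial-product scheme above — a vector register filled with copies of one element of the first matrix, elementwise-multiplied with a row of the second and accumulated into a row of the product — can be sketched with NumPy arrays standing in for vector registers (the function name is ours, not the patent's):

```python
import numpy as np

def vector_matmul(A, B):
    """Emulate the vector-register scheme: broadcast one element of A
    into a 'register', elementwise-multiply with a row of B, and
    accumulate into the corresponding row of the product."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must match"
    C = np.zeros((m, p))
    for i in range(m):
        for k in range(n):
            # register holding p copies of A[i,k], times row k of B
            C[i] += np.full(p, A[i, k]) * B[k]
    return C

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
assert np.array_equal(vector_matmul(A, B), A @ B)
```

Each C[i][j] accumulates the same terms in the same order as the conventional triple loop, which is the source of the bit-by-bit compatibility claimed in the abstract.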

Efficient matrix multiplication on a parallel processing device

The present invention enables efficient matrix multiplication operations on parallel processing devices. One embodiment is a method for mapping CTAs to result matrix tiles for matrix multiplication operations. Another embodiment is a second method for mapping CTAs to result tiles. Yet other embodiments are methods for mapping the individual threads of a CTA to the elements of a tile for result tile computations, source tile copy operations, and source tile copy and transpose operations. The present invention advantageously enables result matrix elements to be computed on a tile-by-tile basis using multiple CTAs executing concurrently on different streaming multiprocessors, enables source tiles to be copied to local memory to reduce the number of accesses to global memory when computing a result tile, and enables coalesced read operations from global memory as well as write operations to local memory without bank conflicts.
Owner:NVIDIA CORP
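The tile-by-tile computation can be sketched sequentially in Python, with each (bi, bj) pair playing the role of one CTA and explicit copies standing in for the staging of source tiles into local (shared) memory; the tile size and function name are illustrative assumptions:

```python
import numpy as np

def tiled_matmul(A, B, tile=2):
    """Compute C = A @ B tile by tile. Each (bi, bj) pair mimics one CTA;
    the .copy() calls mimic staging source tiles into local memory."""
    n = A.shape[0]                     # assume square, divisible by tile
    C = np.zeros((n, n))
    for bi in range(0, n, tile):
        for bj in range(0, n, tile):
            acc = np.zeros((tile, tile))           # per-CTA accumulator
            for bk in range(0, n, tile):
                a_tile = A[bi:bi+tile, bk:bk+tile].copy()  # "shared memory"
                b_tile = B[bk:bk+tile, bj:bj+tile].copy()
                acc += a_tile @ b_tile
            C[bi:bi+tile, bj:bj+tile] = acc
    return C

rng = np.random.default_rng(0)
A, B = rng.random((4, 4)), rng.random((4, 4))
assert np.allclose(tiled_matmul(A, B), A @ B)
```

Staging means each source tile is read from "global" memory once per tile pass rather than once per output element, which is the access-reduction the abstract describes.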

Method and apparatus for computing matrix transformations

A method and apparatus for performing matrix transformations including multiply-add operations and byte shuffle operations on packed data in a processor. In one embodiment, two rows of content byte elements are shuffled to generate a first and second packed data respectively including elements of a first two columns and of a second two columns. A third packed data including sums of products is generated from the first packed data and elements from two rows of a matrix by a multiply-add instruction. A fourth packed data including sums of products is generated from the second packed data and elements from two more rows of the matrix by another multiply-add instruction. Corresponding sums of products of the third and fourth packed data are then summed to generate two rows of a product matrix. Elements of the product matrix may be generated in an order that further facilitates a second matrix multiplication.
Owner:INTEL CORP
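The shuffle-then-multiply-add pattern can be sketched for a 2×2 product. The `pmaddwd` helper emulates a multiply-add instruction that sums adjacent pairs of products (reminiscent of x86's PMADDWD); the exact packing below is an illustrative assumption, not the patent's claimed layout:

```python
def pmaddwd(x, y):
    """Emulate a packed multiply-add: multiply elementwise, then sum
    adjacent pairs of products."""
    prods = [a * b for a, b in zip(x, y)]
    return [prods[i] + prods[i + 1] for i in range(0, len(prods), 2)]

X = [[1, 2], [3, 4]]
Y = [[5, 6], [7, 8]]

# "shuffle" step: interleave the two rows of Y column by column, so each
# adjacent pair of lanes holds one column of Y
y_packed = [Y[0][0], Y[1][0], Y[0][1], Y[1][1]]
for i in range(2):
    x_packed = [X[i][0], X[i][1]] * 2          # row i of X, duplicated
    row = pmaddwd(x_packed, y_packed)          # sums of products = row i of X·Y
    print(row)                                 # [19, 22] then [43, 50]
```

Each multiply-add produces a whole row of the product, so a 2×2 multiplication needs only two such instructions after the shuffles.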

Mapping the threads of a CTA to the elements of a tile for efficient matrix multiplication

The present invention enables efficient matrix multiplication operations on parallel processing devices. One embodiment is a method for mapping CTAs to result matrix tiles for matrix multiplication operations. Another embodiment is a second method for mapping CTAs to result tiles. Yet other embodiments are methods for mapping the individual threads of a CTA to the elements of a tile for result tile computations, source tile copy operations, and source tile copy and transpose operations. The present invention advantageously enables result matrix elements to be computed on a tile-by-tile basis using multiple CTAs executing concurrently on different streaming multiprocessors, enables source tiles to be copied to local memory to reduce the number of accesses to global memory when computing a result tile, and enables coalesced read operations from global memory as well as write operations to local memory without bank conflicts.
Owner:NVIDIA CORP

System, transmitter, method, and computer program product for utilizing an adaptive preamble scheme for multi-carrier communication systems

A system, transmitter, method, and computer program product apply a performance-improvement characteristic, such as phase rotation or power allocation, to both a known preamble and the data payload of a transmitted data packet, such that existing multi-carrier receivers are capable of decoding the data payload with the performance-improvement characteristic applied. The characteristic is applied by vector-matrix multiplication of both the preamble and the data payload.
Owner:NOKIA CORP
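A per-subcarrier phase rotation expressed as a diagonal matrix illustrates why applying the same characteristic to preamble and payload is transparent to a legacy receiver; the variable names and the channel-estimation step are illustrative assumptions:

```python
import numpy as np

n_subcarriers = 4
rng = np.random.default_rng(1)
preamble = rng.choice([-1.0, 1.0], n_subcarriers)  # known training symbols
payload  = rng.choice([-1.0, 1.0], n_subcarriers)  # data symbols (illustrative)

# performance-improvement characteristic: one phase rotation per subcarrier,
# expressed as a diagonal matrix so applying it is a vector-matrix multiply
phases = np.exp(1j * rng.uniform(0, 2 * np.pi, n_subcarriers))
R = np.diag(phases)

tx_preamble = preamble @ R
tx_payload  = payload @ R

# a legacy receiver estimates the channel from the rotated preamble and
# equalizes the payload with it, cancelling the rotation transparently
channel_est = tx_preamble / preamble
recovered = tx_payload / channel_est
assert np.allclose(recovered, payload)
```

Because the rotation is folded into the channel estimate, the receiver needs no knowledge that a characteristic was applied.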

High speed and efficient matrix multiplication hardware module

A matrix multiplication module and matrix multiplication method are provided that use a variable number of multiplier-accumulator (MAC) units, based on how many data elements of the matrices are available or needed for processing at a particular point or stage in the computation. As more data elements become available or are needed, more MAC units are used to perform the necessary multiplication and addition operations. To multiply an N×M matrix by an M×N matrix, the maximum number of MAC units used is 2*N−1. The number of MAC units starts at one and increases by two at each computation stage, where each stage begins with the reading of data elements for a new row of the left-hand (first) matrix; the sequence of MAC-unit counts across the stages is therefore {1, 3, 5, . . . , 2*N−1}. For the multiplication of two 8×8 matrices, the performance is 16 floating-point operations per clock cycle; for an FPGA running at 100 MHz, this is 1.6 giga floating-point operations per second. The performance increases with higher clock frequencies and with larger matrices when FPGA resources permit. Very large matrices are partitioned into smaller blocks to fit in the FPGA resources, and results from the multiplication of sub-matrices are combined to form the final result.
Owner:HARRIS CORP
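The stage-by-stage MAC-unit schedule and the quoted throughput figure can be checked with a few lines (the function name is ours):

```python
def mac_schedule(n):
    """Active MAC units at each of the n computation stages (one stage
    per new row of the left-hand matrix): 1, 3, 5, ..., 2n-1."""
    return [2 * s + 1 for s in range(n)]

print(mac_schedule(8))                 # [1, 3, 5, 7, 9, 11, 13, 15]
assert max(mac_schedule(8)) == 2 * 8 - 1

# 16 floating-point operations per cycle at a 100 MHz clock:
print(16 * 100e6 / 1e9, "GFLOPS")      # 1.6 GFLOPS, matching the abstract
```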

Matrix multiplication acceleration method for supporting variable blocks

The invention discloses a matrix multiplication acceleration method that supports variable blocks. The steps are as follows: a matrix A and a matrix B are input; the subblock size Si is determined according to the dimensions of A and B; A is partitioned by rows into subblocks of size Si×N, and B is partitioned by columns into subblocks of size N×Si; a DMA descriptor is generated for the data required by each subblock multiplication, and all the DMA descriptors are assembled into a DMA descriptor list; for each subblock multiplication, the required data is read from main memory according to the descriptor list, the multiplication is performed by the processing-unit chain of the matrix multiplication accelerator, and the result is written back to main memory via DMA. The method has the advantages that variable block sizes are supported, the number of processing units employed can be adjusted according to the block size, and the multiplication of non-uniform matrices is accelerated with high efficiency.
Owner:NAT UNIV OF DEFENSE TECH
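The row-block/column-block partitioning can be sketched sequentially, with each block pair standing in for one descriptor-driven DMA transfer plus one pass through the accelerator chain; the function name and block size are illustrative assumptions:

```python
import numpy as np

def blocked_matmul(A, B, si):
    """Partition A into si×N row blocks and B into N×si column blocks,
    then multiply block pairs independently."""
    m, n = A.shape
    _, p = B.shape
    C = np.zeros((m, p))
    for i in range(0, m, si):            # row block of A (size <= si×N)
        for j in range(0, p, si):        # column block of B (size <= N×si)
            a_blk = A[i:i+si, :]         # data one DMA descriptor would fetch
            b_blk = B[:, j:j+si]
            C[i:i+si, j:j+si] = a_blk @ b_blk  # written back "via DMA"
    return C

rng = np.random.default_rng(2)
A, B = rng.random((6, 5)), rng.random((5, 6))
assert np.allclose(blocked_matmul(A, B, 3), A @ B)
```

Because the slicing tolerates ragged edges, Si need not divide the matrix dimensions evenly, which mirrors the variable-block support claimed above.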

FPGA (Field Programmable Gate Array)-based general matrix fixed-point multiplier and calculation method thereof

The invention discloses an FPGA (Field Programmable Gate Array)-based general matrix fixed-point multiplier whose internal structure consists of a control module, a conversion module, an operation module and a storage module. The control module generates control signals according to the dimensions of the matrix to be operated on. The conversion module performs conversion between fixed-point and floating-point numbers during operation. The operation module reads operands from the storage module and the conversion module, performs fixed-point multiplication and fixed-point accumulation, and stores the result in the storage module. The storage module caches the input and result matrix data, provides an interface compatible with the bus signals, and allows access by other components on the bus. The design fully exploits the high efficiency of fixed-point calculation in hardware; its operation structure converts and operates on data simultaneously to improve overall speed, and multiple such multipliers can be used in parallel. Fixed-point multiplication of matrices of arbitrary dimension is thus supported while extremely high calculation efficiency is maintained; compared with matrix multiplication using floating-point numbers, calculation efficiency is greatly improved.
Owner:上海碧帝数据科技有限公司