
3375 results about "Operand" patented technology

In mathematics an operand is the object of a mathematical operation, i.e., it is the object or quantity that is operated on.

Speculative execution and rollback

One embodiment of the present invention sets forth a technique for speculatively issuing instructions to allow a processing pipeline to continue to process some instructions during rollback of other instructions. A scheduler circuit issues instructions for execution assuming that, several cycles later, when the instructions reach multithreaded execution units, that dependencies between the instructions will be resolved, resources will be available, operand data will be available, and other conditions will not prevent execution of the instructions. When a rollback condition exists at the point of execution for an instruction for a particular thread group, the instruction is not dispatched to the multithreaded execution units. However, other instructions issued by the scheduler circuit for execution by different thread groups, and for which a rollback condition does not exist, are executed by the multithreaded execution units. The instruction incurring the rollback condition is reissued after the rollback condition no longer exists.
Owner:NVIDIA CORP

SIMD processor and addressing method

A single instruction, multiple data (SIMD) processor including a plurality of addressing register sets, used to flexibly calculate effective operand source and destination memory addresses is disclosed. Two or more address generators calculate effective addresses using the register sets. Each register set includes a pointer register, and a scale register. An address generator forms effective addresses from a selected register set's pointer register and scale register; and an offset. For example, the effective memory address may be formed by multiplying the scale value by an offset value and summing the pointer and the scale value multiplied by the offset value.
Owner:AVAGO TECH INT SALES PTE LTD
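
The address calculation described in this abstract (pointer plus scale multiplied by offset) can be illustrated with a minimal Python sketch. The `RegisterSet` class, the `effective_address` function, and the example values are hypothetical names chosen for illustration; the abstract does not specify register widths or an instruction format.

```python
from dataclasses import dataclass


@dataclass
class RegisterSet:
    """Hypothetical model of one addressing register set."""
    pointer: int  # base address held in the pointer register
    scale: int    # multiplier held in the scale register


def effective_address(reg_set: RegisterSet, offset: int) -> int:
    """Form an address as pointer + scale * offset, as the abstract describes."""
    return reg_set.pointer + reg_set.scale * offset


# Example: step through four 4-byte elements starting at 0x1000.
regs = RegisterSet(pointer=0x1000, scale=4)
addresses = [effective_address(regs, i) for i in range(4)]
print([hex(a) for a in addresses])
```

Keeping the scale in its own register means the same offset sequence can address elements of different sizes by changing only the scale value.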

Processor-cache system and method

A digital system is provided. The digital system includes an execution unit, a level-zero (L0) memory, and an address generation unit. The execution unit is coupled to a data memory containing data to be used in operations of the execution unit. The L0 memory is coupled between the execution unit and the data memory and configured to receive a part of the data in the data memory. The address generation unit is configured to generate address information for addressing the L0 memory. Further, the L0 memory provides at least two operands of a single instruction from the part of the data to the execution unit directly, without loading the at least two operands into one or more registers, using the address information from the address generation unit.
Owner:SHANGHAI XINHAO MICROELECTRONICS

Processing architecture having a compare capability

According to the invention, a processing core that executes a compare instruction is disclosed. The processing core includes a register file, comparison logic, decode logic, and a store path. Included in the register file are a number of general-purpose registers. The general-purpose registers include a first input operand register, a second input operand register and an output operand register. Comparison logic is coupled to the register file. The comparison logic tests for at least two of the following relationships: less than, equal to, greater than and no valid relationship. The decode logic selects the output operand register from the plurality of general-purpose registers. The store path extends between the comparison logic and the selected output operand register.
Owner:ORACLE INT CORP

Multiprocessor with each processor element accessing operands in loaded input buffer and forwarding results to FIFO output buffer

An enhanced memory algorithmic processor ("MAP") architecture for multiprocessor computer systems comprises an assembly that may comprise, for example, field programmable gate arrays ("FPGAs") functioning as the memory algorithmic processors. The MAP elements may further include an operand storage, intelligent address generation, on board function libraries, result storage and multiple input/output ("I/O") ports. The MAP elements are intended to augment, not necessarily replace, the high performance microprocessors in the system and, in a particular embodiment of the present invention, they may be connected through the memory subsystem of the computer system resulting in it being very tightly coupled to the system as well as being globally accessible from any processor in a multiprocessor computer system.
Owner:SRC COMP

Method and system of valuing transformation between extensible markup language (XML) documents

A method and system of valuing transformation between XML documents. Specifically, one embodiment of the present invention discloses a method for calculating a transformation cost for a transformation operation that transforms a source node in a source XML document to a target node in a target XML document. A data loss and potential data loss is measured for the transformation operation. Also, the operands in the transformation operation are scaled to measure their impact on the data loss and potential data loss. A transformation cost is calculated by considering the data loss, potential data loss, and scaling.
Owner:HEWLETT-PACKARD ENTERPRISE DEV LP

Processes, circuits, devices, and systems for scoreboard and other processor improvements

A method of instruction issue (3200) in a microprocessor (1100, 1400, or 1500) with execution pipestages (E1, E2, etc.) and that executes a producer instruction Ip and issues a candidate instruction I0 (3245) having a source operand dependency on a destination operand of instruction Ip. The method includes issuing the candidate instruction I0 as a function (1720, 1950, 1958, 3235) of a pipestage EN(I0) of first need by the candidate instruction for the source operand, a pipestage EA(Ip) of first availability of the destination operand from the producer instruction, and the one execution pipestage E(Ip) currently associated with the producer instruction. A method of data forwarding (3300) in a microprocessor (1100, 1400, or 1500) having a pipeline (1640) having pipestages (E1, E2, etc.), wherein the method includes scoreboarding information E(Ip) (1710, 2220) to represent a changing pipestage position for data from a producer instruction Ip, and selectively forwarding (2310, 3360) the data from the pipestage having the represented pipestage position E(Ip), based on the information (1710), to a receiving pipestage (1682, E1) for a dependent instruction. Wireless communications devices (1010, 1010′, 1040, 1050, 1060, 1080), systems, circuits, devices, scoreboards (1700.N), processes and methods of operation, processes and articles of manufacture (FIGS. 13-16), are also disclosed.
Owner:TEXAS INSTR INC

Native copy instruction for file-access processor with copy-rule-based validation

A copy instruction executed by a functional-level instruction-set computing (FLIC) processor copies a variable-length data block from one resource to another resource through a cross-bar switch. Resources include general-purpose registers, input, output, and execution buffers, DRAM, SRAM, and other memory. A copy-with-validate instruction has an operand pointing to a first rule in an immediate rule table. The first rule controls validation of a first data-item in the data being copied. Validation includes range and equality checking of the data-item. The value of the data-item or the current offset can be written to a register. A format field in the rule indicates the size of the data-item, or the size is read from the data-item for variable-size formats. The current offset is incremented by the size. The next data-item is validated by a next rule, and other rules in the immediate table control validation of other data-items in the data block.
Owner:RPX CORP

Configurable system for performing repetitive actions and method for configuring and operating same

In some embodiments, a data processing system including an operation unit including circuitry configurable to perform any selected one of a number of operations on data (e.g., audio data) and a configuration unit configured to assert configuration information to configure the operation unit to perform the selected operation. When the operation includes matrix multiplication of a data vector and a matrix whose coefficients exhibit symmetry, the configuration information preferably includes bits that determine signs of all but magnitudes of only a subset of the coefficients. When the operation includes successive addition and subtraction operations on operand pairs, the configuration information preferably includes bits that configure the operation unit to operate in an alternating addition/subtraction mode to perform successive addition and subtraction operations on each pair of data values of a sequence of data value pairs. In some embodiments, the configuration information includes bits that configure the operation unit to operate in a non-consecutive (e.g., butterfly or bit-reversed) addressing mode to access memory locations having consecutive addresses in a predetermined non-consecutive sequence. Other aspects are audio encoders and decoders including any embodiment of, and configuration units and operation units for use in, any embodiment of the system, and methods performed during operation of any embodiment of the system or configuration or operation unit thereof.
Owner:NVIDIA CORP
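
The bit-reversed addressing mode mentioned in this abstract visits consecutive memory locations in a fixed non-consecutive order. The sketch below shows the standard bit-reversal permutation in Python; the function names are hypothetical, and it assumes a power-of-two block size, which the abstract does not state.

```python
def bit_reversed_index(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(size: int) -> list[int]:
    """Return the order in which addresses 0..size-1 are visited.

    Assumes `size` is a power of two (an assumption of this sketch).
    """
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    bits = size.bit_length() - 1
    return [bit_reversed_index(i, bits) for i in range(size)]


order = bit_reversed_order(8)
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```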

Multiplier accumulator circuits

A multiply-accumulate (MAC) unit, having a first binary operand X, a second binary operand Y, a third binary operand, Booth recode logic for generating a plurality of partial products from said first and second operands, a Wallace tree adder for reducing the partial products and for selectively arithmetically combining the reduced partial products with said third operand, a final adder for generating a final sum, and a saturation circuitry for selectively rounding or saturating said final sum is provided. A dual MAC unit is also provided.
Owner:TEXAS INSTR INC
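
The arithmetic behavior of a multiply-accumulate with saturation can be sketched in a few lines of Python. This models only the numeric result (third operand plus X times Y, clamped to a signed range); it does not model the Booth recoding, the Wallace tree, or rounding, and the 32-bit default width and the function names are assumptions made for illustration.

```python
def saturate(value: int, bits: int) -> int:
    """Clamp `value` to the signed two's-complement range of `bits` bits."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return max(lowest, min(highest, value))


def mac(x: int, y: int, acc: int, bits: int = 32) -> int:
    """Return acc + x * y, saturated to a signed `bits`-bit result."""
    return saturate(acc + x * y, bits)


print(mac(3, 4, 5))                    # 17
print(mac(2**15, 2**15, 2**31 - 1))    # clamps at 2**31 - 1
```

Saturating instead of wrapping keeps an overflowing sum at the nearest representable value, which is the usual motivation for saturation circuitry in signal-processing hardware.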

Method and apparatus for multi-function arithmetic

A multiplier capable of performing signed and unsigned scalar and vector multiplication is disclosed. The multiplier is configured to receive signed or unsigned multiplier and multiplicand operands in scalar or packed vector form. An effective sign for the multiplier and multiplicand operands may be calculated and used to create and select a number of partial products according to Booth's algorithm. Once the partial products have been created and selected, they may be summed and the results may be output. The results may be signed or unsigned, and may represent vector or scalar quantities. When a vector multiplication is performed, the multiplier may be configured to generate and select partial products so as to effectively isolate the multiplication process for each pair of vector components. The multiplier may also be configured to sum the products of the vector components to form the vector dot product. The final product may be output in segments so as to require fewer bus lines. The segments may be rounded by adding a rounding constant. Rounding and normalization may be performed in two paths, one assuming an overflow will occur, the other assuming no overflow will occur. The multiplier may also be configured to perform iterative calculations to evaluate constant powers of an operand. Intermediate products that are formed may be rounded and normalized in two paths and then compressed and stored for use in the next iteration. An adjustment constant may also be added to increase the frequency of exactly rounded results.
Owner:ADVANCED SILICON TECH

Programmable processor with group floating-point operations

A programmable processor that comprises a general purpose processor architecture, capable of operation independent of another host processor, having a virtual memory addressing unit, an instruction path and a data path; an external interface; a cache operable to retain data communicated between the external interface and the data path; at least one register file configurable to receive and store data from the data path and to communicate the stored data to the data path; and a multi-precision execution unit coupled to the data path. The multi-precision execution unit is configurable to dynamically partition data received from the data path to account for an elemental width of the data and is capable of performing group floating-point operations on multiple operands in partitioned fields of operand registers and returning catenated results. In other embodiments the multi-precision execution unit is additionally configurable to execute group integer and/or group data handling operations.
Owner:MICROUNITY

Method, system, and apparatus for efficient evaluation of boolean expressions

Methods, systems, and computer-readable media are provided for efficiently evaluating Boolean expressions. According to the method, the Boolean expression is expressed using pre-fix notation. Each element in the pre-fix expression is then parsed. For each first operand for a Boolean operation, the value of the operand is determined. This may include evaluating a GUID. When an operator and a second operand are encountered, a decision is made as to whether the second operand should be evaluated. The determination as to whether the second operand should be evaluated is made based upon the value of the first operand and the type of operator. If the second operand need not be evaluated, no evaluation is performed, thereby saving time and memory space. The evaluation of the Boolean expression continues in this manner until the entire expression has been evaluated. If the Boolean expression is evaluated as true, the program module associated with the Boolean expression may be loaded. Otherwise, the program module will not be loaded.
Owner:AMERICAN MEGATRENDS
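
The evaluation strategy in this abstract, deciding from the first operand and the operator whether the second operand needs evaluating, can be sketched in Python as follows. The token format (`"AND"`, `"OR"`, `"NOT"`, and operand names), the `evaluate_prefix` function, and the `lookup` callback (standing in for something like evaluating a GUID) are assumptions of this sketch, not details taken from the patent.

```python
def evaluate_prefix(tokens, lookup):
    """Evaluate a prefix-notation Boolean expression with short-circuiting.

    tokens: list such as ["AND", "A", "OR", "B", "C"]
    lookup: callable mapping an operand name to a truth value (hypothetical)
    """
    pos = 0

    def parse(evaluate):
        # When `evaluate` is False the sub-expression is only skipped over:
        # tokens are consumed but `lookup` is never called.
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "NOT":
            value = parse(evaluate)
            return (not value) if evaluate else None
        if token in ("AND", "OR"):
            first = parse(evaluate)
            # OR is decided by a True first operand, AND by a False one.
            decided = evaluate and first == (token == "OR")
            second = parse(evaluate and not decided)
            if not evaluate:
                return None
            return first if decided else second
        return bool(lookup(token)) if evaluate else None

    result = parse(True)
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return result


# Usage: record which operands are actually evaluated.
calls = []


def lookup(name):
    calls.append(name)
    return {"A": False, "B": True, "C": True}[name]


result = evaluate_prefix(["AND", "A", "OR", "B", "C"], lookup)
print(result, calls)  # False ['A']
```

Because `A` is false, the `AND` is already decided, so the whole `OR B C` sub-expression is parsed past without calling `lookup` on `B` or `C`.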

Tile-based processor architecture model for high-efficiency embedded homogeneous multicore platforms

The present invention relates to a processor which comprises processing elements that execute instructions in parallel and are connected together with point-to-point communication links called data communication links (DCL). The instructions use DCLs to communicate data between them. In order to realize those communications, they specify the DCLs from which they take their operands, and the DCLs to which they write their results. The DCLs allow the instructions to synchronize their executions and to explicitly manage the data they manipulate. Communications are explicit and are used to realize the storage of temporary variables, which is decoupled from the storage of long-living variables.
Owner:MANET PHILIPPE +1

Scalar hardware for performing SIMD operations

A system for processing SIMD operands in a packed data format includes a scalar FMAC and a vector FMAC coupled to a register file through an operand delivery module. For vector operations, the operand delivery module bit steers a SIMD operand of the packed operand into an unpacked operand for processing by the first execution unit. Another SIMD operand is processed by the vector execution unit.
Owner:INTEL CORP

Central processing unit (CPU) accessing an extended register set in an extended register mode

A central processing unit (CPU) is described including a register file and an execution core coupled to the register file. The register file includes a standard register set and an extended register set. The standard register set includes multiple standard registers, and the extended register set includes multiple extended registers. The execution core fetches and executes instructions, and receives a signal indicating an operating mode of the CPU. The execution core responds to an instruction by accessing at least one extended register if the signal indicates the CPU is operating in an extended register mode and the instruction includes a prefix portion including information needed to access the at least one extended register. The standard registers may be general purpose registers of a CPU architecture associated with the instruction. The number of extended registers may be greater than the number of general purpose registers defined by the CPU architecture. In this case, the additional register identification information in the prefix portion is needed to identify a selected one of the extended registers. A width of the extended registers may also be greater than a width of the standard registers. In this case, the prefix portion may also include an indication that the entire contents of the at least one extended register is to be accessed. In this way, instruction operand sizes may selectively be increased when the CPU is operating in the extended register mode. A computer system including the CPU is also described.
Owner:GLOBALFOUNDRIES INC

Method for translating programs for reconfigurable architectures

A method for translating high-level languages to reconfigurable architectures is disclosed. The method includes building a finite automaton for calculation. The method further includes forming a combinational network of a plurality of individual functions in accordance with the structure of the finite automaton. The method further includes allocating a plurality of memories to the network for storing a plurality of operands and a plurality of results.
Owner:PACT INFORMATIONSTECH +1

Method of optimizing SQL queries where a predicate matches nullable operands

An optimization technique for SQL queries, a program storage device storing the optimization program, and an apparatus for optimizing a query are provided. A query is analyzed to determine whether it includes a predicate for matching nullable operands and, if so, it is transformed to return TRUE when all operands are NULLs. If the DBMS supports this new function, the predicate is marked. If not, the predicate is re-written into a CASE expression having two SELECT clauses. The query is then executed in the computer to efficiently retrieve data from the relational database.
Owner:TWITTER INC

Method and apparatus for performing multiple types of multiplication including signed and unsigned multiplication

A multiplier capable of performing both signed and unsigned scalar and vector multiplication is disclosed. The multiplier is configured for use in a microprocessor and may include a partial product generator, a selection logic unit, and an adder. The multiplier is configured to receive signed or unsigned multiplier and multiplicand operands in scalar or packed vector form. The multiplier is also configured to receive a first control signal indicative of whether signed or unsigned multiplication is to be performed and a second control signal indicative of whether vector multiplication is to be performed. The multiplier is configured to calculate an effective sign for the multiplier and multiplicand operands based upon each operand's most significant bit and the control signal. The effective signs may then be used by the partial product generation unit and the selection logic to create and select a number of partial products according to Booth's algorithm. Once the partial products have been created and selected, the adder is configured to sum them and output the results, which may be signed or unsigned. When a vector multiplication is performed, the multiplier is configured to generate and select partial products so as to effectively isolate the multiplication process for each pair of vector components.
Owner:ADVANCED MICRO DEVICES INC
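
For readers unfamiliar with Booth's algorithm, which this abstract relies on for partial product generation, here is a minimal Python sketch of the textbook radix-2 form for signed scalar operands. It is not the patented circuit: it omits unsigned and packed-vector modes, the effective-sign logic, and any hardware adder structure. The function name and the 8-bit default width are assumptions.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Multiply two signed `bits`-bit integers using radix-2 Booth recoding."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (lowest <= multiplicand <= highest and lowest <= multiplier <= highest):
        raise ValueError("operands must fit in the signed bit width")

    # Two's-complement bit pattern of the multiplier.
    pattern = multiplier & ((1 << bits) - 1)

    product = 0
    previous_bit = 0  # implicit bit to the right of bit 0
    for i in range(bits):
        bit = (pattern >> i) & 1
        # Booth digit in {-1, 0, +1}: +1 ends a run of ones, -1 starts one.
        digit = previous_bit - bit
        # Each nonzero digit contributes one shifted partial product.
        product += digit * (multiplicand << i)
        previous_bit = bit
    return product


print(booth_multiply(-7, 5))  # -35
```

The recoding works because the digits `previous_bit - bit`, weighted by powers of two, sum to the two's-complement value of the multiplier, so signed operands need no separate correction step.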

Fast just-in-time (JIT) scheduler

A just-in-time (JIT) compiler typically generates code from bytecodes that have a sequence of assembly instructions forming a "template". It has been discovered that a just-in-time (JIT) compiler generates a small number, approximately 2.3, assembly instructions per bytecode. It has also been discovered that, within a template, the assembly instructions are almost always dependent on the next assembly instruction. The absence of a dependence between instructions of different templates is exploited to increase the size of issue groups using scheduling. A fast method for scheduling program instructions is useful in just-in-time (JIT) compilers. Scheduling of instructions is generally useful for just-in-time (JIT) compilers that are targeted to in-order superscalar processors because the code generated by the JIT compilers is often sequential in nature. The disclosed fast scheduling method has a complexity, and therefore an execution time, that is proportional to the number of instructions in an instruction block (N complexity), a substantial improvement in comparison to the N² complexity of conventional compiler schedulers. The described fast scheduler advantageously reorders instructions with a single pass, or few passes, through a basic instruction block while a conventional compiler scheduler such as the DAG scheduler must iterate over an instruction basic block many times. A fast scheduler operates using an analysis of a sliding window of three instructions, applying two rules within the three instruction window to determine when to reorder instructions. The analysis includes acquiring the opcodes and operands of each instruction in the three instruction window, and determining register usage and definition of the operands of each instruction with respect to the other instructions within the window. The rules are applied to determine ordering of the instructions within the window.
Owner:ORACLE INT CORP

Shared FP and SIMD 3D multiplier

A multiplier configured to perform multiplication of both scalar floating point values (XxY) and packed floating point values (i.e., X1xY1 and X2xY2). In addition, the multiplier may be configured to calculate XxY-Z. The multiplier comprises selection logic for selecting source operands, a partial product generator, an adder tree, and two or more adders configured to sum the results from the adder tree to achieve a final result. The multiplier may also be configured to perform iterative multiplication operations to implement arithmetical operations such as division and square root. The multiplier may be configured to generate two versions of the final result, one assuming there is an overflow, and another assuming there is not an overflow. A computer system and method for performing multiplication are also disclosed.
Owner:ADVANCED SILICON TECH

Programmable logic datapath that may be used in a field programmable device

A method and apparatus for providing a programmable logic datapath that may be used in a field programmable device. According to one aspect of the invention, a programmable logic datapath is provided that includes a plurality of logic elements to perform various (Boolean) logic operations. The programmable logic datapath further includes circuitry to selectively route and select operand bits between the plurality of logic elements (operand bits is used hereinafter to refer to input bits, logic operation result bits, etc., that may be generated within the logic datapath). In one embodiment, by providing control bits concurrently with operand bits to routing and selection (e.g., multiplexing) circuitry, the programmable logic datapath of the invention can provide dynamic programmability to perform a number of logic operations on inputs of various lengths on a cycle-by-cycle basis.
Owner:PMC-SIERRA

Coherency techniques for suspending execution of a thread until a specified memory access occurs

Coherency techniques for suspending execution of a thread until a specified memory access occurs. In one embodiment, a processor includes a cache, execution logic to execute an instruction having an operand indicating a monitor address and a bus controller. In one embodiment, the bus controller is to assert a preventative signal in response to receiving a memory access attempting to gain sufficient ownership of a cache line associated with said monitor address to allow modification of said cache line without generation of another transaction indicative of the modification. In another embodiment, the bus controller is to generate a bus cycle in response to the instruction to eliminate any ownership of the cache line by another processor that would allow a modification of the cache line without generation of another memory access indicative of the modification.
Owner:INTEL CORP