The high-performance, RISC-core-based
microprocessor architecture includes an instruction fetch unit for fetching instruction sets from an instruction store and an
execution unit that implements the concurrent execution of a plurality of instructions through a
parallel array of functional units. The fetch unit generally maintains a predetermined number of instructions in an
instruction buffer. The
execution unit includes an
instruction selection unit, coupled to the
instruction buffer, for selecting instructions for execution, and a plurality of functional units for performing instruction-specified functional operations. A unified instruction scheduler, within the
instruction selection unit, initiates the
processing of instructions through the functional units when the instructions are determined to be available for execution and at least one functional unit implementing the necessary computational function is available. Unified scheduling is performed across multiple execution data paths, where each execution
data path, together with its corresponding functional units, is generally optimized for the type of computational function to be performed on the data: integer,
floating point, and boolean. The number, type, and computational specifics of the functional units provided in each
data path, and as between data paths, are mutually independent.
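
The issue rule described above can be illustrated with a minimal Python sketch. It is not the architecture's actual scheduler: the names `Instruction`, `FunctionalUnit`, and `schedule_cycle` are hypothetical, availability is modeled only as "all source registers are ready", and latency, write hazards, and unit release are ignored. It shows one scheduler choosing among buffered instructions across the integer, floating point, and boolean data paths in a single pass.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    name: str
    kind: str            # data path required: "integer", "float", or "boolean"
    dest: str            # destination register (hypothetical naming)
    sources: tuple = ()  # source registers that must be ready before issue


@dataclass
class FunctionalUnit:
    name: str
    kind: str            # data path this unit belongs to
    busy: bool = False


def schedule_cycle(buffer, units, ready_registers):
    """Issue every buffered instruction that can start this cycle.

    An instruction issues only when (1) all of its source registers are
    ready and (2) a free functional unit of the matching kind exists.
    Issued instructions are removed from the buffer; the rest stay queued.
    Returns a list of (instruction, unit) pairs.
    """
    issued = []
    for instr in list(buffer):  # iterate over a copy so removal is safe
        if not all(src in ready_registers for src in instr.sources):
            continue  # operands not yet available
        unit = next(
            (u for u in units if u.kind == instr.kind and not u.busy), None
        )
        if unit is None:
            continue  # no free unit on the required data path
        unit.busy = True
        buffer.remove(instr)
        issued.append((instr, unit))
    return issued


# Example: the number of units per data path is chosen independently.
units = [
    FunctionalUnit("alu0", "integer"),
    FunctionalUnit("fpu0", "float"),
    FunctionalUnit("bool0", "boolean"),
]
buffer = [
    Instruction("add", "integer", "r3", ("r1", "r2")),
    Instruction("fmul", "float", "f2", ("f0", "f1")),
    Instruction("sub", "integer", "r4", ("r3", "r1")),  # waits on add's result
    Instruction("and", "boolean", "b1", ("b0",)),
]
ready = {"r1", "r2", "f0", "f1", "b0"}

issued = schedule_cycle(buffer, units, ready)
print([(i.name, u.name) for i, u in issued])
# [('add', 'alu0'), ('fmul', 'fpu0'), ('and', 'bool0')]
```

In this sketch `sub` stays in the buffer because `r3` is not yet ready, while `and` issues ahead of it on the boolean data path. Adding a second integer unit would change nothing for the other data paths, reflecting the statement that the unit mix in each data path is independent of the others.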