The present invention provides a multithreaded processor, such as a
network processor, including a thread interleaver that implements fine-grained thread decisions to avoid underutilization of instruction execution resources in
spite of large communication latencies. In an upper pipeline, an
instruction unit determines an instruction fetch sequence responsive to an instruction
queue depth on a per-thread basis. In a lower pipeline, a thread interleaver determines a thread
interleave sequence responsive to thread conditions, including thread latency conditions. The thread interleaver selects threads using a two-level round robin arbitration. A thread latency signal becomes active responsive to a thread latency such as a thread stall, a cache miss, or an interlock. During the subsequent one or more
clock cycles, the corresponding thread is ineligible for arbitration. In one embodiment, other thread conditions, such as local priority, global stalls, and late stalls, also affect selection decisions.
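
By way of illustration only, the arbitration described above may be modeled in software as in the following minimal sketch. The sketch rests on assumptions not stated in this summary: the two levels are taken to be a high-priority round robin and a low-priority round robin, with the high-priority level winning whenever it has an eligible thread, and the names ThreadState, Interleaver, signal_latency, and select are hypothetical. An actual embodiment would be implemented in hardware and may organize the two levels differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ThreadState:
    # Hypothetical stand-in for a "local priority" thread condition.
    high_priority: bool = False
    # Remaining clock cycles during which the thread is ineligible.
    stall_cycles: int = 0


@dataclass
class Interleaver:
    threads: List[ThreadState]
    hi_ptr: int = 0  # round-robin pointer for the high-priority level
    lo_ptr: int = 0  # round-robin pointer for the low-priority level

    def signal_latency(self, tid: int, cycles: int = 1) -> None:
        """Model an active thread latency signal (stall, cache miss, interlock).

        The thread is excluded from arbitration for `cycles` selections.
        """
        t = self.threads[tid]
        t.stall_cycles = max(t.stall_cycles, cycles)

    def _round_robin(self, start: int, want_high: bool) -> Optional[int]:
        """Return the first eligible thread at this level, scanning from `start`."""
        n = len(self.threads)
        for offset in range(n):
            tid = (start + offset) % n
            t = self.threads[tid]
            if t.stall_cycles == 0 and t.high_priority == want_high:
                return tid
        return None

    def select(self) -> Optional[int]:
        """Pick the thread for one clock cycle, or None if none is eligible."""
        n = len(self.threads)
        choice = self._round_robin(self.hi_ptr, want_high=True)
        if choice is not None:
            self.hi_ptr = (choice + 1) % n
        else:
            choice = self._round_robin(self.lo_ptr, want_high=False)
            if choice is not None:
                self.lo_ptr = (choice + 1) % n
        # End of cycle: age every outstanding latency by one clock.
        for t in self.threads:
            if t.stall_cycles > 0:
                t.stall_cycles -= 1
        return choice
```

In this model, with four threads of which only thread 1 is marked high priority, a first call to select() returns thread 1. If a latency of two cycles is then signaled for thread 1, the next two calls return threads 0 and 2 from the low-priority level, after which thread 1 becomes eligible again and is selected. Because the high-priority level always wins in this sketch, it does not guard against starvation of low-priority threads; that simplification is a choice of the example, not a statement about the invention.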