[0021]An aspect of the invention provides a hardware logic system for communications among tasks of a software program, with such a system comprising: 1) a collection of source task specific buffers for buffering communication units, called packets, directed to a given task, referred to as a destination task, of the program, and 2) hardware logic for selecting a buffer among the collection of buffers from which to transfer a next packet to the destination task, with the selecting done at least in part based on a priority rank for each of the buffers. Various embodiments of that system comprise further features such as: a) a feature wherein the priority rank for a given one of the buffers is based at least in part on a prioritization of the source task to which the given buffer is specific, with the prioritization assigned by the destination task; b) a feature wherein the priority rank for a given one of the buffers is based at least in part on a measure of a fill level of the buffer, with the measure of the fill level of the buffer comprising an indication of whether the buffer (i) is non-empty or (ii) has its fill level above a defined monitoring threshold; and / or c) hardware logic maintaining a hardware signal indicating whether communications to a given buffer, among the source task specific buffers, are presently permitted, with such logic setting the signal in a state indicating said communications being permitted when a fill level of the buffer is below a defined threshold, and with said signal being provided to that one of the tasks of the program to which the given buffer is specific.
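By way of a non-limiting illustration, the buffer selection and the send-permit signal described above can be modeled in software as follows. This is a minimal sketch only, not the hardware logic itself. The names SourceBuffer, select_buffer, permit_threshold, monitor_threshold and dest_priority are hypothetical, as are the ordering of the rank components and the tie-break toward the lowest buffer index.

```python
from collections import deque


class SourceBuffer:
    """Software stand-in for one source task specific buffer (hypothetical model)."""

    def __init__(self, permit_threshold, monitor_threshold, dest_priority=0):
        self.packets = deque()
        self.permit_threshold = permit_threshold    # assumed flow-control threshold, in packets
        self.monitor_threshold = monitor_threshold  # assumed monitoring threshold, in packets
        self.dest_priority = dest_priority          # prioritization assigned by the destination task

    def send_permitted(self):
        # Models the signal provided to the source task: asserted while
        # the fill level is below the defined threshold.
        return len(self.packets) < self.permit_threshold

    def rank(self):
        # Assumed ordering of the rank components: destination-assigned
        # priority first, then the two fill-level indications.
        fill = len(self.packets)
        return (self.dest_priority, fill > self.monitor_threshold, fill > 0)


def select_buffer(buffers):
    """Return the index of the buffer to read next, or None if all are empty."""
    candidates = [i for i, buf in enumerate(buffers) if buf.packets]
    if not candidates:
        return None
    # Highest rank wins; equal ranks fall to the lowest index (an assumption).
    return max(candidates, key=lambda i: (buffers[i].rank(), -i))
```

In this model a source task would consult send_permitted() before writing a packet, and the destination side would call select_buffer() each time it is ready to receive the next packet.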
[0022]Another aspect of the invention provides a method for prioritizing communications among tasks of a software program. Such a method involves: 1) determining, for a given one of the tasks, referred to as a destination task, from which of the other tasks, referred to as source tasks, the destination task is expecting input data, and 2) assigning a prioritization for one or more of the source tasks, for purposes of transferring communications to the destination task, based at least in part on the determining. Various embodiments of this method provide further steps and features such as a) a feature whereby the assigning of the prioritization is done in a manner selected from a set comprising: (i) setting a hardware signal associated with a given one of the source tasks to a state that represents the prioritization of the given source task, (ii) setting a hardware signal associated with a given one of the source tasks to a binary state that indicates whether or not the given source task has a high priority for the purposes of transferring communications to the destination task, and (iii) setting a hardware signal to a state that identifies which one of the source tasks has the highest priority for the purposes of transferring communications to the destination task; b) a feature whereby the assigning of the prioritization involves writing, by the destination task, a value to a hardware device register associated with a given one of its source tasks, with said value specifying the prioritization of the given source task for purposes of sending communications to the destination task; c) a step of multiplexing communications data units to the destination task from source task specific buffers, wherein which of the source task specific buffers is selected as the one from which a next one of the data units is multiplexed to the destination task is determined at least in part based on the prioritizations of those of the source tasks that at that time have data in their associated buffers; d) a feature whereby the assigning is done according to which of the following classes, listed in their descending priority order, any given one of the source tasks belongs to: (1) source tasks from which new data is expected by the destination task, and (2) any other source tasks; and / or e) a feature wherein the other source tasks are further classified into the following subclasses, listed in their descending priority order: (i) tasks from which more data is allowed by the destination task, and (ii) any remaining tasks.
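As a hedged illustration of the class-based prioritization described above, the following sketch models the destination task writing one priority value per source task and the subsequent selection among source tasks that currently have data buffered. The names SourcePriority, assign_priorities and next_source, the numeric encoding of the classes, and the tie-break toward the lowest source task identifier are all assumptions made for the example.

```python
from enum import IntEnum


class SourcePriority(IntEnum):
    # Larger value means higher priority; the encoding is an assumption.
    REMAINING = 0     # subclass (ii): any remaining tasks
    MORE_ALLOWED = 1  # subclass (i): tasks from which more data is allowed
    EXPECTED = 2      # class (1): tasks from which new data is expected


def assign_priorities(source_ids, expected, allowed):
    """Model the destination task writing one priority register per source task."""
    registers = {}
    for src in source_ids:
        if src in expected:
            registers[src] = SourcePriority.EXPECTED
        elif src in allowed:
            registers[src] = SourcePriority.MORE_ALLOWED
        else:
            registers[src] = SourcePriority.REMAINING
    return registers


def next_source(registers, non_empty):
    """Pick the source task whose buffer is multiplexed next, among those holding data."""
    candidates = [src for src in registers if src in non_empty]
    if not candidates:
        return None
    # Highest register value wins; ties fall to the lowest source id (an assumption).
    return min(candidates, key=lambda src: (-registers[src], src))
```

For example, with source tasks 0 through 3, task 2 expected and task 1 allowed, next_source selects task 2 whenever its buffer holds data, and otherwise falls back to task 1 before tasks 0 and 3.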
[0023]Yet another aspect of the invention provides a hardware logic system for prioritizing instances of a software program for execution. Such a system comprises: 1) hardware logic for determining which of the instances are ready to execute on an array of processing cores, at least in part based on whether a given one of the instances has available to it input data to process, and 2) hardware logic for assigning a subset of the instances for execution on the array of cores based at least in part on the determining. Various embodiments of that system include further features such as features whereby a) the input data is from a data source to which the given instance has assigned a high priority for purposes of receiving data; b) the input data is such data that it enables the given program instance to execute; c) the subset includes cases of none, some, or all of the instances of said program; and / or d) the instance is: a task, a thread, an actor, or an instance of any of the foregoing, or an independent copy of the given program.
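The two elements of such a system, i.e. determining readiness and assigning a subset of instances to cores, can be sketched as follows. This is an illustrative software model under stated assumptions: the function names ready_instances and assign_to_cores are hypothetical, readiness is reduced to a single per-instance flag, and ready instances are taken in ascending identifier order, which the text above does not specify.

```python
def ready_instances(has_input):
    """Return ids of instances that have input data available, in ascending id order.

    has_input maps an instance id to True when input data is available to it.
    """
    return [inst for inst, available in sorted(has_input.items()) if available]


def assign_to_cores(has_input, num_cores):
    """Map core index -> instance id for up to num_cores ready instances.

    The selected subset may be empty, partial, or cover every instance.
    """
    selected = ready_instances(has_input)[:num_cores]
    return dict(enumerate(selected))
```

With four instances of which two are ready and an array of four cores, the mapping covers only two cores; with no ready instances it is empty.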
[0024]Yet another aspect of the invention provides a hardware logic implemented method for prioritizing instances of a software program for execution, with such a method involving classifying instances of the program into the following classes, listed in descending order of execution priority: (I) instances indicated as having high priority input data for processing, and (II) any other instances. Various embodiments of that method include further steps and features such as features whereby a) the other instances are further classified into the following sub-classes, listed in descending order of execution priority: (i) instances indicated as able to execute presently without the high priority input data; and (ii) any remaining instances; b) the high priority input data is data from a source from which its destination instance, of said program, is expecting high priority input data; c) a given instance of the program comprises tasks, with one of said tasks referred to as a destination task and others as source tasks of the given instance, and for the given instance, a unit of the input data is considered high priority if it is from such one of the source tasks that the destination task has assigned a high priority for inter-task communications to it; d) for any given one of the instances, a step of computing a number of its non-empty source task specific buffers among those of its input data buffers that belong to source tasks of the given instance indicated at the time as high priority source tasks for communications to the destination task of the given instance, with this number referred to as an H number for its instance, and wherein, within class (I), the instances are prioritized for execution at least in part according to magnitudes of their H numbers, in descending order such that an instance with a greater H number is prioritized before an instance with a lower H number; and / or e) in case of two or more of the instances tied for the greatest H number, such tied instances are prioritized at least in part according to their respective total numbers of non-empty input data buffers.
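The H number computation and the resulting execution priority order can be illustrated with the following sketch. It is a software model under assumptions, not the hardware logic: the names Instance, h_number and execution_order are hypothetical, class (I) membership is taken to mean an H number greater than zero, the buffer-count tie-break is applied to every tie in H number rather than only ties for the greatest one, and any remaining tie falls to the lower instance identifier.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    ident: int
    non_empty: set       # source tasks whose buffers currently hold data
    high_priority: set   # source tasks marked high priority by the destination task
    runnable_without_input: bool = False


def h_number(inst):
    """Count non-empty buffers that belong to high priority source tasks."""
    return len(inst.non_empty & inst.high_priority)


def execution_order(instances):
    """Return instance ids sorted from highest to lowest execution priority."""

    def key(inst):
        h = h_number(inst)
        if h > 0:
            cls = 0  # class (I): has high priority input data
        elif inst.runnable_without_input:
            cls = 1  # sub-class (i) of class (II)
        else:
            cls = 2  # sub-class (ii) of class (II)
        # Greater H first, then more non-empty buffers, then lower id (assumed).
        return (cls, -h, -len(inst.non_empty), inst.ident)

    return [inst.ident for inst in sorted(instances, key=key)]
```

In this model, two class (I) instances that both have an H number of 1 are ordered by how many of their input buffers are non-empty in total, so the instance with three non-empty buffers precedes the one with two.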
[0025]An aspect of the invention provides a system for processing a set of computer program instances, with inter-task communications (ITC) performance isolation among the set of program instances. Such a system comprises: 1) a number of processing stages; and 2) a group of multiplexers connecting ITC data to a given stage among the processing stages, wherein a multiplexer among said group is specific to one given program instance among said set. The system hosts each task of the given program instance at a different one of the processing stages, and supports copies of the same task software code being located at more than one of the processing stages in parallel. Various embodiments of this system include further features such as a) a feature whereby at least one of the processing stages comprises multiple processing cores such as CPU cores, with, for any of the cores, at any given time, one of the program instances assigned for execution; b) a set of source task specific buffers for buffering data destined for a task of the given program instance located at the given stage, referred to as a destination task, and hardware logic for forming a hardware signal indicating whether sending ITC is presently permitted to a given buffer among the source task specific buffers, with such forming based at least in part on a fill level of the given buffer, and with such signal being provided to the source task to which the given buffer is specific; c) a feature providing, for the destination task, a set of source task specific buffers, wherein a given buffer is specific to one of the other tasks of the program instance for buffering ITC from said other task to the destination task; d) a feature wherein the destination task provides ITC prioritization information for other tasks of the program instance located at their respective ones of the stages; e) a feature whereby the ITC prioritization information is provided by the destination task via a set of one or more hardware registers, with each register of the set specific to one of the other tasks of the program instance, and with each register configured to store a value specifying a prioritization level of the task that it is specific to, for purposes of ITC to the destination task; f) an arbitrator controlling from which source task of the program instance the multiplexer specific to that program instance will read its next ITC data unit; and / or g) a feature whereby the arbitrator prioritizes source tasks of the program instance for selection by the multiplexer to read its next ITC data unit based at least in part on at least one of: (i) source task specific ITC prioritization information provided by the destination task, and (ii) source task specific availability information of ITC data for the destination task from the other tasks of the program instance.
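The program instance specific multiplexing described above can be illustrated with the following software model, offered as a sketch under assumptions rather than as the hardware design. The names InstanceMux and Stage, the buffer depth parameter, the use of integer source task identifiers, and the tie-break toward the lowest identifier are all hypothetical.

```python
from collections import deque


class InstanceMux:
    """One multiplexer plus arbitrator, dedicated to a single program instance."""

    def __init__(self, source_tasks, depth):
        self.buffers = {src: deque() for src in source_tasks}
        # Models the per-source registers written by the destination task.
        self.priority = {src: 0 for src in source_tasks}
        self.depth = depth  # assumed buffer depth, in ITC data units

    def send_permitted(self, src):
        # Models the signal formed from the fill level of the given buffer.
        return len(self.buffers[src]) < self.depth

    def write(self, src, unit):
        """Buffer one ITC data unit from src; return False if sending is not permitted."""
        if not self.send_permitted(src):
            return False
        self.buffers[src].append(unit)
        return True

    def read_next(self):
        """Return (source task, data unit) chosen by the arbitrator, or None."""
        ready = [src for src, buf in self.buffers.items() if buf]
        if not ready:
            return None
        # Highest register value wins; ties fall to the lowest id (an assumption).
        src = max(ready, key=lambda s: (self.priority[s], -s))
        return src, self.buffers[src].popleft()


class Stage:
    """A processing stage holding one InstanceMux per hosted program instance."""

    def __init__(self, instance_ids, source_tasks, depth=4):
        self.muxes = {inst: InstanceMux(source_tasks, depth) for inst in instance_ids}
```

Because each program instance has its own buffers and arbitrator in this model, filling every buffer of one instance leaves the send-permit signals and read order of another instance at the same stage unchanged.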
[0026]Accordingly, aspects of the invention involve application-program instance specific hardware logic resources for secure and reliable ITC among tasks of application program instances hosted at processing stages of a multi-stage parallel processing system. Rather than seeking to inter-connect the individual processing stages or cores of the multi-stage manycore processing system as such, the invented mechanisms efficiently inter-connect the tasks of any given application program instance using the application program instance specific inter-processing stage ITC hardware logic resources. Due to the ITC being handled with such application program instance specific hardware logic resources, the ITC performance experienced by one application instance does not depend on the ITC resource usage (e.g. data volume and inter-task communications intensiveness) of the other applications sharing the given data processing system per the invention. This results in effective inter-application isolation for ITC in a multi-stage parallel processing system shared dynamically among multiple application programs.