A general computer-implement method and apparatus to optimize problem
layout on a
massively parallel supercomputer is described. The method takes as input the communication matrix of an arbitrary problem in the form of an array whose entries C(i, j) are the amount to data communicated from domain i to domain j. Given C(i, j), first implement a
heuristic map is implemented which attempts sequentially to map
a domain and its communications neighbors either to the same
supercomputer node or to near-neighbor nodes on the
supercomputer torus while keeping the number of domains mapped to a supercomputer node constant (as much as possible). Next a
Markov Chain of maps is generated from the initial map using Monte Carlo
simulation with Free Energy (cost function) F=Σi,jC(i,j)H(i,j)− where H(i,j) is the smallest number of hops on the supercomputer torus between domain i and domain j. On the cases tested, found was that the method produces good mappings and has the potential to be used as a general
layout optimization tool for parallel codes. At the moment, the
serial code implemented to test the method is un-optimized so that computation time to find the optimum map can be several hours on a typical PC. For production implementation, good parallel code for our
algorithm would be required which could itself be implemented on supercomputer.