Design, development and evaluation of an efficient hierarchical interconnection network.
Parallel computing has long been an area of research interest because exploiting parallelism in difficult problems has promised to deliver orders-of-magnitude speedups. Processors are now both powerful and cheap, so that systems incorporating tens, hundreds or even thousands of powerful processors need not be prohibitively expensive. The weak link in exploiting parallelism is the means of communication between the processors. Shared memory systems are fundamentally limited in the number of processors they can utilise. To achieve high levels of parallelism it is still necessary to use distributed memory and some form of interconnection network. But interconnection networks can be costly, slow, difficult to build and expand, vulnerable to faults and limited in the range of problems they can be used to solve effectively. As a result there has been extensive research into developing interconnection networks which overcome some or all of these difficulties. In this thesis it is argued that a new interconnection network, Hierarchical Cliques (HiC), and a derivative, FatHiC, possess many desirable properties and are worthy of consideration for use in building parallel computers. A fundamental element of an interconnection network is its topology. After defining the topology of HiC, expressions are derived for the various parameters which define its underlying limits of performance and fault tolerance. A second element of an interconnection network is an addressing and routing scheme. The addressing scheme and routing algorithms of HiC are described. The flexibility of HiC is demonstrated by developing embeddings of popular, regular interconnection networks. Some embeddings into HiC suffer from high congestion; however, the FatHiC network is shown to have low congestion for those embeddings. The performance of some important, regular, data parallel problems on HiC and FatHiC is determined by analysis and simulation, using the 2D-mesh as a means of comparison.
But performance alone does not tell the whole story. Any parallel computer system must be cost effective. In order to analyse the cost effectiveness of HiCs, an existing measure was expanded to provide a more realistic model and a more accurate means of comparison. One aim of this thesis is to demonstrate the suitability of HiC for parallel computing systems which execute irregular algorithms requiring dynamic load balancing. A new dynamic load balancing algorithm is proposed which takes advantage of the hierarchical structure of the HiC to reduce communication overheads incurred when distributing work. To demonstrate performance on an irregular problem, a novel parallel algorithm was developed to detect subgraph isomorphism from many model graphs to a single input graph. The use of the new load balancing algorithm in conjunction with the subgraph isomorphism algorithm is discussed.