Collective Communication for Parallel and Distributed Processing

Supported in part by NSF grant CCR-9503838
Principal Investigator:
Philip K. McKinley
Department of Computer Science
Michigan State University

Project Description. High-performance computing has undergone many changes in recent years: trends include massively parallel processors (MPPs), local networks of workstations (NOWs), and even Internet-based parallel processing. A critical component in all such systems is the network through which processes communicate, including both the physical network architecture and the associated communication protocols. Communication operations among processes may be either point-to-point, which involves a single source and a single destination, or collective, in which more than two processes participate. Collective operations include multicast, broadcast, global combine, and barrier synchronization. Collective communication operations are important to parallel and distributed applications for data distribution, global processing of distributed data, and process synchronization.
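As an illustration of how a collective operation can be composed from point-to-point messages, the following minimal sketch computes a binomial-tree broadcast schedule. This is a generic textbook technique, not one of the algorithms developed in this project, and the function name and parameters are hypothetical and chosen for illustration only. It assumes any process can send to any other and that each process sends at most one message per step.

```python
def binomial_broadcast_schedule(num_procs, root=0):
    """Return a list of steps; each step is a list of (sender, receiver) pairs.

    Hypothetical helper for illustration: ranks are 0..num_procs-1 and the
    data starts at `root`. In each step, every process that already holds
    the data forwards it to one process that does not.
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    if not 0 <= root < num_procs:
        raise ValueError("root must be a valid rank")

    steps = []
    have = 1  # relative ranks 0..have-1 currently hold the data
    while have < num_procs:
        step = []
        for rel in range(have):
            partner = rel + have
            if partner < num_procs:
                # Convert ranks relative to the root back to actual ranks.
                sender = (rel + root) % num_procs
                receiver = (partner + root) % num_procs
                step.append((sender, receiver))
        steps.append(step)
        have *= 2  # the set of holders doubles each step
    return steps


if __name__ == "__main__":
    for i, step in enumerate(binomial_broadcast_schedule(8)):
        print("step", i, step)
```

With 8 processes the schedule has 3 steps, whereas a root that sends to each destination in turn needs 7 sequential sends. The sketch ignores network topology and contention, which are the characteristics that platform-specific collective algorithms are designed to exploit.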

This research project addresses several issues in collective communication as used in parallel and distributed processing, including the design of low-level architectural support for collective operations, the development of algorithms to implement collective operations efficiently on various platforms, and the use of efficient collective operations to improve the performance of specific applications. Much of our work has focused on the design of collective communication operations for wormhole-routed networks. In particular, we developed an Extended Dominating Node (EDN) model, which we used to construct various collective algorithms for multiport wormhole systems. We also developed a Multicast Virtual Topology (MVT) model, which we used to construct collective algorithms for both wormhole systems and NOWs that use cut-through switching. Our NOW-based collective communication work includes process-based and thread-based solutions, as well as support for multicast operations within network interface cards. All of these operations exploit the underlying characteristics of the specific networks in order to improve performance. Our ongoing investigations focus primarily on simulation and experimental studies of collective operations on a variety of distributed-memory platforms.