Tuesday Panel: Language support for parallel and distributed computing
Language constructs that support parallel computing are relatively well recognized at this point, with features such as parallel loops (optionally with reduction operators), divide-and-conquer parallelism, and general parallel blocks. But what language features would make distributed computing safer and more productive? Is it helpful to be able to specify on what node a computation should take place, and on what node data should reside, or is that overspecification? We don’t normally expect a user of a parallel programming language to specify which core is used for a given iteration of a loop, nor which data should be moved into which core’s cache. Generally, the compiler and the runtime manage the allocation of cores, and the hardware worries about the cache. But in a distributed world, communication costs can easily outweigh computation costs in a poorly designed application. This panel will discuss various language features, some of which already exist to support parallel computing, and how they could be enhanced or generalized to support distributed computing safely and efficiently.
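To make the contrast concrete, here is a minimal sketch of one such construct, a parallel loop with a sum reduction, written in Rust with the Rayon library (the choice of language, library, and data is illustrative only):

```rust
// Illustrative sketch: a parallel loop with a reduction in Rust, using
// the Rayon library. The programmer says only that the iteration may
// proceed in parallel and names the reduction; assigning iterations to
// cores is left entirely to Rayon's work-stealing runtime.
use rayon::prelude::*;

fn main() {
    let data: Vec<f64> = (1..=1_000_000).map(|i| i as f64).collect();

    // Parallel iteration with a sum reduction; no core or cache
    // placement appears anywhere in the code.
    let total: f64 = data.par_iter().sum();

    println!("sum = {total}");
}
```

The open question for the panel is whether a distributed analogue of such a loop could hide node placement and data movement in the same way, without giving up the performance of explicitly placed computation and data.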
Our panel members are familiar with many of these issues:
- Kyle Chard, University of Chicago and Argonne National Laboratory: “The past decade has seen a major transformation in the nature of programming as the need to make efficient use of parallel hardware is now inescapable. As the parallel programming community both grows and becomes yet more diverse, there is a crucial need for high-level language features that enable productivity, portability, safety, and usability. To strike a balance between usability and performance, we need to focus on ways to raise the level of abstraction, making parallelism more accessible to developers in their working environments, and automating complex runtime decisions where possible, even if this comes at the expense of performance and/or functionality.”
- James Munns, founder of Ferrous Systems: “I can speak broadly about Rust’s capabilities for making certain aspects easier, such as serialization, state handling, and error management. Good distributed computing relies on safe and effective concurrent computing, so Rust features such as the Rayon library for lightweight task parallelism, as well as Rust’s more conventional heavyweight threading support, provide a basis for moving into the distributed computing realm.” (A sketch of the serialization point appears after this list.)
- Richard Wai, founder of Annexi-Strayline: “The rapidly changing and diverse space of distributed computing imposes complex challenges, particularly for language-defined specification of behavior. We should consider what safety threats arise from high communication costs. The real safety threat may be in the management and coordination of a large distributed codebase, where changes in one partition could propagate serious defects out into the larger system, with unpredictable outcomes. There also seems to be a movement towards expanding the NUMA concept (or COMA) to distributed systems through RDMA fabrics and other similar architectures. This could mean a future where heterogeneous systems share a cache-coherent global address space. We should consider how languages might scale to such system architectures, particularly in the parallel processing domain. How might a parallel loop behave over a cache-coherent fabric, particularly if the elements of the iterated data are dispersed?”
- Tucker Taft (moderator), VP and Director of Language Research, AdaCore: “My career has been focused on the design of programming languages that can enhance the effectiveness and productivity of developers building large, high-performance, safe, secure, correct, and often real-time software-intensive systems. In the meantime, the hardware world has moved from relatively simple, single-processor, single-machine systems, through multi-core and many-core machines, on to heterogeneous and distributed networks of multi-core nodes with GPUs and FPGAs, cooperating to solve otherwise intractable problems. Programming languages have lagged behind this evolution, meaning that today’s programmer is generally confronted with all of this complexity. In some sense we have lost our high-level languages for developing software for these new systems, and are effectively back to doing machine-level programming, where now we worry about individual messages and data placement, much like the old assembly languages where we worried about individual machine instructions and machine registers. The question is whether we can regain a high-level model for distributed computing while still achieving the performance of ‘machine-level’ distributed computing.”
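As a concrete illustration of the serialization support mentioned above, the following sketch moves a typed value across a hypothetical node boundary as JSON, using the serde and serde_json crates (an assumed but common choice; the WorkItem type and its fields are made up for the example):

```rust
// Illustrative sketch of typed serialization for distributed messaging,
// using serde (with its derive feature) and serde_json. The WorkItem
// type is hypothetical.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct WorkItem {
    task_id: u64,
    payload: Vec<f64>,
}

fn main() -> Result<(), serde_json::Error> {
    let item = WorkItem { task_id: 42, payload: vec![1.0, 2.0, 3.0] };

    // Serialize to a JSON string that could be sent to another node.
    let wire = serde_json::to_string(&item)?;

    // Deserialize on the receiving side; the declared struct gives the
    // compiler a typed view of what arrived, and malformed input is
    // surfaced as an error rather than silently misread.
    let received: WorkItem = serde_json::from_str(&wire)?;
    println!("received {:?}", received);
    Ok(())
}
```

The same derive-based approach works with compact binary formats, which matters in exactly the setting the panel description highlights, where communication costs can dominate.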