SPLASH 2020
Sun 15 - Sat 21 November 2020 Online Conference
Mon 16 Nov 2020, 11:30 - 12:00, at SPLASH-V - HILT workshop Monday sessions. Chair(s): Tucker Taft, Luis Miguel Pinho, Richard Wai

This presentation gives an overview of Tapir, which embeds recursive fork-join parallelism, as supported by task-parallel programming platforms such as Cilk and OpenMP, into a mainstream compiler’s IR. Mainstream compilers typically treat parallel linguistic constructs as syntactic sugar for function calls into a parallel runtime. These calls prevent the compiler from performing optimizations on and across parallel control constructs. Remedying this situation has generally been thought to require an extensive reworking of compiler analyses and code transformations to handle parallel semantics. Tapir enables effective compiler optimization of recursive fork-join programs with only minor changes to existing compiler analyses and code transformations.
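To make the lowering problem concrete, here is a minimal sketch, not any particular runtime's ABI: the parallel loop body is outlined into a helper function and handed to a hypothetical runtime entry point, rt_parallel_for (a name invented for illustration), after which the compiler can no longer reason across the call boundary.

```c
/* Minimal sketch of the traditional lowering of a parallel loop.
 * rt_parallel_for is a hypothetical runtime call, not any real runtime's
 * API; real runtimes (Cilk, OpenMP) differ in detail. */
#include <stddef.h>

typedef void (*body_fn)(void *ctx, size_t i);

/* Hypothetical, opaque runtime entry point. */
extern void rt_parallel_for(body_fn body, void *ctx, size_t n);

struct ctx { double *a; double scale; };

static void loop_body(void *p, size_t i) {
    struct ctx *c = p;
    c->a[i] *= c->scale;   /* c->scale is loop-invariant, but the compiler
                              cannot see that through rt_parallel_for */
}

void scale_array(double *a, double scale, size_t n) {
    struct ctx c = { a, scale };
    rt_parallel_for(loop_body, &c, n);  /* opaque call hides the loop */
}
```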

Tapir leverages the “serial-projection property,” which is commonly satisfied by recursive fork-join programs, to handle the semantics of these programs without an extensive rework of the compiler. Tapir uses the serial-projection property to order logically parallel fine-grained tasks in the program’s control-flow graph. This ordered representation of parallel tasks allows the compiler to optimize parallel codes effectively with only minor modifications. For example, to implement Tapir/LLVM, a prototype of Tapir in the LLVM compiler, we added or modified fewer than 3000 lines of the half-million-line core middle-end functionality in LLVM version 6.
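As a concrete illustration, the comments in the following fib sketch (written with Cilk’s cilk_spawn and cilk_sync keywords) indicate how Tapir’s three IR instructions — detach, reattach, and sync, as described in the Tapir paper — keep the spawned task and its continuation inline in a single, ordered control-flow graph; the exact mapping shown in the comments is a simplification.

```c
/* fib in Cilk; comments sketch how Tapir represents the spawn in the CFG. */
#include <cilk/cilk.h>
#include <stdint.h>

int64_t fib(int64_t n) {
    if (n < 2) return n;
    int64_t x, y;
    x = cilk_spawn fib(n - 1);  /* Tapir: detach into the spawned task's
                                   block, which ends with a reattach */
    y = fib(n - 2);             /* continuation, ordered after the task
                                   in the CFG per the serial projection */
    cilk_sync;                  /* Tapir: sync instruction */
    return x + y;
}
```

Because replacing each detach and reattach with an ordinary branch yields the program’s serial elision, existing analyses written for serial code remain sound on this representation.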

These changes suffice to enable LLVM’s existing compiler optimizations for serial code (including loop-invariant code motion, common-subexpression elimination, and tail-recursion elimination) to work with parallel control constructs such as parallel loops and Cilk’s cilk_spawn keyword. By making use of existing LLVM optimizations and new parallel optimizations, Tapir/LLVM can optimize recursive fork-join programs more effectively than traditional compilation methods. On a suite of 35 Cilk application benchmarks, Tapir/LLVM produces more efficient executables for 30 benchmarks, with faster 18-core running times for 26 of them, compared to a nearly identical compiler that compiles parallel linguistic constructs the traditional way. In addition, by integrating Tapir/LLVM into the Accelerated Linear Algebra (XLA) compiler in Google’s TensorFlow machine-learning framework, Tapir/LLVM enables more effective compiler optimization of a variety of neural networks written in TensorFlow, improving the parallel performance of these networks by a geometric-mean multiplicative factor of 30%–100% across a variety of CPU hardware architectures.
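A minimal sketch of what this enables, assuming a Cilk-style cilk_for loop: because the loop body stays inline in the control-flow graph rather than being outlined into a runtime callback, ordinary loop-invariant code motion can hoist the invariant sqrt(scale) out of the parallel loop, just as it would for a serial loop.

```c
/* With the parallel loop body inline in the CFG, standard LICM can hoist
 * sqrt(scale), leaving a single sqrt call before the loop. */
#include <cilk/cilk.h>
#include <math.h>
#include <stddef.h>

void normalize(double *a, double scale, size_t n) {
    cilk_for (size_t i = 0; i < n; ++i)
        a[i] /= sqrt(scale);   /* loop-invariant; hoistable by LICM */
}
```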

This presentation will give an overview of how Tapir supports compiler optimization of recursive fork-join programs. The talk will examine how Tapir enables the compiler to optimize parallel programs using existing optimizations for serial code alongside new parallel optimizations, such as loop scheduling and loop stripmining, that restructure the program’s parallel control flow (a stripmining sketch follows). The talk will also describe how Tapir enables efficient and effective debugging of compiler optimizations for parallel programs, and it will discuss some theoretical foundations for compiler optimizations based on Tapir.
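As a hand-written sketch of the kind of code that loop stripmining produces (the grain size of 2048 is purely illustrative, not what Tapir/LLVM necessarily chooses), a flat parallel loop becomes a parallel loop over chunks whose serial inner loop amortizes spawn overhead and is amenable to vectorization:

```c
/* Illustrative result of stripmining a flat parallel loop: the outer loop
 * is parallel over chunks; the inner loop is serial and vectorizable.
 * The grain size is an assumed example value. */
#include <cilk/cilk.h>
#include <stddef.h>

void scale_stripmined(double *a, double s, size_t n) {
    const size_t grain = 2048;
    cilk_for (size_t j = 0; j < n; j += grain) {  /* parallel over chunks */
        size_t end = (j + grain < n) ? j + grain : n;
        for (size_t i = j; i < end; ++i)          /* serial inner loop */
            a[i] *= s;
    }
}
```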


Mon 16 Nov
Times are displayed in time zone: Central Time (US & Canada)

09:00 - 13:00: HILT workshop Monday sessions, HILT at SPLASH-V
Chair(s): Tucker Taft (AdaCore, United States), Luis Miguel Pinho (ISEP P.PORTO), Richard Wai (ANNEXI-STRAYLINE)


09:00 - 10:00
Keynote
HILT
Michael Klemm (OpenMP ARB)
10:00 - 10:15
Coffee break
HILT
10:15 - 10:45
Talk
HILT
Tucker Taft (AdaCore, United States)
10:45 - 11:15
Talk
HILT
Jan Verschelde (University of Illinois at Chicago)
11:15 - 11:30
Coffee break
HILT
11:30 - 12:00
Talk
HILT
TB Schardl (MIT CSAIL)
12:00 - 13:00
Other
HILT
P: Sara Royuela (Barcelona Supercomputing Center), P: Michael Klemm (OpenMP ARB), P: Eduardo Quiñones (Barcelona Supercomputing Center), P: Tucker Taft (AdaCore, United States), P: Dirk Ziegenbein (Robert Bosch GmbH)