DiffStream: Differential Output Testing for Stream Processing Programs
Fri 20 Nov 2020 19:00 - 19:20 at SPLASH-I - F-1A Chair(s): Azalea Raad, Tongping Liu
High-performance architectures for processing distributed data
streams, such as Flink, Spark Streaming, and Storm, are increasingly deployed in
emerging data-driven computing systems. Exploiting the parallelism
afforded by such platforms, while preserving the semantics of the
desired computation, is prone to errors, and motivates the development
of tools for specification, testing, and verification. We focus on the
problem of differential output testing for distributed stream processing
systems, that is, checking whether two implementations produce
equivalent output streams in response to a given input stream. The
notion of equivalence allows reordering of logically independent data
items, and the main technical contribution of the paper is an optimal
online algorithm for checking this equivalence. Our testing framework
is implemented as a library called DiffStream in Flink.
We present four case studies to illustrate how our framework can be
used to (1) correctly identify bugs in a set of benchmark MapReduce programs,
(2) facilitate the development of
difficult-to-parallelize high-performance applications, and (3)
monitor an application for a long period of time with minimal performance overhead.
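
To make the equivalence notion concrete, here is a minimal sketch of checking whether two finite output streams are equal up to reordering of independent items. It is not the optimal online algorithm from the paper and does not use the DiffStream API; it is a simple offline, quadratic-time check. The function name `equivalent_up_to_reordering`, the `dependent` predicate, and the `(key, value)` event shape are all assumptions made for illustration.

```python
from typing import Callable, Hashable, Sequence


def equivalent_up_to_reordering(
    left: Sequence[Hashable],
    right: Sequence[Hashable],
    dependent: Callable[[Hashable, Hashable], bool],
) -> bool:
    """Return True if `right` can be obtained from `left` by repeatedly
    swapping adjacent items that are NOT dependent on each other.

    `dependent` is a hypothetical, symmetric predicate supplied by the
    tester; it says which pairs of items must keep their relative order.
    """
    if len(left) != len(right):
        return False

    remaining = list(right)  # items of `right` not yet matched
    for item in left:
        for i, candidate in enumerate(remaining):
            if candidate == item:
                # Everything before position i is independent of `item`,
                # so `item` could be moved to the front: match and remove it.
                del remaining[i]
                break
            if dependent(candidate, item):
                # A dependent item precedes every copy of `item` in `right`,
                # but not in `left`: the order of dependent items differs.
                return False
        else:
            # `item` does not occur among the unmatched items of `right`.
            return False
    return True


def same_key(a, b) -> bool:
    # Illustrative dependence relation: events with the same key are ordered.
    return a[0] == b[0]


sequential = [("a", 1), ("b", 1), ("a", 2)]
parallel_ok = [("b", 1), ("a", 1), ("a", 2)]   # only independent items moved
parallel_bad = [("a", 2), ("b", 1), ("a", 1)]  # two "a" events swapped

print(equivalent_up_to_reordering(sequential, parallel_ok, same_key))
print(equivalent_up_to_reordering(sequential, parallel_bad, same_key))
```

In this sketch the dependence relation is chosen per test: here events with the same key must stay in order, while events with different keys may be freely interleaved, as a parallel implementation might do.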
Fri 20 Nov. Times are displayed in time zone: Central Time (US & Canada)
07:00 - 08:20: F-1A (OOPSLA) at SPLASH-I +12h. Chair(s): Diomidis Spinellis (Athens University of Economics and Business), John Wickerson (Imperial College London)
07:00 - 07:20 | Talk | DiffStream: Differential Output Testing for Stream Processing Programs | OOPSLA | Konstantinos Kallas (University of Pennsylvania), Filip Niksic (Google), Caleb Stanford (University of Pennsylvania), Rajeev Alur (University of Pennsylvania)
07:20 - 07:40 | Talk | Pomsets with Preconditions: A Simple Model of Relaxed Memory | OOPSLA
07:40 - 08:00 | Talk | StreamQL: A Query Language for Processing Streaming Time Series | OOPSLA
08:00 - 08:20 | Talk | Foundations of Empirical Memory Consistency Testing | OOPSLA | Jake Kirkham (Princeton University), Tyler Sorensen (University of California at Santa Cruz), Esin Tureci (Princeton University), Margaret Martonosi (Princeton University)
19:00 - 20:20: F-1A (OOPSLA) at SPLASH-I. Chair(s): Azalea Raad (Imperial College London), Tongping Liu (University of Massachusetts at Amherst)
19:00 - 19:20 | Talk | DiffStream: Differential Output Testing for Stream Processing Programs | OOPSLA | Konstantinos Kallas (University of Pennsylvania), Filip Niksic (Google), Caleb Stanford (University of Pennsylvania), Rajeev Alur (University of Pennsylvania)
19:20 - 19:40 | Talk | Pomsets with Preconditions: A Simple Model of Relaxed Memory | OOPSLA
19:40 - 20:00 | Talk | StreamQL: A Query Language for Processing Streaming Time Series | OOPSLA
20:00 - 20:20 | Talk | Foundations of Empirical Memory Consistency Testing | OOPSLA | Jake Kirkham (Princeton University), Tyler Sorensen (University of California at Santa Cruz), Esin Tureci (Princeton University), Margaret Martonosi (Princeton University)