Talk
This talk is the public defense of the diploma thesis by David Saile.
The talk also serves as the second half of the PTT lecture on data-parallel programming.
Speaker
David Saile (University of Koblenz-Landau)
Title
MapReduce with Deltas
Host
Ralf Lämmel
Software Languages Team
University of Koblenz-Landau
Department of Computer Science
Room
E 011
Campus Koblenz
Date
28 June 2011
Time
1:00 pm
Abstract
The MapReduce programming model is conservatively extended to handle deltas of input data, so that recurrent MapReduce computations become more efficient when the input changes only slightly over time. The extended model thus enables more frequent re-execution of MapReduce computations and, in turn, more up-to-date results in practical applications. Deltas can also be pushed through pipelines of MapReduce computations. The achievable speedup is analyzed and found to be highly predictable. The approach has been implemented in Hadoop, and a code distribution is available online. The correctness of the extended programming model relies on a simple algebraic argument.
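The idea of processing only the delta can be illustrated with a small, self-contained Python sketch. It is not the Hadoop implementation or the API from the thesis. It assumes a word-count job whose reduction is integer summation, so a removed record can be accounted for by subtracting its contribution. The names `map_reduce` and `apply_delta` are hypothetical and chosen only for this illustration.

```python
from collections import Counter

def map_reduce(lines):
    """Plain word count: map each line to (word, 1) pairs, reduce by summing."""
    counts = Counter()
    for line in lines:
        for word in line.split():
            counts[word] += 1
    return counts

def apply_delta(old_result, added, removed):
    """Update an earlier result using only the changed lines (hypothetical helper).

    Assumption: the reduction is summation, so removals can be subtracted.
    """
    result = Counter(old_result)
    result.update(map_reduce(added))      # add counts contributed by new lines
    result.subtract(map_reduce(removed))  # subtract counts of deleted lines
    # Drop words whose count fell to zero so the result matches a full recount.
    return Counter({word: n for word, n in result.items() if n != 0})

old_input = ["a b a", "b c"]
added = ["c d"]
removed = ["b c"]
new_input = ["a b a", "c d"]

old_result = map_reduce(old_input)
incremental = apply_delta(old_result, added, removed)
full = map_reduce(new_input)

# The incremental update agrees with recomputing from scratch.
assert incremental == full
```

In this sketch the incremental path touches only the two changed lines, while the full recomputation reads the entire new input. The two results agree here because summation is associative and commutative and every count has an inverse; a reduction without such inverses would need a different treatment of deletions.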
Links
The diploma thesis has led to the following publication: