This talk is the public defense of the diploma thesis by David Saile.
The talk also serves as the second half of the PTT lecture on data-parallel programming.
David Saile (University of Koblenz-Landau)
MapReduce with Deltas
Software Languages Team
University of Koblenz-Landau
Department of Computer Science
28 June 2011
The MapReduce programming model is conservatively extended to handle deltas of input data, so that recurrent MapReduce computations become more efficient when the input changes only slightly over time. The extended model thus allows MapReduce computations to be re-executed more frequently, which yields more up-to-date results in practical applications. Deltas can also be pushed through pipelines of MapReduce computations. The achievable speedup is analyzed and found to be highly predictable. The approach has been implemented in Hadoop, and a code distribution is available online. The correctness of the extended programming model relies on a simple algebraic argument.
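
To make the idea of computing with deltas concrete, here is a minimal sketch in plain Python. It is not the Hadoop implementation from the thesis. It assumes a word-count job whose reduction is integer addition, so that a previous result can be updated from the added and removed inputs alone. The function names `map_reduce` and `apply_delta` are hypothetical and chosen only for this illustration.

```python
from collections import Counter

def map_reduce(docs):
    """Word count: map each document to its words, reduce by summing counts."""
    total = Counter()
    for doc in docs:
        total.update(doc.split())
    return total

def apply_delta(old_result, added_docs, removed_docs):
    """Update a previous word-count result using only the changed documents.

    Assumption for this sketch: counts combine by addition and removals can
    be undone by subtraction, so processing the delta is enough.
    """
    result = Counter(old_result)
    result.update(map_reduce(added_docs))      # add counts of new input
    result.subtract(map_reduce(removed_docs))  # retract counts of removed input
    # Drop words whose count fell to zero so the result matches a full rerun.
    return Counter({word: n for word, n in result.items() if n != 0})

# One document is replaced between two runs of the recurrent computation.
old_docs = ["a b a", "b c"]
new_docs = ["a b a", "c d"]

old_result = map_reduce(old_docs)
incremental = apply_delta(old_result, added_docs=["c d"], removed_docs=["b c"])
full_rerun = map_reduce(new_docs)
```

In this sketch, `incremental` and `full_rerun` hold the same counts, although the incremental path only touched the two changed documents. This works because addition on counts is associative and commutative and has inverses, which is the kind of algebraic property a correctness argument for deltas can rest on.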
The diploma thesis has led to the following publication: