TalkSaile0711

Talk

This talk is the public defense of the diploma thesis by David Saile.

The talk also serves as the second half of the PTT lecture on data-parallel programming.

Speaker

David Saile (University of Koblenz-Landau)

Title

MapReduce with Deltas

Host

Ralf Lämmel

Software Languages Team

University of Koblenz-Landau

Department of Computer Science

Room

E 011

Campus Koblenz

Date

28 June 2011

Time

1.00 pm

Abstract

The MapReduce programming model is extended conservatively to deal with deltas for input data such that recurrent MapReduce computations can be more efficient for the case of input data that changes only slightly over time. That is, the extended model enables more frequent re-execution of MapReduce computations and thereby more up-to-date results in practical applications. Deltas can also be pushed through pipelines of MapReduce computations. The achievable speedup is analyzed and found to be highly predictable. The approach has been implemented in Hadoop, and a code distribution is available online. The correctness of the extended programming model relies on a simple algebraic argument.
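
As a rough illustration of the idea in the abstract, the following minimal sketch applies a delta to a previous word-count result instead of recomputing over the full input. It is not taken from the thesis or its Hadoop implementation. The function names (`map_reduce`, `apply_delta`) and the representation of a delta as separate lists of added and removed lines are assumptions made for this example only. It also assumes that the reduction is a sum, so that removed input can be accounted for with negative counts.

```python
from collections import Counter

def map_reduce(lines, sign=1):
    """Word count: map each line to (word, sign) pairs, reduce by summation.

    sign=+1 counts added input; sign=-1 counts removed input (an assumption
    of this sketch, not a documented feature of Hadoop).
    """
    counts = Counter()
    for line in lines:
        for word in line.split():
            counts[word] += sign
    return counts

def apply_delta(old_result, added, removed):
    """Merge the previous result with the result computed on the delta only."""
    merged = Counter(old_result)
    for word, count in map_reduce(added, +1).items():
        merged[word] += count
    for word, count in map_reduce(removed, -1).items():
        merged[word] += count
    # Drop words whose count fell to zero so the result matches a full rerun.
    return Counter({w: c for w, c in merged.items() if c != 0})

old_input = ["a b a", "c"]
added = ["b d"]      # lines that appeared since the last run
removed = ["c"]      # lines that disappeared since the last run
new_input = ["a b a", "b d"]

incremental = apply_delta(map_reduce(old_input), added, removed)
full_rerun = map_reduce(new_input)
print(incremental == full_rerun)
```

In this sketch the incremental path only processes the two delta lines, while the full rerun processes all input. The two results agree because summation lets the contributions of added and removed lines be combined with the old result in any order.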

Links

The diploma thesis has led to the following publication: