MSR course SS 2016: Paper study on "validation"


Deeply analyse an MSR paper to understand the methodology and the challenges of validation in MSR


  • Paper to student assignment during kick-off meeting on Tuesday, 19 April
  • Mandatory student consultations with Hakan Aksu on Wednesday, 27 April
  • Student presentations during course slot on Tuesday, 10 May
  • Submit the .pdf and the sources of your presentation to svn


You should try to address these questions:

  • What is the overall MSR scenario at hand? (1 slide)
  • What are the research questions? (1 slide)
  • What methodology is used by this research? (1 slide)
  • How is validation or evaluation accomplished? (1 slide)
  • What are the results of the reported research? (1 slide)
  • How does the discussion of results connect back to research questions? (1 slide)
  • What are the reported or otherwise observable threats to validity? (1+ slide)
  • Is the research reproducible? (1 slide)
  • More on validation (1+ slides), e.g.:
    • What do you think is difficult in doing this kind of validation?
    • Do you consider the validation as convincing/strong enough?
    • Can you think of alternative means of validation?
  • How could we adopt ideas from the paper to the running topics of the course? (1 slide)

It is possible that not all of these questions are easily applicable to your paper. In that case, you may want to get in touch with the course staff and negotiate a slight variation. The last two questions (more on validation and adoption to the running topics) are particularly important.

To get some basic background on empirical research, consider this paper: Selecting Empirical Methods for Software Engineering Research by Steve Easterbrook, Janice Singer, Margaret-Anne Storey, and Daniela Damian. Technical Report. 2007. Available online as a PDF. Among other things, it will clarify the notion of a "research question" and introduce different empirical methods such as "controlled experiments".


  • The consultation period is 18:00-19:15 in the room assigned to the course. Students are seen roughly in the order of the student-to-paper assignment below, so not everyone needs to be there at 18:00 sharp. It is OK to stay in the room during the consultation slots of other students, since you may learn from their discussions.
  • You may want to annotate your paper's PDF to highlight which paragraphs correspond to the questions raised above. You are also more than welcome to bring a draft slide deck for your emerging presentation. Obviously, not everything will be done by the time of the consultation.
  • Check that you will eventually be able to answer the questions above; otherwise, ask for help on specific aspects. Make sure to ask "intelligent" questions. That is, you should have done some reading and thinking before you show up for the consultation.
  • The consultation is mandatory. If you cannot attend, contact the teaching staff to arrange an additional consultation opportunity. In that case, you should also submit a draft PDF to svn that shows the teaching staff where you stand and whether you have open issues.


Assignment of students to papers happens during kick-off meeting.

Audris Mockus, James D. Herbsleb:
Expertise browser: a quantitative approach to identifying expertise. (Student Paul)

Cédric Teyton, Marc Palyart, Jean-Rémy Falleri, Floréal Morandat, Xavier Blanc:
Automatic extraction of developer expertise. (Student Schmidt)

Renuka Sindhgatta:
Identifying domain expertise of developers from source code. (Student Hauck)

Gianluca Demartini:
Finding Experts Using Wikipedia. (Student Nikonov)

John Anvik, Lyndon Hiew, Gail C. Murphy:
Who should fix this bug? (Student Klöckner)

Motahareh Bahrami Zanjani, Huzefa H. Kagdi, Christian Bird:
Using Developer-Interaction Trails to Triage Change Requests. (Student Monschau)

Yuhao Wu, Yuki Manabe, Tetsuya Kanda, Daniel M. Germán, Katsuro Inoue:
A Method to Detect License Inconsistencies in Large-Scale Open Source Projects. (Student Beck)

Gregorio Robles, Jesús M. González-Barahona, Carlos Cervigón, Andrea Capiluppi, Daniel Izquierdo-Cortazar:
Estimating development effort in Free/Open source software projects by mining software repositories: a case study of OpenStack. (Student Theisen)

Hoda Naguib, Nitesh Narayan, Bernd Brügge, Dina Helal:
Bug report assignee recommendation using activity profiles. (Student Hartenfels)

Miltiadis Allamanis, Charles A. Sutton:
Mining source code repositories at massive scale using language modeling. (Student Brack)

Foutse Khomh, Tejinder Dhaliwal, Ying Zou, Bram Adams:
Do faster releases improve software quality? An empirical case study of Mozilla Firefox. (Student Rüther)