Project #3 - Final Project

Overview

The main portion of a student's grade in this course is the final group project. Students will organize into groups and choose a project that (1) is relevant to the materials discussed in class, (2) requires a significant programming effort from all team members, and (3) is unique (i.e., two groups may not choose the same project topic). The projects will vary in both scope and topic, but they must satisfy these criteria. We will discuss this in more depth during class, though students are encouraged to start thinking early about projects that interest them. If a group is unable to come up with its own project idea, the instructor will suggest interesting topics.

Each project consists of six tasks that are due at different times during the semester:

  • Proposal Presentation: Each group will present their project topic to the class.
  • Status Meeting: Each group will meet with the instructor to discuss their plans for the project update presentation.
  • Project Update Presentation: Each group will provide a brief update to the class about the current status of their project.
  • Code Reviews: Each group is required to review the code of another group and provide feedback on correctness, coding style, and assumptions.
  • Final Presentation: Each group will give a presentation to the class on the final status of their project.
  • Code Drop: All projects are required to provide documentation (both in the source code and in "wiki-style" format).

All projects must be implemented in the Peloton DBMS. At a high level, each project consists of three implementation tasks. The first is the actual implementation of the proposed idea in Peloton. The second is the set of unit and regression tests that the group will use to check whether their implementation is correct. The final piece is the evaluation of their implementation to determine how well the DBMS performs with it. Students are also encouraged to use the workloads that are already provided by the OLTP-Bench framework for this last part of their project.

Each group must use a single Github repository for all development. Everyone will be provided with an account on the CMU Database Group build and test servers.

A project is not considered complete until the instructor has signed off on the submission.

  • Release Date: Mar 07, 2017
  • Due Date: May 15, 2017 @ 11:59pm

Proposal Presentation (Due Date: Mar 21, 2017)

Each group will give a 5-minute presentation about their proposed project topic to the class. This proposal should contain the following information:

  • An overview of what work must be done and how it will be divided amongst the group.
  • An estimate of which files you will need to modify in the DBMS.
  • The tests that you will write to validate that your project is correct and the experiments that you will use to measure its performance.
  • The resources you will need to complete the project. This includes software, hardware, data sets, or workloads.

Your proposal should also provide three types of goals: 75% goals, 100% goals, and 125% goals. Think of these as the equivalent of a B grade, an A grade, and a "wow!" grade. The goals can be dependent or independent of the prior goals. Each group can meet individually with the instructor afterwards for additional discussion and clarification of the project idea.

Each group should email the instructor a PDF version of their presentation after class.

Status Meeting (Due Date: Apr 13, 2017)

Each group will meet with the instructor in private and discuss the current status of the project. This will be a preview of the group's status update presentation in the subsequent class. Students should bring up any unexpected challenges or issues with their project implementation.

Project Update Presentation (Due Date: Apr 18, 2017)

Each group will provide a brief update to the class halfway through the project on the current status of their implementation. The update presentation should contain the following information:

  • An overview of the development status of their project as related to the goals discussed in the initial proposal.
  • Any information about whether the group's original plans have changed and an explanation as to why.
  • A measurement of the current code coverage of the tests for your implementation.
  • Color commentary about any surprises or unexpected issues that the group encountered during coding.

The goal of this exercise is to make sure that everyone in the class is aware of what the other groups are working on and how far along they are in the process. That way, if one group has worked on a part of the system that another group still needs to investigate, the two groups can talk to each other and share knowledge.

Code Reviews (Due Dates: Multiple)

Each group will be paired with two other groups and provide feedback on their code. The development group (i.e., the group that implemented the project) will provide the reviewing group with two things: (1) a pull request on Github with the core changes for their project and (2) a brief summary of which files/functions they want the reviewing group to examine. The development group will also need to post their pull request URL on the course spreadsheet.

Review #1

  • Pull Request Date: April 11th, 2017 @ 11:59pm
  • Review Date: April 18th, 2017 @ 11:59pm

Review #2

  • Pull Request Date: May 4th, 2017 @ 11:59pm
  • Review Date: May 11th, 2017 @ 11:59pm

The Pull Request Date is when the development group should provide the reviewing group with their pull request. The Review Date is when the reviewing group must complete their review and provide feedback. The development group will then have until either the next Code Review due date (May 4th) or the Final Code Drop due date (May 15th) to update their project in response to the last code review.

The code reviews do not all need to happen exactly on the due date, but they must be completed by it. Groups are free to schedule the review with each other whenever they are ready. Grading will be based on participation, both in providing a useful review to other students and in incorporating the feedback received into their own implementation. The reviews will be conducted on Github.

Each group should consider the following questions when examining the code:

General Questions

  • Does the code work?
  • Is all the code easily understood?
  • Is there any redundant or duplicate code?
  • Is the code as modular as possible?
  • Can any global variables be replaced?
  • Is there any commented out code?
  • Is it using proper debug log functions?

Documentation Questions

  • Do comments describe the intent of the code?
  • Are all functions commented?
  • Is any unusual behavior described?
  • Is the use of 3rd-party libraries documented?
  • Is there any incomplete code?

Testing Questions

  • Do tests exist and are they comprehensive?
  • Are the tests actually testing the feature?
  • Are they relying on hardcoded answers?
  • What is the code coverage?
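
As an illustration of the "hardcoded answers" question above, here is a minimal sketch in Python of a test that derives its expected result from a brute-force oracle instead of a literal baked into the test. It is not taken from Peloton: `SortedIndex`, `range_count`, and the test function name are hypothetical stand-ins for whatever feature a group implements, and the same idea carries over to whichever test framework the project actually uses.

```python
import bisect
import random


class SortedIndex:
    """Hypothetical feature under test: a sorted in-memory index."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def range_count(self, low, high):
        """Return the number of keys k with low <= k <= high."""
        left = bisect.bisect_left(self._keys, low)
        right = bisect.bisect_right(self._keys, high)
        return max(0, right - left)


def test_range_count_matches_oracle(seed=0, trials=100):
    """Compare the index against a brute-force count on random inputs.

    The expected value is recomputed for every input, so the test does
    not depend on a single hardcoded answer.
    """
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        keys = [rng.randint(0, 50) for _ in range(rng.randint(0, 200))]
        low, high = rng.randint(0, 50), rng.randint(0, 50)
        expected = sum(1 for k in keys if low <= k <= high)
        actual = SortedIndex(keys).range_count(low, high)
        assert actual == expected, (keys, low, high)


if __name__ == "__main__":
    test_range_count_matches_oracle()
    print("all randomized checks passed")
```

A reviewer could still ask for a few hand-picked edge cases (an empty index, an inverted range) alongside the randomized check, since an oracle only helps if it is simpler than the code it validates.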

Final Presentation (Due Date: May 9, 2017 @ 5:30pm)

During the scheduled final exam period for the course, each group will give a 10-minute presentation on the final status of their project. This presentation should contain the following information:

  • A reiteration of your proposed goals, with explicit discussion of the progress you have made to date on those goals.
  • A discussion of how you tested the correctness of your implementation.
  • An assessment of the quality of your code. Feel free to discuss which parts of your implementation you feel are particularly strong and which parts would need more work to reach production quality.
  • Any benchmark results that the group collected to measure the performance of their implementation.
  • An outline of concrete tasks for future work to expand or improve your implementation.

More information about this presentation will be discussed in class.

Final Code Drop (Due Date: May 15, 2017)

The final task is for each group to submit a pull request to the master branch of Peloton. A project is not considered complete until all of the following requirements are satisfied:

  • The code can merge into the master branch without any conflicts.
  • All comments from code review are addressed.
  • The project includes test cases that verify that the implementation is correct.
  • The group provides documentation in both the source code and in separate Markdown files.

Each group will be assigned a random position in the merge train. Each group will need to merge their code into the latest version of the master branch (i.e., a version of the branch that already includes the updates from the groups ahead of them in the train).
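
To make the merge-train ordering concrete, the following is a small Python sketch of the idea as described above. It only models the ordering: the function names and the group labels are made up for illustration, and it says nothing about how the instructor actually draws the positions.

```python
import random


def assign_merge_train(groups, seed=None):
    """Return the groups in a random merge order (hypothetical helper)."""
    rng = random.Random(seed)
    order = list(groups)
    rng.shuffle(order)
    return order


def already_merged_before(order):
    """Map each group to the groups whose changes are in master when it merges."""
    return {group: order[:position] for position, group in enumerate(order)}


if __name__ == "__main__":
    order = assign_merge_train(["A", "B", "C", "D"], seed=2017)
    for group, earlier in already_merged_before(order).items():
        print(f"group {group} merges on top of changes from: {earlier or 'nobody'}")
```

The later a group's position, the more foreign changes its pull request has to merge cleanly against, which is one reason to keep a branch up to date with master well before the deadline.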

External Code & Libraries

Before a group can use third-party source code or libraries in their project implementation, they must first get approval from the instructor. Peloton has specific protocols for including the source code of external projects and libraries, and the group must follow them in their implementation.

In general, a group is only allowed to incorporate external source code into the Peloton code base if (1) it is not provided as a Debian package and (2) its license is compatible with the Apache Software License (e.g., BSD or MIT). GPL code will not be allowed.

Collaboration Policy

  • Everyone has to work in a team of three people for this assignment.
  • Groups are allowed to discuss high-level details about the project with others.

WARNING: All of the code for this project must be your own. You may not copy source code from other groups or other sources that you find on the web. Plagiarism will not be tolerated. See CMU's Policy on Academic Integrity for additional information.