Course Project (G)

Only graduate students are required to complete a course project. In the Content section I will provide two suggested topics. You may also propose your own. Expect this document to be updated periodically throughout the course.


Updated 9/27/21. I have reduced the number of project presentation days from the original number in the schedule.

Updated 10/31/21. Please see this page for the midterm presentation and report guidelines.

Workshops and Presentations

On workshop days, your presentation may take a variety of forms, including but not limited to:

Initial Project Presentation. Every person/group will give a 5-minute presentation introducing their course project to the class on Oct. 18. At this point, you will have received feedback from me and should be able to identify the relevant systems, tasks, and environments under study, as well as the major research question you will address.

Presenting related material. On days marked "Workshop," one group/person will present. Early in your project work, you may want to focus on presenting related work that helps to contextualize what you are doing. This is the sort of material you might cite in the introduction, background section, or related work section of a paper. Earlier in the semester, this work will likely focus on justifying your project as research, establishing the problem space as relevant and open. By mid-semester, you will be sufficiently into the details of your work that you may want to spend time during class giving background on a critical method. For example, if your work relies heavily on agent-based simulation, you might want to use this time to present agent-based simulation as an evaluation tool.

Presenting current progress. You may present your progress on a variety of tasks, including but not limited to software infrastructure or corpora that you are building, findings such as empirical analyses or proof results, or reports on challenges you are facing. You may be asked during your presentation how this work relates to your motivating research question; be prepared to answer.

Presenting negative results. You should feel free to present any negative results; this may range from experiments that failed to disprove hypotheses to engineering approaches that failed to do the thing you'd hoped they'd do. I would encourage you to use class time to work through any of these problems with the class, and to try to brainstorm alternative approaches.

Group members

You may work with a partner on your project, but not in groups of more than two. You may optionally include any number of undergraduates on your project; they will not count toward your group size.

Help selecting a topic

To facilitate project discussion, I will have extra availability:

  • Wednesday, Sep. 29, Innovation E456, 1-5pm
  • Thursday, Sep. 30, Teams only, 1-5pm
  • Friday, Oct. 1, Innovation E456, 1-5pm
  • Monday, Oct. 4, Innovation E456, 1-5pm
  • Tuesday, Oct. 5, Teams only, 1-5pm

Please feel free to stop by my office or drop into Teams! This is your time.

If you are looking for inspiration, I recommend skimming work that has appeared at relevant venues (e.g., KDD, MLSys, SIGMOD/PODS for the highest concentration of data+systems papers). Make sure to cite any relevant papers you read or are inspired by in your project proposals.


Your project proposals will have two major components: a topic or system you propose to study and a set of three possible research questions you'd like to answer by the end of the semester. My feedback will focus on two things: (1) whether the topic is within scope of the course and (2) whether your research questions are sufficiently scoped.

Your research topic should, in some way, reflect the goals and paradigm of knowledge discovery and have a systems element. I expect every project to have a data element, but you do not need to be actively performing data analysis in your project.

Example topics

Below are several suggestions for example topics; I've linked example project proposals.

I will also accept topics that have a systems for machine learning or systems for data science angle. I will be providing feedback on all project proposals, and you should feel free to reach out to me in order to workshop project ideas.

Scope of research questions

I expect you to propose a project that can be accomplished with 9 weeks of work. This means that your project needs to be small, since you will likely face unforeseen challenges along the way.

In your initial proposal, I expect you to suggest three possible research questions you might seek to answer. For each research question, I'd like to see at least one hypothesis you would like to evaluate.

For the research question and hypothesis pairing you believe to have the lowest risk (see footnote 1), you should hand in a 9-week plan that breaks down the project into its constituent parts, mapping out when you expect to complete those parts. If you are unsure of how to get started with this, come to my extended student hours.

I expect you to begin working on your project when you turn in your initial proposal (Oct. 8). I recommend investing heavily in the first week of project work (i.e., the week between Oct. 8 and revisions on Oct. 15) to help you determine whether your instincts about the lowest-risk project were correct.

  1. Generally, it is important to take risks in research. I absolutely want to encourage risky research ideas. However, the risk I want to mitigate here is the likelihood of project failure. That means that you should ensure that the project leverages skills you already have. Project risk is different from research risk. For example, suppose you are a very strong programmer and want to build a novel system as part of your project. The underlying research question might be risky, but you have mitigated the risk of project failure by focusing on system-building. If you are not a strong programmer and your proposed plan relies on a rate of programming progress that requires greater proficiency than you already have, your risk of failure increases, and your risk of unhappiness, stress, and sleeplessness skyrockets. No one wants that.