Instructions for PPOL 6801 - Final Project
Data Science and Public Policy, McCourt School for Public Polcy, Georgetown University
General Instructions
Data science is an applied field and the DSPP is particularly geared towards providing students the tools to make policy and substantive contributions using data and recent computational developments.
In this sense, it is fundamental that you understand how to conduct a complete analysis from collecting data, to cleaning and analyzing it, to presenting your findings.
For this reason, a considerable part of your grade (40%) will come from a an independent data science project, applying concepts learned throughout the course.
The project is composed of three parts:
- a 2 page project proposal: (which should be discussed and approved by me)
- an in-class presentation,
- A 10-page project report.
Due dates and breakdowns for the project are as follows:
Requirement | Due | Length | Percentage |
---|---|---|---|
Project Proposal | EOD Friday Week 9 | 2 pages | 5% |
Presentation | Week 14 | 10-15 minutes | 10% |
Project Report | Wednesday Week 15 | 10 pages | 25% |
All components of the project:
Project Proposal
The project proposal asks that you sketch out a general 2 page (single-spaced; 12pt font) project proposal. The proposal should offer the following information:
A high-level statement of the problem you intend to address or the analysis you aim to generate
The data source(s) you intend to use
Your plan to obtain that data
The text as data method you plan to use.
A definition for what “success” means with respect to your project.
- In your words, what would a successful project look like? What is the results you want to show? What would be a surprising finding for you?
Presentation
You will have the opportunity to present you final project in class. As a Data scientists, your presentation skills are as important as your data skills. You should prepare a 10-12 minutes presentation. It should cover:
Motivation or problem statement
Data collection
Methods
Results
Lessons learned and next steps
Project Report
The report is a complete description of the project’s analysis and results. The report should be a 10 page in length (single-spaced; 12pt font) and cover the points presented below. Citations and Front page will not be considered in the 10 page.
Below I’ve outlined points that one should aim to discuss in each section. Note that paper should read as a cohesive report:
Introduction
- summarize your motivation, present some previous work related to your question
- present your research question
- summarize your report
Data and Methods
- Where does the data come from? – What is the unit of observation? – What are the variables of interest? – What steps did you take to wrangle the data?
Analysis
- Describe the methods/tools you explored in your project.
Results
- Give a detailed summary of your results. – Present your results clearly and concisely. – Please use visualizations instead of tables whenever possible.
Discussion
- Re-introduce your main results
- State your contributions
- Where do you want take this project next?
IMPORTANT: There is not literature review section. You can use the literature to motivate your work. But you don’t need to have a full section on literature review. Read for example, papers at main general interest journals, like here, here and here. All these super accomplished articles do not have long literature reviews.
Submission of the Final Project
The end product should be a github repository that contains:
The raw source data you used for the project. If the data is too large for GitHub, talk with me, and we will find a solution
Your proposal
A README for the repository that, for each file, describes in detail:
Inputs to the file: e.g., raw data; a file containing credentials needed to access an API
What the file does: describe major transformations.
Output: if the file produces any outputs (e.g., a cleaned dataset; a figure or graph).
A set of code files that transform that data into a form usable to answer the question you have posed in your descriptive research proposal.
Your final 10 pages report (I will share a template later in the semester)
Of course, no commits after the due date will be considered in the assessment.
Templates for writing
Here, I suggest you with a few templates for writing your reports. In my experience, writing data science reports using LaTeX is an gain in productivity in the long-run.
If you want to experiment with LaTeX via overleaf, you should use this template from PNAS (Just remember to switch to a one-column template):
You can also use the Journal of Quantitative Description templates in Word Doc or LaTeX:
Or, you can use templates from Quarto and write you entire project using quarto or markdown files