list_exercise = ["Ramy", "Victorie", "Letty", "Robin", "Antoine", "Griffin"]
Week 4: Intro to Python II: Scaling up your code - Iteration, Comprehension and Functions
You made several mistakes on the first question of the problem set. Unfortunately, you did not really consider this part of the question:
Your task: Create a folder in this repository named question-01 that follows the best practices for project management and reproducibility we discussed in class.
Mistakes:
No README!!!
Several files missing.
Some are due to a lack of attention on your part
Some happened because you did not account for the .gitignore file
Several files poorly named (white spaces, no clear explanation of the file, no numbering to order the code structure)
No correct folder structure
Most of you lost a lot of points in this question. I will give you an opportunity to recover some of these points.
Until the end of the day today, you can resubmit Question 1 of the problem set. Just push again to GitHub.
Fix all the issues, and you can earn up to 10 extra points.
Tip 1: Check the example repo on your notebook to see what your response should look like.
Tip 2: Create a readme.md file explaining your repository, describing your article, folder, and all the files. The readme should be inside the main folder
Tip 3: All the files have bad names. Rename them following best practices.
I will not give you these extra points in every problem set.
Questions? Ask your TA, ask me, ask on slack.
Not clear how to answer the question? Ask your TA, ask me, ask on slack.
Read the lecture notes.
Start solving your problem set early on.
Start slow with the in-class exercise from last week.
Scaling up your python skills:
Control statements (if, for and while loops)
Functions
Intro to Python - Part II.
Importing packages in Python
List Comprehension + Generators
File Management
Data as Nested Lists
Let’s practice with lists first. One way to explore data structures is to learn their methods. Check all the methods of a list by running ‘dir()’ on a list object. Let’s explore these methods using the following list object by answering the questions below. See here for list methods:
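A minimal sketch of this kind of exploration, using the list_exercise list from these slides. The operations chosen here are illustrative and are not the original exercise questions; the extra name "Anthony" is a hypothetical value added only for the example.

```python
# The list used in the exercise.
list_exercise = ["Ramy", "Victorie", "Letty", "Robin", "Antoine", "Griffin"]

# dir() returns every attribute of the object. Dropping the names that
# start with an underscore leaves only the public list methods.
public_methods = [name for name in dir(list_exercise) if not name.startswith("_")]
print(public_methods)

# Work on a copy so the original list stays unchanged.
names = list_exercise.copy()
names.append("Anthony")          # hypothetical extra name, added at the end
names.sort()                     # sorts in place, alphabetically
position = names.index("Robin")  # position of the first match
last = names.pop()               # removes and returns the last element

print(names, position, last)
```

Note that append(), sort() and pop() change the list in place, which is why the sketch works on a copy.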
Let’s do a similar exercise with Dictionaries. Consider the dictionary below. See here for dictionary methods:
Let’s now play around with some string methods. See the string below from the book “Babel: An Arcane History”. See here for string methods:
We will go over some concepts that are very general for any programming language.
Logical Operators: to make comparisons
Control statements: to control the behavior of your code
Iterations: repeat, repeat, scale-up!
User-Defined Functions: to make code more flexible, debuggable, and readable.
Operator | Property |
---|---|
== | (value) equivalence |
< | strictly less than |
<= | less than or equal |
> | strictly greater than |
>= | greater than or equal |
!= | not equal |
is | object identity |
is not | negated object identity |
in | membership in |
not in | negated membership in |
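A short sketch of these operators in action. The variable names are made up for the example; the point to notice is the difference between == (same value) and is (same object).

```python
a = [1, 2, 3]
b = [1, 2, 3]   # same values as a, but a separate object
c = a           # a second name for the very same object

same_value = a == b               # compares contents
same_object = a is b              # compares identity
alias = a is c                    # c points to the same object as a
member = 2 in a                   # membership test
not_member = 5 not in a           # negated membership test
ordered = 1 < 2 <= 2              # comparisons can be chained
different = "python" != "Python"  # string comparison is case sensitive

print(same_value, same_object, alias, member, not_member, ordered, different)
```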
Any programming language needs statements that control the sequence of execution of a particular piece of code. We will see three main types:
Definition: Conditional execution.
if <logical statement>:
    ~~~~ CODE ~~~~
elif <logical statement>:
    ~~~~ CODE ~~~~
else:
    ~~~~ CODE ~~~~
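A concrete instance of the template above. The score and the cut-off values are hypothetical, chosen only to show how Python walks through the branches from top to bottom and runs the first one whose condition is true.

```python
score = 75  # hypothetical value

if score >= 90:
    label = "excellent"
elif score >= 70:
    label = "good"        # this branch runs: 75 is below 90 but at least 70
else:
    label = "needs work"

print(label)
```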
Definition: Taking one item at a time from a collection. We start at the beginning of the collection and move through it until we reach the end.
In python, we can iterate over:
lists
strings
dictionary items
file connections
grouped pandas df
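A minimal sketch of a for loop over three of the collections listed above: a list, a string, and the items of a dictionary. The ages dictionary is a hypothetical example.

```python
# Iterating over a list: one element per pass.
names = ["Ramy", "Letty", "Robin"]
lengths = []
for name in names:
    lengths.append(len(name))

# Iterating over a string: one character per pass.
vowels = 0
for char in "Babel":
    if char.lower() in "aeiou":
        vowels += 1

# Iterating over dictionary items: one (key, value) pair per pass.
ages = {"Ramy": 20, "Letty": 19}  # hypothetical data
sentences = []
for key, value in ages.items():
    sentences.append(f"{key} is {value}")

print(lengths, vowels, sentences)
```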
write code sequentially to solve your immediate needs
reuse this code for similar tasks.
Have very long and repetitive code
The code block above has the following elements:
In Python, you don’t need to assign the function to an object: def binds the function to its name directly
Indentation defines the block of statements that belongs to the function. It replaces the curly braces used in other languages
Scoping
lambda functions
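A small sketch that ties these ideas together: a user-defined function, scoping, and a lambda function. The function name square and the variable names are hypothetical, picked only for illustration.

```python
def square(x):
    """Return x multiplied by itself."""
    squared = x * x   # 'squared' is local: it exists only inside the function
    return squared

value = square(4)

# Scoping: the local name is not visible outside the function.
try:
    print(squared)
    visible_outside = True
except NameError:
    visible_outside = False

# A lambda is a short, anonymous function written in one expression.
add_one = lambda x: x + 1
incremented = add_one(value)

# Lambdas are often passed as arguments, here as a sorting key.
by_length = sorted(["Robin", "Ramy", "Letty"], key=lambda name: len(name))

print(value, visible_outside, incremented, by_length)
```

For anything longer than one expression, a regular def is usually easier to read and debug than a lambda.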
For the second part of this lecture, we will see:
Importing libraries in Python
Comprehension and Generators
File management in Python
Data as Nested Lists
Python allows you to import libraries in a few different ways:
The full library with the original name
The full library with an alias
Some functions from the library
All methods from the library as independent functions
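The import styles listed above can be sketched with two standard-library modules, math and statistics, chosen here only as convenient examples. The last style is shown as a comment because it fills your namespace with every name from the library and is generally discouraged.

```python
# 1. The full library with the original name.
import math

# 2. The full library with an alias.
import statistics as stats

# 3. Some functions from the library.
from math import sqrt, pi

# 4. All names from the library as independent functions
#    (left as a comment on purpose; it can silently overwrite your own names).
# from math import *

root_full = math.sqrt(16)          # call through the library name
average = stats.mean([1, 2, 3])    # call through the alias
root_direct = sqrt(16)             # call the imported function directly

print(root_full, average, root_direct, pi)
```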
Elegant and cleaner way to perform iterations. Which means: a lot of people use it!
Automatically create new objects – no need to create a container in the loop
Flexible: allows working with lists, dictionaries, and sets
Faster than loops (but not by so much that you should avoid loops)
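A minimal sketch comparing a regular loop with the equivalent comprehension, followed by a filtered list comprehension, a dictionary comprehension, and a set comprehension. The list of names is reused from the earlier exercise.

```python
names = ["Ramy", "Victorie", "Letty", "Robin", "Antoine", "Griffin"]

# With a loop we must create the container first.
upper_loop = []
for name in names:
    upper_loop.append(name.upper())

# The comprehension creates the new list automatically.
upper_comp = [name.upper() for name in names]

# A condition at the end filters the elements.
long_names = [name for name in names if len(name) > 5]

# Curly braces with key: value build a dictionary.
name_lengths = {name: len(name) for name in names}

# Curly braces with a single expression build a set (unique values only).
first_letters = {name[0] for name in names}

print(upper_comp, long_names, name_lengths, first_letters)
```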
Python has this very nice data type called generators. We use these functions a lot, but hardly talk about them.
Purpose: Generators produce a sequence of values one at a time. In other words, they allow you to create iterators in Python.
Main Advantage: you do not have to create the entire sequence at once and allocate memory for it
Lazy Evaluation: Returns one value at a time, and only when requested. It is LAZY!!! We love LAZY!
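A short sketch of lazy evaluation using a generator expression, which looks like a list comprehension with round brackets. The sizes reported by sys.getsizeof depend on your Python version, so the example only compares them rather than stating exact numbers.

```python
import sys

# A list comprehension builds all 1000 values immediately.
squares_list = [n * n for n in range(1000)]

# A generator expression builds nothing until a value is requested.
squares_gen = (n * n for n in range(1000))

first = next(squares_gen)    # computes only the first value
second = next(squares_gen)   # computes only the second value

# Consuming the generator uses up the remaining values.
remaining_total = sum(squares_gen)
exhausted = next(squares_gen, None) is None  # nothing is left afterwards

# The generator object stays small no matter how long the sequence is.
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)

print(first, second, remaining_total, exhausted, list_bytes > gen_bytes)
```

A generator can be consumed only once; build a new one if you need to iterate again.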
You can build your own generators. That’s a bit advanced, and you probably will not need them for our purposes. But we will see some pre-built “generators” that will be useful for us:
range(): generates the corresponding sequence of integers. Commonly used with for loops.
zip(): pairs up the elements of two (or more) sequences into tuples.
enumerate(): generates (index, value) tuples from a sequence.
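A minimal sketch of the three helpers. All of them are lazy, so the example wraps them in list() to display their contents. The scores list is hypothetical data.

```python
names = ["Ramy", "Letty", "Robin"]
scores = [90, 85, 77]  # hypothetical data

# range(): integers from 0 up to, but not including, the stop value.
numbers = list(range(5))

# zip(): pairs the elements of two sequences position by position.
pairs = list(zip(names, scores))

# enumerate(): attaches a running index to each element.
indexed = list(enumerate(names))

# In practice they are mostly used directly inside for loops.
report = []
for position, (name, score) in enumerate(zip(names, scores), start=1):
    report.append(f"{position}. {name}: {score}")

print(numbers, pairs, indexed, report)
```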
Most often we will use high-level functions from Pandas to load data into Python objects.
Why are we learning these tools then?
Very pythonic
No direct equivalent in R or Stata
Important when working with non-tabular data: text, JSON, images, etc.
Reading: Check Section 3.3 of Python for Data Analysis to learn more about the topics covered in the notebook.
open(): opens a connection with files on our system.
close(): closes the connection.
write(): writes to files on your system, line by line if needed.
with: a statement that wraps open and close and lets you give the connection an alias.
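A minimal sketch of these tools, first with an explicit open and close, then with the with statement. The file name example_notes.txt is hypothetical, and the file is written to a temporary folder so that the example leaves nothing behind on your system.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example_notes.txt")  # hypothetical file name

    # Explicit style: open the connection, write, then close it yourself.
    connection = open(path, "w", encoding="utf-8")
    connection.write("first line\n")
    connection.write("second line\n")
    connection.close()

    # with style: the connection gets the alias 'file' and is closed
    # automatically when the indented block ends.
    with open(path, "r", encoding="utf-8") as file:
        lines = [line.strip() for line in file]  # a file is read line by line

    closed_after_with = file.closed

print(lines, closed_after_with)
```

The with form is usually preferred because the file is closed even if an error occurs inside the block.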