PPOL 5203 Data Science I: Foundations

Collecting Digital Data - API

Tiago Ventura

Learning Goals¶

In the class today, we will learn how to collect digital data through APIs. We will focus on:

Building a solid understanding about APIs
Working with three types of APIs:
- APIs with no credentials and no wrappers
- APIs with credentials and no wrappers
- APIs with wrappers.

# setup
import requests
import os
import pandas as pd

APIs 101¶

The famous acronym API stands for “Application Programming Interface”. An API is an online server allows different applications to interact. Most often for our purposes, an API will facilitate information exchange between data users and the holders of certain data. Many companies build these repositories for various functions, including sharing data, receiving data, joint database management, and providing artificial intelligence functions or machines for public use.

Let's think of an example capable of motivating the creation of an API. Imagine you own Twitter. You would have zillions of hackers every day trying to scrape your data, this would make your website more unstable and insecure. What is a possible solution? You create an API, and you control who accesses the information, when they access it, and what type of information you make available. Another option is to close you API and restrict data access to researchers. But, if you do this, you are likely to pay a reputational cost for not being transparent, and users might leave your platform.

Have you ever watched Matrix? APIs are just like that! In the movies, Neil and others would physically connect their mindes to a super developed server and ask to learn a certain skill - kung-fu, programming, language, etc. This is exactly what an API does. You connect to the website and request data, and receive it in return. It's like sending an email, but doing everything via programming language.

API Use-Cases¶

There are two main ways in which we academics commonly use APIs.

Access data shared by Companies and NGOs.
Process our data in Algorithms developed by third parties.

Our focus will be on the first. Later, we will see how to use the ChatGPT API for text classification tasks.

APIs Components¶

An API is just an URL. See the example below:

http://mywebsite.com/endpoint?key&param_1&param_2

Main Components:

http://mywebsite/: API root. The domain of your api/
endpoint: An endpoint is a server route for retrieving specific data from an API
key: credentials that some websites ask for you to create before you can query the api.
?param_1*param_2 parameters. Those are filters that you can input in apis requests.

Requests to APIs¶

In order to work with APIs, we need tools to access the web. In Python, the most common library for making requests and working with APIs is the requests library. There are two main types of requests:

get(): to receive information from the API -- which we will use the most for web data collection
post(): to send information to the API -- think about the use of ChatGPT for classification of text.

Example 1: Open Trivia API¶

Querying an API: Step-by-Step¶

Let's start querying our first API. We will start with the simple Open Trivia API. This is a very simple API, and serves the purpose of learning all the basic steps of querying APIs. The Open Trivia API gives you ideas for your trivia games!

The Trivia API:

Does not require us to create credentials.
And does not have a Python wrapper.

When querying an API, our work will often involve the following steps:

Step 1: Look at the API documentation and endpoints, and construct a query of interest
Step 2: Use requests.get(querystring) to call the API
Step 3: Examine the response
Step 4: Extract your data and save it.

Step 1: Documentation, Endpoints and Query.¶

Before we start querying an API, we always need to read through the documentation/reference. The documentation often revel to us:

The base url for the API: https://opentdb.com/api_config.php
The different endpoints:
- This api has only one endpoint
The API parameters:
- amount
- category
- And some others we will learn.

Notice one thing here. The Trivia API requires you to gave the amount filter in your call. Not all APIs are like this. Some have a random api endpoint for you to play around

# build query
query = "https://opentdb.com/api.php?amount=1"

Step 2: Use `requests.get(querystring)` to call the API¶

To interact with the API, we will use the requests package. The requests package allow us to send a HTTP request to the API. Because we are intereste in retrieving data, we will mostly be working with the .get() method, which requires one argument — the URL we want to make the request to.

When we make a request, the response from the API comes with a response code which tells us whether our request was successful. Response codes are important because they immediately tell us if something went wrong.

To make a ‘GET’ request, we’ll use the requests.get() function, which requires one argument — the URL we want to make the request to. We’ll start by making a request to an API endpoint that doesn’t exist, so we can see what that response code looks like

# Make a get request to get the latest position of the ISS from the OpenNotify API.
response = requests.get(query)
type(response)

requests.models.Response

Step 3: Examine the response¶

When we make a request, the response from the API comes with a response code which tells us whether our request was successful. Response codes are important because they immediately tell us if something went wrong. Here is a list of response codes you can get

200 — Everything went okay, and the server returned a result (if any).

301 — The server is redirecting you to a different endpoint. This can happen when a company switches domain names, or when an endpoint's name has changed.

401 — The server thinks you're not authenticated. This happens when you don't send the right credentials to access an API.

400 — The server thinks you made a bad request. This can happen when you don't send the information that the API requires to process your request (among other things).

403 — The resource you're trying to access is forbidden, and you don't have the right permissions to see it.

404 — The server didn't find the resource you tried to access.

# check status code
status_code = response.status_code

# print status code
status_code

200

Step 4: Extract your data.¶

With an 200 code, we can access the content of the get request. The return from the API is stored as a content attribute in the response object.

print(response.content)

b'{"response_code":0,"results":[{"type":"multiple","difficulty":"medium","category":"Entertainment: Video Games","question":"By how many minutes are you late to work in &quot;Half-Life&quot;?","correct_answer":"30","incorrect_answers":["5","60","15"]}]}'

Processing JSONs¶

The deafault data type we receive from APIS are in the JSON format. This format encodes data structures like lists and dictionaries as strings to ensure that machines can read them easily.

For that kind of content, the requests library includes a specific .json() method that you can use to immediately convert the API bytes response into a Python data structure, in general a nested dictionary.

# convert the get output to a dictionary
response_dict = response.json()
print(response_dict)

{'response_code': 0, 'results': [{'type': 'multiple', 'difficulty': 'medium', 'category': 'Entertainment: Video Games', 'question': 'By how many minutes are you late to work in &quot;Half-Life&quot;?', 'correct_answer': '30', 'incorrect_answers': ['5', '60', '15']}]}

# index just like a dict
response_dict["results"][0]["question"]

'By how many minutes are you late to work in &quot;Half-Life&quot;?'

# convert to a dataframe
import pandas as pd

# need to convert to a list for weird python reasons
pd.DataFrame([response_dict["results"][0]])

Let's see the full code:

# full code
import requests
import pandas as pd

# build query
query = "https://opentdb.com/api.php?amount=1"

# 
response = requests.get(query)

# check status code
status_code = response.status_code

# move forward with code
if status_code==200:
    # convert the get output to a dictionary
    response_dict = response.json()
    # convert to a dataframe
    res = pd.DataFrame([response_dict["results"][0]])
else:
    print(status_code)
    
# print the activity
res

Exploring API Filters¶

If we look at the documentation, you see the APIs provides filters (query parameters) that allow you to refine your search.

For example, when you send a get request to the Youtube API, you are not interested in the entire Youtube data. You want data associated with certain videos, profiles, for a certain period of time, for example. These filters are often embedded as query parameters in the API call.

To add a query parameter to a given URL, you have to add a question mark (?) before the first query parameter. If you want to have multiple query parameters in your request, then you can split them with an ampersand (&)

We can add filters by:

constructing the full API call
Using dictionaries

Filter with the full API cal¶

## get only recreational activities
# build query
query = "https://opentdb.com/api.php"

# add filter
activity = "?amount=10"

# full request
url = query + activity

# Make a get request 
response = requests.get(url)

# see json
response.json()

{'response_code': 0,
 'results': [{'type': 'multiple',
   'difficulty': 'hard',
   'category': 'Entertainment: Video Games',
   'question': '&quot;Strangereal&quot; is a fictitious Earth-like world for which game series?',
   'correct_answer': 'Ace Combat',
   'incorrect_answers': ['Jet Set Radio', 'Deus Ex', 'Crimson Skies']},
  {'type': 'multiple',
   'difficulty': 'medium',
   'category': 'Entertainment: Music',
   'question': 'Who had a 1976 hit with the song &#039;You Make Me Feel Like Dancing&#039;?',
   'correct_answer': 'Leo Sayer',
   'incorrect_answers': ['Elton John', 'Billy Joel', 'Andy Gibb']},
  {'type': 'boolean',
   'difficulty': 'easy',
   'category': 'Politics',
   'question': 'In 2016, the United Kingdom voted to stay in the EU.',
   'correct_answer': 'False',
   'incorrect_answers': ['True']},
  {'type': 'multiple',
   'difficulty': 'medium',
   'category': 'Entertainment: Video Games',
   'question': 'In Terraria, what does the Wall of Flesh not drop upon defeat?',
   'correct_answer': 'Picksaw',
   'incorrect_answers': ['Pwnhammer', 'Breaker Blade', 'Laser Rifle']},
  {'type': 'multiple',
   'difficulty': 'medium',
   'category': 'Entertainment: Video Games',
   'question': 'In what year were screenshots added to Steam?',
   'correct_answer': '2011',
   'incorrect_answers': ['2010', '2008', '2009']},
  {'type': 'multiple',
   'difficulty': 'hard',
   'category': 'Entertainment: Music',
   'question': 'Which member of the Wu-Tang Clan had only one verse in their debut album Enter the Wu-Tang (36 Chambers)?',
   'correct_answer': 'Masta Killa',
   'incorrect_answers': ['Method Man', 'Inspectah Deck', 'GZA']},
  {'type': 'multiple',
   'difficulty': 'easy',
   'category': 'Entertainment: Music',
   'question': 'What was Rage Against the Machine&#039;s debut album?',
   'correct_answer': 'Rage Against the Machine',
   'incorrect_answers': ['Evil Empire',
    'Bombtrack',
    'The Battle Of Los Angeles']},
  {'type': 'boolean',
   'difficulty': 'medium',
   'category': 'Entertainment: Video Games',
   'question': 'The Fiat Multipla is a drivable car in &quot;Forza Horizon 3&quot;.',
   'correct_answer': 'False',
   'incorrect_answers': ['True']},
  {'type': 'multiple',
   'difficulty': 'easy',
   'category': 'Science &amp; Nature',
   'question': 'What is the largest animal currently on Earth?',
   'correct_answer': 'Blue Whale',
   'incorrect_answers': ['Orca', 'Colossal Squid', 'Giraffe']},
  {'type': 'multiple',
   'difficulty': 'easy',
   'category': 'Sports',
   'question': 'What team won the 2016 MLS Cup?',
   'correct_answer': 'Seattle Sounders',
   'incorrect_answers': ['Colorado Rapids', 'Toronto FC', 'Montreal Impact']}]}

Or using dictionaries¶

## get only recreational activities
# build query
query = "https://opentdb.com/api.php"

# add filter
parameters = {"amount": "10", 
             "category":"9"}

# Make a get request to get 
response = requests.get(query, params=parameters)

# see json
print(response.status_code)
response.json()

200

{'response_code': 0,
 'results': [{'type': 'multiple',
   'difficulty': 'medium',
   'category': 'General Knowledge',
   'question': 'What is the highest number of Michelin stars a restaurant can receive?',
   'correct_answer': 'Three',
   'incorrect_answers': ['Four', 'Five', 'Six']},
  {'type': 'multiple',
   'difficulty': 'medium',
   'category': 'General Knowledge',
   'question': 'When was Nintendo founded?',
   'correct_answer': 'September 23rd, 1889',
   'incorrect_answers': ['October 19th, 1891',
    'March 4th, 1887',
    'December 27th, 1894']},
  {'type': 'multiple',
   'difficulty': 'easy',
   'category': 'General Knowledge',
   'question': 'Which of the following blood component forms a plug at the site of injuries?',
   'correct_answer': 'Platelets',
   'incorrect_answers': ['Red blood cells',
    'White blood cells',
    'Blood plasma']},
  {'type': 'multiple',
   'difficulty': 'medium',
   'category': 'General Knowledge',
   'question': 'On average, Americans consume 100 pounds of what per second?',
   'correct_answer': 'Chocolate',
   'incorrect_answers': ['Potatoes', 'Donuts', 'Cocaine']},
  {'type': 'multiple',
   'difficulty': 'medium',
   'category': 'General Knowledge',
   'question': 'What is the romanized Japanese word for &quot;university&quot;?',
   'correct_answer': 'Daigaku',
   'incorrect_answers': ['Toshokan', 'Jimusho', 'Shokudou']},
  {'type': 'multiple',
   'difficulty': 'medium',
   'category': 'General Knowledge',
   'question': 'Which river flows through the Scottish city of Glasgow?',
   'correct_answer': 'Clyde',
   'incorrect_answers': ['Tay', 'Dee', 'Tweed']},
  {'type': 'multiple',
   'difficulty': 'easy',
   'category': 'General Knowledge',
   'question': 'What machine element is located in the center of fidget spinners?',
   'correct_answer': 'Bearings',
   'incorrect_answers': ['Axles', 'Gears', 'Belts']},
  {'type': 'multiple',
   'difficulty': 'easy',
   'category': 'General Knowledge',
   'question': 'What is the official language of Brazil?',
   'correct_answer': 'Portugese',
   'incorrect_answers': ['Brazilian', 'Spanish', 'English']},
  {'type': 'multiple',
   'difficulty': 'medium',
   'category': 'General Knowledge',
   'question': 'Which Italian automobile manufacturer gained majority control of U.S. automobile manufacturer Chrysler in 2011?',
   'correct_answer': 'Fiat',
   'incorrect_answers': ['Maserati', 'Alfa Romeo', 'Ferrari']},
  {'type': 'multiple',
   'difficulty': 'easy',
   'category': 'General Knowledge',
   'question': 'What zodiac sign is represented by a pair of scales?',
   'correct_answer': 'Libra',
   'incorrect_answers': ['Aries', 'Capricorn', 'Sagittarius']}]}

See... it is the same url..

response.url

'https://opentdb.com/api.php?amount=10&category=9'

pd.DataFrame(response.json()["results"])

Practice:¶

Find an simple API without authentication, and write code to get a successfull request from the API. Play around with different endpoints, filter parameters, and other options from the APIs. Be ready to present your results in class.

Here are some examples of APIs you might consider:

Numbers API: http://numbersapi.com/#42
Bored API: https://bored-api.appbrewery.com/
Dog Photo API: https://dog.ceo/dog-api/
Cat Facts API: https://catfact.ninja/
Quotes API: https://www.api-ninjas.com/api/quotes

Example 2: Yelp API.¶

Let's transition now to a more complex, and with interesting data, API. We will work with the Yelp API.

This API:

Requires us to get credentials
But does not have a wrapper to query the daya (that I know of).

See the documentation for the API here. The API has some interesting endpoints, for example:

/businesses/search - Search for businesses by keyword, category, location, price level, etc.
/businesses/{id} - Get rich business data, such as name, address, phone number, photos, Yelp rating, price levels and hours of operation.
/businesses/{business_id_or_alias}/reviews - Get up to three review excerpts for a business.
Among many other endpoints

Authentication with an API¶

Most often, the provider of an API will require you to authenticate before you can get some data. Authentication usually occures through an access token you can generate directly from the API. Depending on the type of authentication each API have in place, it can be a simple token (string) or multiple different ids (Client ID, Access Token, Client Token..)

Keep in mind that using a token is better than using a username and password for a few reasons:

Typically, you'll be accessing an API from a script. If you put your username and password in the script and someone finds it, they can take over your account.
Access tokens can have scopes and specific permissions.

To authorize your access, you need to add the token to your API call. Often, you do this by passing the token through an authorization header. We can use Python's requests library to make a dictionary of headers, and then pass it into our request.

Acquiring credentials with Yelp Fusion API¶

Information about acquiring your credentials to make API call are often displayed in the API documentation.

Here it is Yelp's information

Every API has a bit of a distinct process. In general, APIs require you to create an app to access the API. This is a bit of a weird terminology. The assumption here is that you are creating an app (think about the Botometer at Twitter) that will query the API many times.

For the YELP API, after you create the app, you will get an Client ID and an API KEY

How to save the API keys/token?¶

API keys are personal information. Keep yours safe, and do not paste into your code.

Don't do this:

api_key = "my_key"

Do this:

create a file with your keys and save as .env
Add your keys there
load them in your environment when running the APIs.
And never upload your .env file in a public server (like github)

I will show you in class what a .env file looks like.

Querying the API¶

We repeat the same steps as before, but adding an authentication step.

Step 0: Load your API Keys
Step 1: Look at the API documentation and endpoints, and construct a query of interest
Step 2: Use requests.get(querystring) to call the API
Step 3: Examine the response
Step 4: Extract your data and save it.

Step 0: Load your API Keys¶

# load library to get environmental files
import os
from dotenv import load_dotenv


# load keys from  environmental var
load_dotenv() # .env file in cwd
yelp_client = os.environ.get("yelp_client_id") 
yelp_key = os.environ.get("yelp_api_key")

# OR JUST HARD CODE YOUR API KEY HERE. NOT A GREAT PRACTICE!!!
#yelp_key = "ADD_YOUR_KEY_HERE"

# save your token in the header of the call
header = {'Authorization': f'Bearer {yelp_key}'}

# see here
header["Authorization"][0:50]

'Bearer syM5u9r4OFOcdp-ApFx8wD6GEDKaG97kUs9xiO9jQSt'

Step 1: Look at the API documentation and endpoints, and construct a query of interest¶

We will query the /businesses/search endpoint. Let's check together the documentation here: https://docs.developer.yelp.com/reference/v3_business_search

We will use two parameters:

location: This string indicates the geographic area to be used when searching for businesses
term: Search term, e.g. "food" or "restaurants".

# endpoint
endpoint = "https://api.yelp.com/v3/businesses/search"

# Add as parameters
params ={"location":" Washington, DC 20057",
        "term":"best noodles restaurant"}

Step 2: Use requests.get(endpoint) to call the API¶

# Make a get request with header + parameters
response = requests.get(endpoint, 
                        headers=header,
                        params=params)

Step 3: Examine the response¶

Let's check the response code

# looking for a 200
response.status_code

200

Step 4: Extract your data and save it.¶

# What does the response look like?
yelp_json = response.json()

# print
print(yelp_json)

{'businesses': [{'id': 'XKOVFGUCK1e0vZBML4ddxw', 'alias': 'donsak-thai-restaurant-washington', 'name': 'Donsak Thai Restaurant ', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/btbTD5jafxl-eyTstndhSQ/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/donsak-thai-restaurant-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 97, 'categories': [{'alias': 'thai', 'title': 'Thai'}], 'rating': 4.4, 'coordinates': {'latitude': 38.92398, 'longitude': -77.05212}, 'transactions': ['delivery', 'pickup'], 'location': {'address1': '2608 Connecticut Ave NW', 'address2': '', 'address3': None, 'city': 'Washington, DC', 'zip_code': '20008', 'country': 'US', 'state': 'DC', 'display_address': ['2608 Connecticut Ave NW', 'Washington, DC 20008']}, 'phone': '+12025078207', 'display_phone': '(202) 507-8207', 'distance': 2610.575512747587}, {'id': 'SPWt2Gqb2-alIq78YINs-w', 'alias': 'toryumon-japanese-house-arlington-2', 'name': 'Toryumon Japanese House', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/XAoGmtZrTZpIabxbIxnzxQ/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/toryumon-japanese-house-arlington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 234, 'categories': [{'alias': 'sushi', 'title': 'Sushi Bars'}, {'alias': 'ramen', 'title': 'Ramen'}, {'alias': 'asianfusion', 'title': 'Asian Fusion'}], 'rating': 4.3, 'coordinates': {'latitude': 38.89405977663849, 'longitude': -77.07763317257343}, 'transactions': ['restaurant_reservation', 'delivery', 'pickup'], 'price': '$$', 'location': {'address1': '1650 Wilson Blvd', 'address2': 'Ste 100B', 'address3': '', 'city': 'Arlington', 'zip_code': '22209', 'country': 'US', 'state': 'VA', 'display_address': ['1650 Wilson Blvd', 'Ste 100B', 'Arlington, VA 22209']}, 'phone': '+15713571537', 'display_phone': '(571) 357-1537', 'distance': 1700.4810849971807}, {'id': 'eV_87BqGbpvTqUwjOgQO5g', 'alias': 'reren-washington-3', 'name': 'Reren', 'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/PxQkgCY0DJG8ymZJgRa1Aw/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/reren-washington-3?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 132, 'categories': [{'alias': 'asianfusion', 'title': 'Asian Fusion'}, {'alias': 'ramen', 'title': 'Ramen'}, {'alias': 'nightlife', 'title': 'Nightlife'}], 'rating': 3.9, 'coordinates': {'latitude': 38.90468, 'longitude': -77.06262}, 'transactions': ['delivery', 'pickup'], 'price': '$$', 'location': {'address1': '1073 Wisconsin Ave NW', 'address2': '2 floor', 'address3': '', 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['1073 Wisconsin Ave NW', '2 floor', 'Washington, DC 20007']}, 'phone': '+12028044962', 'display_phone': '(202) 804-4962', 'distance': 1231.3769005643042}, {'id': 'DB9hhm2cB9Iu88RQw6aqCQ', 'alias': 'thai-and-time-again-washington-2', 'name': 'Thai And Time Again', 'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/ESUUM3pXZuOxljGo_YjMHA/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/thai-and-time-again-washington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 50, 'categories': [{'alias': 'thai', 'title': 'Thai'}, {'alias': 'noodles', 'title': 'Noodles'}, {'alias': 'soup', 'title': 'Soup'}], 'rating': 4.6, 'coordinates': {'latitude': 38.92376109199186, 'longitude': -77.05098199999999}, 'transactions': ['delivery', 'pickup'], 'location': {'address1': '2311 Calvert St NW', 'address2': None, 'address3': '', 'city': 'Washington, DC', 'zip_code': '20748', 'country': 'US', 'state': 'DC', 'display_address': ['2311 Calvert St NW', 'Washington, DC 20748']}, 'phone': '+12025061076', 'display_phone': '(202) 506-1076', 'distance': 2668.901780760328}, {'id': 'RiwIUBITUfhn6etk6qVbnQ', 'alias': 'eerkins-uyghur-cuisine-washington-2', 'name': "Eerkin's Uyghur Cuisine", 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/GbQu1AtPp5XrzNrKMDgMXg/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/eerkins-uyghur-cuisine-washington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 197, 'categories': [{'alias': 'noodles', 'title': 'Noodles'}, {'alias': 'kebab', 'title': 'Kebab'}, {'alias': 'tea', 'title': 'Tea Rooms'}], 'rating': 4.3, 'coordinates': {'latitude': 38.921397, 'longitude': -77.072513}, 'transactions': ['delivery', 'pickup'], 'price': '$$', 'location': {'address1': '2412 Wisconsin Ave NW', 'address2': '', 'address3': None, 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['2412 Wisconsin Ave NW', 'Washington, DC 20007']}, 'phone': '+12023333600', 'display_phone': '(202) 333-3600', 'distance': 1372.54495989159}, {'id': 'QanUICteMAzlK7jVADa1JA', 'alias': 'oki-bowl-at-georgetown-washington-2', 'name': 'OKI bowl at Georgetown', 'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/2AaW1GnOKWoEOB0v_5F_4Q/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/oki-bowl-at-georgetown-washington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 290, 'categories': [{'alias': 'ramen', 'title': 'Ramen'}], 'rating': 3.9, 'coordinates': {'latitude': 38.91107, 'longitude': -77.06552}, 'transactions': ['delivery', 'pickup'], 'price': '$$', 'location': {'address1': '1608 Wisconsin Ave NW', 'address2': '', 'address3': None, 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['1608 Wisconsin Ave NW', 'Washington, DC 20007']}, 'phone': '+12029448660', 'display_phone': '(202) 944-8660', 'distance': 893.7070290494285}, {'id': 'M9MhHHorL39Gf-Mz7N9oEA', 'alias': 'the-happy-eatery-arlington', 'name': 'The Happy Eatery', 'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/G3Q8DWhmqL70_1zy3y9ADg/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/the-happy-eatery-arlington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 250, 'categories': [{'alias': 'vietnamese', 'title': 'Vietnamese'}, {'alias': 'coffee', 'title': 'Coffee & Tea'}, {'alias': 'asianfusion', 'title': 'Asian Fusion'}], 'rating': 4.1, 'coordinates': {'latitude': 38.896769, 'longitude': -77.071233}, 'transactions': ['delivery'], 'price': '$$', 'location': {'address1': '1800 N Lynn St', 'address2': None, 'address3': '', 'city': 'Arlington', 'zip_code': '22209', 'country': 'US', 'state': 'VA', 'display_address': ['1800 N Lynn St', 'Arlington, VA 22209']}, 'phone': '+15718001881', 'display_phone': '(571) 800-1881', 'distance': 1454.6734150230877}, {'id': 'iHhrBAMa833_hkYfZ5fDoQ', 'alias': 'simply-banh-mi-washington-6', 'name': 'Simply Banh Mi', 'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/YPC8j-yKaqSQcveM54qcng/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/simply-banh-mi-washington-6?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 657, 'categories': [{'alias': 'vietnamese', 'title': 'Vietnamese'}, {'alias': 'halal', 'title': 'Halal'}, {'alias': 'soup', 'title': 'Soup'}], 'rating': 4.5, 'coordinates': {'latitude': 38.9114651, 'longitude': -77.0656503171576}, 'transactions': ['delivery', 'pickup'], 'price': '$', 'location': {'address1': '1624 Wisconsin Ave NW', 'address2': '', 'address3': '', 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['1624 Wisconsin Ave NW', 'Washington, DC 20007']}, 'phone': '+12023335726', 'display_phone': '(202) 333-5726', 'distance': 892.5436374838213}, {'id': 'MmgIn8Ufynn8v6lzgLrX-A', 'alias': 'han-palace-washington-3', 'name': 'Han Palace', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/Cn_fDw90aeqRCRkaivxd3w/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/han-palace-washington-3?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 28, 'categories': [{'alias': 'dimsum', 'title': 'Dim Sum'}, {'alias': 'cantonese', 'title': 'Cantonese'}, {'alias': 'buffets', 'title': 'Buffets'}], 'rating': 3.6, 'coordinates': {'latitude': 38.914425669693635, 'longitude': -77.06730374662718}, 'transactions': ['delivery', 'pickup'], 'location': {'address1': '1728 Wisconsin Ave NW', 'address2': '', 'address3': None, 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['1728 Wisconsin Ave NW', 'Washington, DC 20007']}, 'phone': '+12023556725', 'display_phone': '(202) 355-6725', 'distance': 916.8751740548753}, {'id': 'ws2OEuBG41rB5LOq_OE_qg', 'alias': 'kusshi-glover-park-washington', 'name': 'Kusshi - Glover Park', 'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/LVjCRHER9ye-tlr31F8Z1Q/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/kusshi-glover-park-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 13, 'categories': [{'alias': 'sushi', 'title': 'Sushi Bars'}, {'alias': 'japanese', 'title': 'Japanese'}], 'rating': 4.5, 'coordinates': {'latitude': 38.920617, 'longitude': -77.071394}, 'transactions': ['delivery', 'pickup'], 'location': {'address1': '2309 Wisconsin Ave', 'address2': None, 'address3': '', 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['2309 Wisconsin Ave', 'Washington, DC 20007']}, 'phone': '+12023333986', 'display_phone': '(202) 333-3986', 'distance': 1312.4811314043586}, {'id': '7cGsqMdKLNV-eXItawA6mA', 'alias': 'reren-lamen-n-bar-washington', 'name': 'Reren Lamen n Bar', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/PG686Kb10oKrZKdUhcKLPg/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/reren-lamen-n-bar-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 45, 'categories': [{'alias': 'ramen', 'title': 'Ramen'}, {'alias': 'bars', 'title': 'Bars'}, {'alias': 'noodles', 'title': 'Noodles'}], 'rating': 3.8, 'coordinates': {'latitude': 38.90468, 'longitude': -77.06241}, 'transactions': [], 'location': {'address1': '1073 Wisconsin Ave NW', 'address2': '', 'address3': None, 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['1073 Wisconsin Ave NW', 'Washington, DC 20007']}, 'phone': '+12024506654', 'display_phone': '(202) 450-6654', 'distance': 1240.359423212593}, {'id': 'dIMNqrq_vOHpLcOdcPCVpA', 'alias': 'saigon-noodles-and-grill-arlington-2', 'name': 'Saigon Noodles & Grill', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/OiJgHDhsBdSxEr6HyLXiYA/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/saigon-noodles-and-grill-arlington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 193, 'categories': [{'alias': 'vietnamese', 'title': 'Vietnamese'}], 'rating': 3.8, 'coordinates': {'latitude': 38.89396413507029, 'longitude': -77.07882922883567}, 'transactions': ['delivery', 'pickup'], 'price': '$$', 'location': {'address1': '1800 Wilson Blvd', 'address2': '', 'address3': None, 'city': 'Arlington', 'zip_code': '22201', 'country': 'US', 'state': 'VA', 'display_address': ['1800 Wilson Blvd', 'Arlington, VA 22201']}, 'phone': '+17035665940', 'display_phone': '(703) 566-5940', 'distance': 1724.9207281452616}, {'id': 'D14Ucgt6SytND1-YwoQVZg', 'alias': 'seoulspice-arlington', 'name': 'SeoulSpice', 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/0R0uMJAZfoNMUKpY2vm4-w/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/seoulspice-arlington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 64, 'categories': [{'alias': 'korean', 'title': 'Korean'}, {'alias': 'gluten_free', 'title': 'Gluten-Free'}, {'alias': 'comfortfood', 'title': 'Comfort Food'}], 'rating': 4.4, 'coordinates': {'latitude': 38.89580354043643, 'longitude': -77.07056576100732}, 'transactions': ['delivery', 'pickup'], 'location': {'address1': '1735 N Lynn St', 'address2': 'Ste 106', 'address3': '', 'city': 'Arlington', 'zip_code': '22209', 'country': 'US', 'state': 'VA', 'display_address': ['1735 N Lynn St', 'Ste 106', 'Arlington, VA 22209']}, 'phone': '+17034195868', 'display_phone': '(703) 419-5868', 'distance': 1558.6175101395754}, {'id': 'gwq-QIb-gxNRVAVRuRhLAQ', 'alias': 'billy-hicks-washington', 'name': 'Billy Hicks', 'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/wcoLkGq_JT-90uRTwmTN6A/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/billy-hicks-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 4, 'categories': [{'alias': 'tradamerican', 'title': 'American'}, {'alias': 'coffee', 'title': 'Coffee & Tea'}, {'alias': 'beverage_stores', 'title': 'Beverage Store'}], 'rating': 4.5, 'coordinates': {'latitude': 38.9053016170736, 'longitude': -77.0653851}, 'transactions': ['delivery', 'pickup'], 'location': {'address1': '3277 M St NW', 'address2': '', 'address3': None, 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['3277 M St NW', 'Washington, DC 20007']}, 'phone': '+12027925757', 'display_phone': '(202) 792-5757', 'distance': 985.6443541150337}, {'id': 'xmVrPFMaJ5ko3o1wxjnIZg', 'alias': 'han-palace-washington-2', 'name': 'Han Palace', 'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/HCj0TSI6avxGk7wVfnp26A/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/han-palace-washington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 295, 'categories': [{'alias': 'dimsum', 'title': 'Dim Sum'}, {'alias': 'cantonese', 'title': 'Cantonese'}], 'rating': 3.9, 'coordinates': {'latitude': 38.92497, 'longitude': -77.05199}, 'transactions': ['pickup'], 'price': '$$', 'location': {'address1': '2649 Connecticut Ave NW', 'address2': '', 'address3': None, 'city': 'Washington, DC', 'zip_code': '20008', 'country': 'US', 'state': 'DC', 'display_address': ['2649 Connecticut Ave NW', 'Washington, DC 20008']}, 'phone': '+12029690018', 'display_phone': '(202) 969-0018', 'distance': 2687.3966671698704}, {'id': 'Ek_-kvajIvVJbi3ll4pMww', 'alias': 'pho-75-arlington', 'name': 'Pho 75', 'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/g-xbssrgCv1z3zCDRTkKPQ/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/pho-75-arlington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 2098, 'categories': [{'alias': 'vietnamese', 'title': 'Vietnamese'}, {'alias': 'noodles', 'title': 'Noodles'}], 'rating': 4.1, 'coordinates': {'latitude': 38.8941969403826, 'longitude': -77.0788539337479}, 'transactions': ['delivery'], 'price': '$$', 'location': {'address1': '1721 Wilson Blvd', 'address2': None, 'address3': '', 'city': 'Arlington', 'zip_code': '22209', 'country': 'US', 'state': 'VA', 'display_address': ['1721 Wilson Blvd', 'Arlington, VA 22209']}, 'phone': '+17035257355', 'display_phone': '(703) 525-7355', 'distance': 1699.7430086974475}, {'id': 'nIgDD3Gtq8Ya1ddJ1w1IMQ', 'alias': 'tnr-cafe-arlington-6', 'name': 'TNR Cafe', 'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/NWv_PVBFU8vCQZvCqmBwyw/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/tnr-cafe-arlington-6?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 838, 'categories': [{'alias': 'chinese', 'title': 'Chinese'}, {'alias': 'coffee', 'title': 'Coffee & Tea'}], 'rating': 3.8, 'coordinates': {'latitude': 38.89186, 'longitude': -77.08403}, 'transactions': ['delivery', 'pickup'], 'price': '$$', 'location': {'address1': '2049 Wilson Blvd', 'address2': None, 'address3': '', 'city': 'Arlington', 'zip_code': '22201', 'country': 'US', 'state': 'VA', 'display_address': ['2049 Wilson Blvd', 'Arlington, VA 22201']}, 'phone': '+17038750428', 'display_phone': '(703) 875-0428', 'distance': 2060.3458473057403}, {'id': 'vzvuB2t9-t3kqtINjjhPig', 'alias': 'buuz-thai-eatery-arlington', 'name': 'Buuz Thai Eatery', 'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/PoCArXAH2F-55o2pq5iNzA/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/buuz-thai-eatery-arlington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 294, 'categories': [{'alias': 'thai', 'title': 'Thai'}, {'alias': 'mongolian', 'title': 'Mongolian'}], 'rating': 3.8, 'coordinates': {'latitude': 38.8926182923642, 'longitude': -77.0813833379928}, 'transactions': ['delivery', 'pickup'], 'price': '$$', 'location': {'address1': '1926 Wilson Blvd', 'address2': None, 'address3': '', 'city': 'Arlington', 'zip_code': '22201', 'country': 'US', 'state': 'VA', 'display_address': ['1926 Wilson Blvd', 'Arlington, VA 22201']}, 'phone': '+17038753999', 'display_phone': '(703) 875-3999', 'distance': 1918.4698575998193}, {'id': 'qGO0rs-uNANLexpMZldTtA', 'alias': 'momos-cafe-washington', 'name': "Momo's Cafe", 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/RWajPcG_NmcJ_PbQiCSCzQ/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/momos-cafe-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 201, 'categories': [{'alias': 'asianfusion', 'title': 'Asian Fusion'}, {'alias': 'coffee', 'title': 'Coffee & Tea'}, {'alias': 'taiwanese', 'title': 'Taiwanese'}], 'rating': 4.1, 'coordinates': {'latitude': 38.91684, 'longitude': -77.0965}, 'transactions': ['delivery', 'pickup'], 'price': '$$', 'location': {'address1': '4828 MacArthur Blvd NW', 'address2': None, 'address3': '', 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['4828 MacArthur Blvd NW', 'Washington, DC 20007']}, 'phone': '+12023333675', 'display_phone': '(202) 333-3675', 'distance': 1996.4432142570759}, {'id': '6MOlZLcCA22fhbGecugyhA', 'alias': 'kappo-dc-washington', 'name': 'Kappo DC', 'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/JzUKbeameHC4ryn8qaW0tw/o.jpg', 'is_closed': False, 'url': 'https://www.yelp.com/biz/kappo-dc-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg', 'review_count': 88, 'categories': [{'alias': 'japanese', 'title': 'Japanese'}], 'rating': 4.5, 'coordinates': {'latitude': 38.91674042257224, 'longitude': -77.0963584}, 'transactions': [], 'price': '$$$$', 'location': {'address1': '4822 MacArthur Blvd NW', 'address2': '', 'address3': None, 'city': 'Washington, DC', 'zip_code': '20007', 'country': 'US', 'state': 'DC', 'display_address': ['4822 MacArthur Blvd NW', 'Washington, DC 20007']}, 'phone': '+12028859086', 'display_phone': '(202) 885-9086', 'distance': 1981.1053054242918}], 'total': 117, 'region': {'center': {'longitude': -77.07557201385498, 'latitude': 38.909268231541304}}}

yelp_json.keys()

dict_keys(['businesses', 'total', 'region'])

yelp_json["businesses"]

[{'id': 'XKOVFGUCK1e0vZBML4ddxw',
  'alias': 'donsak-thai-restaurant-washington',
  'name': 'Donsak Thai Restaurant ',
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/btbTD5jafxl-eyTstndhSQ/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/donsak-thai-restaurant-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 97,
  'categories': [{'alias': 'thai', 'title': 'Thai'}],
  'rating': 4.4,
  'coordinates': {'latitude': 38.92398, 'longitude': -77.05212},
  'transactions': ['delivery', 'pickup'],
  'location': {'address1': '2608 Connecticut Ave NW',
   'address2': '',
   'address3': None,
   'city': 'Washington, DC',
   'zip_code': '20008',
   'country': 'US',
   'state': 'DC',
   'display_address': ['2608 Connecticut Ave NW', 'Washington, DC 20008']},
  'phone': '+12025078207',
  'display_phone': '(202) 507-8207',
  'distance': 2610.575512747587},
 {'id': 'SPWt2Gqb2-alIq78YINs-w',
  'alias': 'toryumon-japanese-house-arlington-2',
  'name': 'Toryumon Japanese House',
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/XAoGmtZrTZpIabxbIxnzxQ/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/toryumon-japanese-house-arlington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 234,
  'categories': [{'alias': 'sushi', 'title': 'Sushi Bars'},
   {'alias': 'ramen', 'title': 'Ramen'},
   {'alias': 'asianfusion', 'title': 'Asian Fusion'}],
  'rating': 4.3,
  'coordinates': {'latitude': 38.89405977663849,
   'longitude': -77.07763317257343},
  'transactions': ['restaurant_reservation', 'delivery', 'pickup'],
  'price': '$$',
  'location': {'address1': '1650 Wilson Blvd',
   'address2': 'Ste 100B',
   'address3': '',
   'city': 'Arlington',
   'zip_code': '22209',
   'country': 'US',
   'state': 'VA',
   'display_address': ['1650 Wilson Blvd', 'Ste 100B', 'Arlington, VA 22209']},
  'phone': '+15713571537',
  'display_phone': '(571) 357-1537',
  'distance': 1700.4810849971807},
 {'id': 'eV_87BqGbpvTqUwjOgQO5g',
  'alias': 'reren-washington-3',
  'name': 'Reren',
  'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/PxQkgCY0DJG8ymZJgRa1Aw/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/reren-washington-3?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 132,
  'categories': [{'alias': 'asianfusion', 'title': 'Asian Fusion'},
   {'alias': 'ramen', 'title': 'Ramen'},
   {'alias': 'nightlife', 'title': 'Nightlife'}],
  'rating': 3.9,
  'coordinates': {'latitude': 38.90468, 'longitude': -77.06262},
  'transactions': ['delivery', 'pickup'],
  'price': '$$',
  'location': {'address1': '1073 Wisconsin Ave NW',
   'address2': '2 floor',
   'address3': '',
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['1073 Wisconsin Ave NW',
    '2 floor',
    'Washington, DC 20007']},
  'phone': '+12028044962',
  'display_phone': '(202) 804-4962',
  'distance': 1231.3769005643042},
 {'id': 'DB9hhm2cB9Iu88RQw6aqCQ',
  'alias': 'thai-and-time-again-washington-2',
  'name': 'Thai And Time Again',
  'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/ESUUM3pXZuOxljGo_YjMHA/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/thai-and-time-again-washington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 50,
  'categories': [{'alias': 'thai', 'title': 'Thai'},
   {'alias': 'noodles', 'title': 'Noodles'},
   {'alias': 'soup', 'title': 'Soup'}],
  'rating': 4.6,
  'coordinates': {'latitude': 38.92376109199186,
   'longitude': -77.05098199999999},
  'transactions': ['delivery', 'pickup'],
  'location': {'address1': '2311 Calvert St NW',
   'address2': None,
   'address3': '',
   'city': 'Washington, DC',
   'zip_code': '20748',
   'country': 'US',
   'state': 'DC',
   'display_address': ['2311 Calvert St NW', 'Washington, DC 20748']},
  'phone': '+12025061076',
  'display_phone': '(202) 506-1076',
  'distance': 2668.901780760328},
 {'id': 'RiwIUBITUfhn6etk6qVbnQ',
  'alias': 'eerkins-uyghur-cuisine-washington-2',
  'name': "Eerkin's Uyghur Cuisine",
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/GbQu1AtPp5XrzNrKMDgMXg/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/eerkins-uyghur-cuisine-washington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 197,
  'categories': [{'alias': 'noodles', 'title': 'Noodles'},
   {'alias': 'kebab', 'title': 'Kebab'},
   {'alias': 'tea', 'title': 'Tea Rooms'}],
  'rating': 4.3,
  'coordinates': {'latitude': 38.921397, 'longitude': -77.072513},
  'transactions': ['delivery', 'pickup'],
  'price': '$$',
  'location': {'address1': '2412 Wisconsin Ave NW',
   'address2': '',
   'address3': None,
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['2412 Wisconsin Ave NW', 'Washington, DC 20007']},
  'phone': '+12023333600',
  'display_phone': '(202) 333-3600',
  'distance': 1372.54495989159},
 {'id': 'QanUICteMAzlK7jVADa1JA',
  'alias': 'oki-bowl-at-georgetown-washington-2',
  'name': 'OKI bowl at Georgetown',
  'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/2AaW1GnOKWoEOB0v_5F_4Q/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/oki-bowl-at-georgetown-washington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 290,
  'categories': [{'alias': 'ramen', 'title': 'Ramen'}],
  'rating': 3.9,
  'coordinates': {'latitude': 38.91107, 'longitude': -77.06552},
  'transactions': ['delivery', 'pickup'],
  'price': '$$',
  'location': {'address1': '1608 Wisconsin Ave NW',
   'address2': '',
   'address3': None,
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['1608 Wisconsin Ave NW', 'Washington, DC 20007']},
  'phone': '+12029448660',
  'display_phone': '(202) 944-8660',
  'distance': 893.7070290494285},
 {'id': 'M9MhHHorL39Gf-Mz7N9oEA',
  'alias': 'the-happy-eatery-arlington',
  'name': 'The Happy Eatery',
  'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/G3Q8DWhmqL70_1zy3y9ADg/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/the-happy-eatery-arlington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 250,
  'categories': [{'alias': 'vietnamese', 'title': 'Vietnamese'},
   {'alias': 'coffee', 'title': 'Coffee & Tea'},
   {'alias': 'asianfusion', 'title': 'Asian Fusion'}],
  'rating': 4.1,
  'coordinates': {'latitude': 38.896769, 'longitude': -77.071233},
  'transactions': ['delivery'],
  'price': '$$',
  'location': {'address1': '1800 N Lynn St',
   'address2': None,
   'address3': '',
   'city': 'Arlington',
   'zip_code': '22209',
   'country': 'US',
   'state': 'VA',
   'display_address': ['1800 N Lynn St', 'Arlington, VA 22209']},
  'phone': '+15718001881',
  'display_phone': '(571) 800-1881',
  'distance': 1454.6734150230877},
 {'id': 'iHhrBAMa833_hkYfZ5fDoQ',
  'alias': 'simply-banh-mi-washington-6',
  'name': 'Simply Banh Mi',
  'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/YPC8j-yKaqSQcveM54qcng/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/simply-banh-mi-washington-6?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 657,
  'categories': [{'alias': 'vietnamese', 'title': 'Vietnamese'},
   {'alias': 'halal', 'title': 'Halal'},
   {'alias': 'soup', 'title': 'Soup'}],
  'rating': 4.5,
  'coordinates': {'latitude': 38.9114651, 'longitude': -77.0656503171576},
  'transactions': ['delivery', 'pickup'],
  'price': '$',
  'location': {'address1': '1624 Wisconsin Ave NW',
   'address2': '',
   'address3': '',
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['1624 Wisconsin Ave NW', 'Washington, DC 20007']},
  'phone': '+12023335726',
  'display_phone': '(202) 333-5726',
  'distance': 892.5436374838213},
 {'id': 'MmgIn8Ufynn8v6lzgLrX-A',
  'alias': 'han-palace-washington-3',
  'name': 'Han Palace',
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/Cn_fDw90aeqRCRkaivxd3w/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/han-palace-washington-3?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 28,
  'categories': [{'alias': 'dimsum', 'title': 'Dim Sum'},
   {'alias': 'cantonese', 'title': 'Cantonese'},
   {'alias': 'buffets', 'title': 'Buffets'}],
  'rating': 3.6,
  'coordinates': {'latitude': 38.914425669693635,
   'longitude': -77.06730374662718},
  'transactions': ['delivery', 'pickup'],
  'location': {'address1': '1728 Wisconsin Ave NW',
   'address2': '',
   'address3': None,
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['1728 Wisconsin Ave NW', 'Washington, DC 20007']},
  'phone': '+12023556725',
  'display_phone': '(202) 355-6725',
  'distance': 916.8751740548753},
 {'id': 'ws2OEuBG41rB5LOq_OE_qg',
  'alias': 'kusshi-glover-park-washington',
  'name': 'Kusshi - Glover Park',
  'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/LVjCRHER9ye-tlr31F8Z1Q/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/kusshi-glover-park-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 13,
  'categories': [{'alias': 'sushi', 'title': 'Sushi Bars'},
   {'alias': 'japanese', 'title': 'Japanese'}],
  'rating': 4.5,
  'coordinates': {'latitude': 38.920617, 'longitude': -77.071394},
  'transactions': ['delivery', 'pickup'],
  'location': {'address1': '2309 Wisconsin Ave',
   'address2': None,
   'address3': '',
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['2309 Wisconsin Ave', 'Washington, DC 20007']},
  'phone': '+12023333986',
  'display_phone': '(202) 333-3986',
  'distance': 1312.4811314043586},
 {'id': '7cGsqMdKLNV-eXItawA6mA',
  'alias': 'reren-lamen-n-bar-washington',
  'name': 'Reren Lamen n Bar',
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/PG686Kb10oKrZKdUhcKLPg/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/reren-lamen-n-bar-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 45,
  'categories': [{'alias': 'ramen', 'title': 'Ramen'},
   {'alias': 'bars', 'title': 'Bars'},
   {'alias': 'noodles', 'title': 'Noodles'}],
  'rating': 3.8,
  'coordinates': {'latitude': 38.90468, 'longitude': -77.06241},
  'transactions': [],
  'location': {'address1': '1073 Wisconsin Ave NW',
   'address2': '',
   'address3': None,
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['1073 Wisconsin Ave NW', 'Washington, DC 20007']},
  'phone': '+12024506654',
  'display_phone': '(202) 450-6654',
  'distance': 1240.359423212593},
 {'id': 'dIMNqrq_vOHpLcOdcPCVpA',
  'alias': 'saigon-noodles-and-grill-arlington-2',
  'name': 'Saigon Noodles & Grill',
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/OiJgHDhsBdSxEr6HyLXiYA/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/saigon-noodles-and-grill-arlington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 193,
  'categories': [{'alias': 'vietnamese', 'title': 'Vietnamese'}],
  'rating': 3.8,
  'coordinates': {'latitude': 38.89396413507029,
   'longitude': -77.07882922883567},
  'transactions': ['delivery', 'pickup'],
  'price': '$$',
  'location': {'address1': '1800 Wilson Blvd',
   'address2': '',
   'address3': None,
   'city': 'Arlington',
   'zip_code': '22201',
   'country': 'US',
   'state': 'VA',
   'display_address': ['1800 Wilson Blvd', 'Arlington, VA 22201']},
  'phone': '+17035665940',
  'display_phone': '(703) 566-5940',
  'distance': 1724.9207281452616},
 {'id': 'D14Ucgt6SytND1-YwoQVZg',
  'alias': 'seoulspice-arlington',
  'name': 'SeoulSpice',
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/0R0uMJAZfoNMUKpY2vm4-w/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/seoulspice-arlington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 64,
  'categories': [{'alias': 'korean', 'title': 'Korean'},
   {'alias': 'gluten_free', 'title': 'Gluten-Free'},
   {'alias': 'comfortfood', 'title': 'Comfort Food'}],
  'rating': 4.4,
  'coordinates': {'latitude': 38.89580354043643,
   'longitude': -77.07056576100732},
  'transactions': ['delivery', 'pickup'],
  'location': {'address1': '1735 N Lynn St',
   'address2': 'Ste 106',
   'address3': '',
   'city': 'Arlington',
   'zip_code': '22209',
   'country': 'US',
   'state': 'VA',
   'display_address': ['1735 N Lynn St', 'Ste 106', 'Arlington, VA 22209']},
  'phone': '+17034195868',
  'display_phone': '(703) 419-5868',
  'distance': 1558.6175101395754},
 {'id': 'gwq-QIb-gxNRVAVRuRhLAQ',
  'alias': 'billy-hicks-washington',
  'name': 'Billy Hicks',
  'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/wcoLkGq_JT-90uRTwmTN6A/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/billy-hicks-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 4,
  'categories': [{'alias': 'tradamerican', 'title': 'American'},
   {'alias': 'coffee', 'title': 'Coffee & Tea'},
   {'alias': 'beverage_stores', 'title': 'Beverage Store'}],
  'rating': 4.5,
  'coordinates': {'latitude': 38.9053016170736, 'longitude': -77.0653851},
  'transactions': ['delivery', 'pickup'],
  'location': {'address1': '3277 M St NW',
   'address2': '',
   'address3': None,
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['3277 M St NW', 'Washington, DC 20007']},
  'phone': '+12027925757',
  'display_phone': '(202) 792-5757',
  'distance': 985.6443541150337},
 {'id': 'xmVrPFMaJ5ko3o1wxjnIZg',
  'alias': 'han-palace-washington-2',
  'name': 'Han Palace',
  'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/HCj0TSI6avxGk7wVfnp26A/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/han-palace-washington-2?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 295,
  'categories': [{'alias': 'dimsum', 'title': 'Dim Sum'},
   {'alias': 'cantonese', 'title': 'Cantonese'}],
  'rating': 3.9,
  'coordinates': {'latitude': 38.92497, 'longitude': -77.05199},
  'transactions': ['pickup'],
  'price': '$$',
  'location': {'address1': '2649 Connecticut Ave NW',
   'address2': '',
   'address3': None,
   'city': 'Washington, DC',
   'zip_code': '20008',
   'country': 'US',
   'state': 'DC',
   'display_address': ['2649 Connecticut Ave NW', 'Washington, DC 20008']},
  'phone': '+12029690018',
  'display_phone': '(202) 969-0018',
  'distance': 2687.3966671698704},
 {'id': 'Ek_-kvajIvVJbi3ll4pMww',
  'alias': 'pho-75-arlington',
  'name': 'Pho 75',
  'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/g-xbssrgCv1z3zCDRTkKPQ/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/pho-75-arlington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 2098,
  'categories': [{'alias': 'vietnamese', 'title': 'Vietnamese'},
   {'alias': 'noodles', 'title': 'Noodles'}],
  'rating': 4.1,
  'coordinates': {'latitude': 38.8941969403826,
   'longitude': -77.0788539337479},
  'transactions': ['delivery'],
  'price': '$$',
  'location': {'address1': '1721 Wilson Blvd',
   'address2': None,
   'address3': '',
   'city': 'Arlington',
   'zip_code': '22209',
   'country': 'US',
   'state': 'VA',
   'display_address': ['1721 Wilson Blvd', 'Arlington, VA 22209']},
  'phone': '+17035257355',
  'display_phone': '(703) 525-7355',
  'distance': 1699.7430086974475},
 {'id': 'nIgDD3Gtq8Ya1ddJ1w1IMQ',
  'alias': 'tnr-cafe-arlington-6',
  'name': 'TNR Cafe',
  'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/NWv_PVBFU8vCQZvCqmBwyw/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/tnr-cafe-arlington-6?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 838,
  'categories': [{'alias': 'chinese', 'title': 'Chinese'},
   {'alias': 'coffee', 'title': 'Coffee & Tea'}],
  'rating': 3.8,
  'coordinates': {'latitude': 38.89186, 'longitude': -77.08403},
  'transactions': ['delivery', 'pickup'],
  'price': '$$',
  'location': {'address1': '2049 Wilson Blvd',
   'address2': None,
   'address3': '',
   'city': 'Arlington',
   'zip_code': '22201',
   'country': 'US',
   'state': 'VA',
   'display_address': ['2049 Wilson Blvd', 'Arlington, VA 22201']},
  'phone': '+17038750428',
  'display_phone': '(703) 875-0428',
  'distance': 2060.3458473057403},
 {'id': 'vzvuB2t9-t3kqtINjjhPig',
  'alias': 'buuz-thai-eatery-arlington',
  'name': 'Buuz Thai Eatery',
  'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/PoCArXAH2F-55o2pq5iNzA/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/buuz-thai-eatery-arlington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 294,
  'categories': [{'alias': 'thai', 'title': 'Thai'},
   {'alias': 'mongolian', 'title': 'Mongolian'}],
  'rating': 3.8,
  'coordinates': {'latitude': 38.8926182923642,
   'longitude': -77.0813833379928},
  'transactions': ['delivery', 'pickup'],
  'price': '$$',
  'location': {'address1': '1926 Wilson Blvd',
   'address2': None,
   'address3': '',
   'city': 'Arlington',
   'zip_code': '22201',
   'country': 'US',
   'state': 'VA',
   'display_address': ['1926 Wilson Blvd', 'Arlington, VA 22201']},
  'phone': '+17038753999',
  'display_phone': '(703) 875-3999',
  'distance': 1918.4698575998193},
 {'id': 'qGO0rs-uNANLexpMZldTtA',
  'alias': 'momos-cafe-washington',
  'name': "Momo's Cafe",
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/RWajPcG_NmcJ_PbQiCSCzQ/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/momos-cafe-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 201,
  'categories': [{'alias': 'asianfusion', 'title': 'Asian Fusion'},
   {'alias': 'coffee', 'title': 'Coffee & Tea'},
   {'alias': 'taiwanese', 'title': 'Taiwanese'}],
  'rating': 4.1,
  'coordinates': {'latitude': 38.91684, 'longitude': -77.0965},
  'transactions': ['delivery', 'pickup'],
  'price': '$$',
  'location': {'address1': '4828 MacArthur Blvd NW',
   'address2': None,
   'address3': '',
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['4828 MacArthur Blvd NW', 'Washington, DC 20007']},
  'phone': '+12023333675',
  'display_phone': '(202) 333-3675',
  'distance': 1996.4432142570759},
 {'id': '6MOlZLcCA22fhbGecugyhA',
  'alias': 'kappo-dc-washington',
  'name': 'Kappo DC',
  'image_url': 'https://s3-media4.fl.yelpcdn.com/bphoto/JzUKbeameHC4ryn8qaW0tw/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/kappo-dc-washington?adjust_creative=GJK5eaHUqVE8eGMl0w0Pfg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=GJK5eaHUqVE8eGMl0w0Pfg',
  'review_count': 88,
  'categories': [{'alias': 'japanese', 'title': 'Japanese'}],
  'rating': 4.5,
  'coordinates': {'latitude': 38.91674042257224, 'longitude': -77.0963584},
  'transactions': [],
  'price': '$$$$',
  'location': {'address1': '4822 MacArthur Blvd NW',
   'address2': '',
   'address3': None,
   'city': 'Washington, DC',
   'zip_code': '20007',
   'country': 'US',
   'state': 'DC',
   'display_address': ['4822 MacArthur Blvd NW', 'Washington, DC 20007']},
  'phone': '+12028859086',
  'display_phone': '(202) 885-9086',
  'distance': 1981.1053054242918}]

It returns a long dictionary with the key "businesses" and a list with multiple sub-entries.

How to deal with this data?

Approach 1: Convert all to dataframe and clean it later¶

# convert to pd
df_yelp = pd.DataFrame(yelp_json["businesses"])

# see
print(df_yelp)

# not looking realy bad.

                        id                                 alias  \
0   XKOVFGUCK1e0vZBML4ddxw     donsak-thai-restaurant-washington   
1   SPWt2Gqb2-alIq78YINs-w   toryumon-japanese-house-arlington-2   
2   eV_87BqGbpvTqUwjOgQO5g                    reren-washington-3   
3   DB9hhm2cB9Iu88RQw6aqCQ      thai-and-time-again-washington-2   
4   RiwIUBITUfhn6etk6qVbnQ   eerkins-uyghur-cuisine-washington-2   
5   QanUICteMAzlK7jVADa1JA   oki-bowl-at-georgetown-washington-2   
6   M9MhHHorL39Gf-Mz7N9oEA            the-happy-eatery-arlington   
7   iHhrBAMa833_hkYfZ5fDoQ           simply-banh-mi-washington-6   
8   MmgIn8Ufynn8v6lzgLrX-A               han-palace-washington-3   
9   ws2OEuBG41rB5LOq_OE_qg         kusshi-glover-park-washington   
10  7cGsqMdKLNV-eXItawA6mA          reren-lamen-n-bar-washington   
11  dIMNqrq_vOHpLcOdcPCVpA  saigon-noodles-and-grill-arlington-2   
12  D14Ucgt6SytND1-YwoQVZg                  seoulspice-arlington   
13  gwq-QIb-gxNRVAVRuRhLAQ                billy-hicks-washington   
14  xmVrPFMaJ5ko3o1wxjnIZg               han-palace-washington-2   
15  Ek_-kvajIvVJbi3ll4pMww                      pho-75-arlington   
16  nIgDD3Gtq8Ya1ddJ1w1IMQ                  tnr-cafe-arlington-6   
17  vzvuB2t9-t3kqtINjjhPig            buuz-thai-eatery-arlington   
18  qGO0rs-uNANLexpMZldTtA                 momos-cafe-washington   
19  6MOlZLcCA22fhbGecugyhA                   kappo-dc-washington   

                       name  \
0   Donsak Thai Restaurant    
1   Toryumon Japanese House   
2                     Reren   
3       Thai And Time Again   
4   Eerkin's Uyghur Cuisine   
5    OKI bowl at Georgetown   
6          The Happy Eatery   
7            Simply Banh Mi   
8                Han Palace   
9      Kusshi - Glover Park   
10        Reren Lamen n Bar   
11   Saigon Noodles & Grill   
12               SeoulSpice   
13              Billy Hicks   
14               Han Palace   
15                   Pho 75   
16                 TNR Cafe   
17         Buuz Thai Eatery   
18              Momo's Cafe   
19                 Kappo DC   

                                            image_url  is_closed  \
0   https://s3-media2.fl.yelpcdn.com/bphoto/btbTD5...      False   
1   https://s3-media2.fl.yelpcdn.com/bphoto/XAoGmt...      False   
2   https://s3-media3.fl.yelpcdn.com/bphoto/PxQkgC...      False   
3   https://s3-media1.fl.yelpcdn.com/bphoto/ESUUM3...      False   
4   https://s3-media2.fl.yelpcdn.com/bphoto/GbQu1A...      False   
5   https://s3-media1.fl.yelpcdn.com/bphoto/2AaW1G...      False   
6   https://s3-media1.fl.yelpcdn.com/bphoto/G3Q8DW...      False   
7   https://s3-media3.fl.yelpcdn.com/bphoto/YPC8j-...      False   
8   https://s3-media2.fl.yelpcdn.com/bphoto/Cn_fDw...      False   
9   https://s3-media4.fl.yelpcdn.com/bphoto/LVjCRH...      False   
10  https://s3-media2.fl.yelpcdn.com/bphoto/PG686K...      False   
11  https://s3-media2.fl.yelpcdn.com/bphoto/OiJgHD...      False   
12  https://s3-media2.fl.yelpcdn.com/bphoto/0R0uMJ...      False   
13  https://s3-media4.fl.yelpcdn.com/bphoto/wcoLkG...      False   
14  https://s3-media1.fl.yelpcdn.com/bphoto/HCj0TS...      False   
15  https://s3-media4.fl.yelpcdn.com/bphoto/g-xbss...      False   
16  https://s3-media4.fl.yelpcdn.com/bphoto/NWv_PV...      False   
17  https://s3-media4.fl.yelpcdn.com/bphoto/PoCArX...      False   
18  https://s3-media2.fl.yelpcdn.com/bphoto/RWajPc...      False   
19  https://s3-media4.fl.yelpcdn.com/bphoto/JzUKbe...      False   

                                                  url  review_count  \
0   https://www.yelp.com/biz/donsak-thai-restauran...            97   
1   https://www.yelp.com/biz/toryumon-japanese-hou...           234   
2   https://www.yelp.com/biz/reren-washington-3?ad...           132   
3   https://www.yelp.com/biz/thai-and-time-again-w...            50   
4   https://www.yelp.com/biz/eerkins-uyghur-cuisin...           197   
5   https://www.yelp.com/biz/oki-bowl-at-georgetow...           290   
6   https://www.yelp.com/biz/the-happy-eatery-arli...           250   
7   https://www.yelp.com/biz/simply-banh-mi-washin...           657   
8   https://www.yelp.com/biz/han-palace-washington...            28   
9   https://www.yelp.com/biz/kusshi-glover-park-wa...            13   
10  https://www.yelp.com/biz/reren-lamen-n-bar-was...            45   
11  https://www.yelp.com/biz/saigon-noodles-and-gr...           193   
12  https://www.yelp.com/biz/seoulspice-arlington?...            64   
13  https://www.yelp.com/biz/billy-hicks-washingto...             4   
14  https://www.yelp.com/biz/han-palace-washington...           295   
15  https://www.yelp.com/biz/pho-75-arlington?adju...          2098   
16  https://www.yelp.com/biz/tnr-cafe-arlington-6?...           838   
17  https://www.yelp.com/biz/buuz-thai-eatery-arli...           294   
18  https://www.yelp.com/biz/momos-cafe-washington...           201   
19  https://www.yelp.com/biz/kappo-dc-washington?a...            88   

                                           categories  rating  \
0                [{'alias': 'thai', 'title': 'Thai'}]     4.4   
1   [{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...     4.3   
2   [{'alias': 'asianfusion', 'title': 'Asian Fusi...     3.9   
3   [{'alias': 'thai', 'title': 'Thai'}, {'alias':...     4.6   
4   [{'alias': 'noodles', 'title': 'Noodles'}, {'a...     4.3   
5              [{'alias': 'ramen', 'title': 'Ramen'}]     3.9   
6   [{'alias': 'vietnamese', 'title': 'Vietnamese'...     4.1   
7   [{'alias': 'vietnamese', 'title': 'Vietnamese'...     4.5   
8   [{'alias': 'dimsum', 'title': 'Dim Sum'}, {'al...     3.6   
9   [{'alias': 'sushi', 'title': 'Sushi Bars'}, {'...     4.5   
10  [{'alias': 'ramen', 'title': 'Ramen'}, {'alias...     3.8   
11   [{'alias': 'vietnamese', 'title': 'Vietnamese'}]     3.8   
12  [{'alias': 'korean', 'title': 'Korean'}, {'ali...     4.4   
13  [{'alias': 'tradamerican', 'title': 'American'...     4.5   
14  [{'alias': 'dimsum', 'title': 'Dim Sum'}, {'al...     3.9   
15  [{'alias': 'vietnamese', 'title': 'Vietnamese'...     4.1   
16  [{'alias': 'chinese', 'title': 'Chinese'}, {'a...     3.8   
17  [{'alias': 'thai', 'title': 'Thai'}, {'alias':...     3.8   
18  [{'alias': 'asianfusion', 'title': 'Asian Fusi...     4.1   
19       [{'alias': 'japanese', 'title': 'Japanese'}]     4.5   

                                          coordinates  \
0      {'latitude': 38.92398, 'longitude': -77.05212}   
1   {'latitude': 38.89405977663849, 'longitude': -...   
2      {'latitude': 38.90468, 'longitude': -77.06262}   
3   {'latitude': 38.92376109199186, 'longitude': -...   
4    {'latitude': 38.921397, 'longitude': -77.072513}   
5      {'latitude': 38.91107, 'longitude': -77.06552}   
6    {'latitude': 38.896769, 'longitude': -77.071233}   
7   {'latitude': 38.9114651, 'longitude': -77.0656...   
8   {'latitude': 38.914425669693635, 'longitude': ...   
9    {'latitude': 38.920617, 'longitude': -77.071394}   
10     {'latitude': 38.90468, 'longitude': -77.06241}   
11  {'latitude': 38.89396413507029, 'longitude': -...   
12  {'latitude': 38.89580354043643, 'longitude': -...   
13  {'latitude': 38.9053016170736, 'longitude': -7...   
14     {'latitude': 38.92497, 'longitude': -77.05199}   
15  {'latitude': 38.8941969403826, 'longitude': -7...   
16     {'latitude': 38.89186, 'longitude': -77.08403}   
17  {'latitude': 38.8926182923642, 'longitude': -7...   
18      {'latitude': 38.91684, 'longitude': -77.0965}   
19  {'latitude': 38.91674042257224, 'longitude': -...   

                                  transactions  \
0                           [delivery, pickup]   
1   [restaurant_reservation, delivery, pickup]   
2                           [delivery, pickup]   
3                           [delivery, pickup]   
4                           [delivery, pickup]   
5                           [delivery, pickup]   
6                                   [delivery]   
7                           [delivery, pickup]   
8                           [delivery, pickup]   
9                           [delivery, pickup]   
10                                          []   
11                          [delivery, pickup]   
12                          [delivery, pickup]   
13                          [delivery, pickup]   
14                                    [pickup]   
15                                  [delivery]   
16                          [delivery, pickup]   
17                          [delivery, pickup]   
18                          [delivery, pickup]   
19                                          []   

                                             location         phone  \
0   {'address1': '2608 Connecticut Ave NW', 'addre...  +12025078207   
1   {'address1': '1650 Wilson Blvd', 'address2': '...  +15713571537   
2   {'address1': '1073 Wisconsin Ave NW', 'address...  +12028044962   
3   {'address1': '2311 Calvert St NW', 'address2':...  +12025061076   
4   {'address1': '2412 Wisconsin Ave NW', 'address...  +12023333600   
5   {'address1': '1608 Wisconsin Ave NW', 'address...  +12029448660   
6   {'address1': '1800 N Lynn St', 'address2': Non...  +15718001881   
7   {'address1': '1624 Wisconsin Ave NW', 'address...  +12023335726   
8   {'address1': '1728 Wisconsin Ave NW', 'address...  +12023556725   
9   {'address1': '2309 Wisconsin Ave', 'address2':...  +12023333986   
10  {'address1': '1073 Wisconsin Ave NW', 'address...  +12024506654   
11  {'address1': '1800 Wilson Blvd', 'address2': '...  +17035665940   
12  {'address1': '1735 N Lynn St', 'address2': 'St...  +17034195868   
13  {'address1': '3277 M St NW', 'address2': '', '...  +12027925757   
14  {'address1': '2649 Connecticut Ave NW', 'addre...  +12029690018   
15  {'address1': '1721 Wilson Blvd', 'address2': N...  +17035257355   
16  {'address1': '2049 Wilson Blvd', 'address2': N...  +17038750428   
17  {'address1': '1926 Wilson Blvd', 'address2': N...  +17038753999   
18  {'address1': '4828 MacArthur Blvd NW', 'addres...  +12023333675   
19  {'address1': '4822 MacArthur Blvd NW', 'addres...  +12028859086   

     display_phone     distance price  
0   (202) 507-8207  2610.575513   NaN  
1   (571) 357-1537  1700.481085    $$  
2   (202) 804-4962  1231.376901    $$  
3   (202) 506-1076  2668.901781   NaN  
4   (202) 333-3600  1372.544960    $$  
5   (202) 944-8660   893.707029    $$  
6   (571) 800-1881  1454.673415    $$  
7   (202) 333-5726   892.543637     $  
8   (202) 355-6725   916.875174   NaN  
9   (202) 333-3986  1312.481131   NaN  
10  (202) 450-6654  1240.359423   NaN  
11  (703) 566-5940  1724.920728    $$  
12  (703) 419-5868  1558.617510   NaN  
13  (202) 792-5757   985.644354   NaN  
14  (202) 969-0018  2687.396667    $$  
15  (703) 525-7355  1699.743009    $$  
16  (703) 875-0428  2060.345847    $$  
17  (703) 875-3999  1918.469858    $$  
18  (202) 333-3675  1996.443214    $$  
19  (202) 885-9086  1981.105305  $$$$

Approach 2: write a function to collect the information you need¶

Assume you are interested in the id, name, url, lat and long, and rating

# function to clean and extract information from yelp
def clean_yelp(yelp_json):
    '''
    function to extract columns of interest from yelp json
    '''
    # create a temporary dictionary to store the information
    temp_yelp = {}
    
    # collect information
    temp_yelp["id"]= yelp_json["id"]
    temp_yelp["name"]= yelp_json["name"]
    temp_yelp["url"]= yelp_json["url"]
    temp_yelp["latitude"] = yelp_json["coordinates"]["latitude"]
    temp_yelp["longitude"] = yelp_json["coordinates"]["longitude"]
    temp_yelp["rating"]= yelp_json["rating"]
    
    # return
    
    return(temp_yelp)

# apply to the dictionary
results_yelp = [clean_yelp(entry) for entry in yelp_json["businesses"]]

# Convert results to dataframe
yelp_df = pd.DataFrame(results_yelp)   
print(yelp_df)

                        id                     name  \
0   XKOVFGUCK1e0vZBML4ddxw  Donsak Thai Restaurant    
1   SPWt2Gqb2-alIq78YINs-w  Toryumon Japanese House   
2   eV_87BqGbpvTqUwjOgQO5g                    Reren   
3   DB9hhm2cB9Iu88RQw6aqCQ      Thai And Time Again   
4   RiwIUBITUfhn6etk6qVbnQ  Eerkin's Uyghur Cuisine   
5   QanUICteMAzlK7jVADa1JA   OKI bowl at Georgetown   
6   M9MhHHorL39Gf-Mz7N9oEA         The Happy Eatery   
7   iHhrBAMa833_hkYfZ5fDoQ           Simply Banh Mi   
8   MmgIn8Ufynn8v6lzgLrX-A               Han Palace   
9   ws2OEuBG41rB5LOq_OE_qg     Kusshi - Glover Park   
10  7cGsqMdKLNV-eXItawA6mA        Reren Lamen n Bar   
11  dIMNqrq_vOHpLcOdcPCVpA   Saigon Noodles & Grill   
12  D14Ucgt6SytND1-YwoQVZg               SeoulSpice   
13  gwq-QIb-gxNRVAVRuRhLAQ              Billy Hicks   
14  xmVrPFMaJ5ko3o1wxjnIZg               Han Palace   
15  Ek_-kvajIvVJbi3ll4pMww                   Pho 75   
16  nIgDD3Gtq8Ya1ddJ1w1IMQ                 TNR Cafe   
17  vzvuB2t9-t3kqtINjjhPig         Buuz Thai Eatery   
18  qGO0rs-uNANLexpMZldTtA              Momo's Cafe   
19  6MOlZLcCA22fhbGecugyhA                 Kappo DC   

                                                  url   latitude  longitude  \
0   https://www.yelp.com/biz/donsak-thai-restauran...  38.923980 -77.052120   
1   https://www.yelp.com/biz/toryumon-japanese-hou...  38.894060 -77.077633   
2   https://www.yelp.com/biz/reren-washington-3?ad...  38.904680 -77.062620   
3   https://www.yelp.com/biz/thai-and-time-again-w...  38.923761 -77.050982   
4   https://www.yelp.com/biz/eerkins-uyghur-cuisin...  38.921397 -77.072513   
5   https://www.yelp.com/biz/oki-bowl-at-georgetow...  38.911070 -77.065520   
6   https://www.yelp.com/biz/the-happy-eatery-arli...  38.896769 -77.071233   
7   https://www.yelp.com/biz/simply-banh-mi-washin...  38.911465 -77.065650   
8   https://www.yelp.com/biz/han-palace-washington...  38.914426 -77.067304   
9   https://www.yelp.com/biz/kusshi-glover-park-wa...  38.920617 -77.071394   
10  https://www.yelp.com/biz/reren-lamen-n-bar-was...  38.904680 -77.062410   
11  https://www.yelp.com/biz/saigon-noodles-and-gr...  38.893964 -77.078829   
12  https://www.yelp.com/biz/seoulspice-arlington?...  38.895804 -77.070566   
13  https://www.yelp.com/biz/billy-hicks-washingto...  38.905302 -77.065385   
14  https://www.yelp.com/biz/han-palace-washington...  38.924970 -77.051990   
15  https://www.yelp.com/biz/pho-75-arlington?adju...  38.894197 -77.078854   
16  https://www.yelp.com/biz/tnr-cafe-arlington-6?...  38.891860 -77.084030   
17  https://www.yelp.com/biz/buuz-thai-eatery-arli...  38.892618 -77.081383   
18  https://www.yelp.com/biz/momos-cafe-washington...  38.916840 -77.096500   
19  https://www.yelp.com/biz/kappo-dc-washington?a...  38.916740 -77.096358   

    rating  
0      4.4  
1      4.3  
2      3.9  
3      4.6  
4      4.3  
5      3.9  
6      4.1  
7      4.5  
8      3.6  
9      4.5  
10     3.8  
11     3.8  
12     4.4  
13     4.5  
14     3.9  
15     4.1  
16     3.8  
17     3.8  
18     4.1  
19     4.5

Save the json¶

Remember to always save your response from the API call. You don't want be querying the API all the time to grab the same data.

import json

with open("yelp_results.json", 'w') as f:
    # write the dictionary to a string
    json.dump(response.json(), f, indent=4)

Practice¶

Make a successful query using your favorite type of food to the Yelp API. Pretty much I only want you to repeat what we did before, but changing the search term a bit

# code here

Example 3 : YouTube API¶

Now let's move to our last example.

We will be working with the YouTube API. This is a complex API, but lucky for us some other programmers already created a Python wrapper to access the API. We will use the youtube-data-api library which contains a set of functions to facilitate the access to the API.

What kind of data can you get from the Youtube API?¶

Youtube has a very extensive api. There are a lot of data you can get access to. See a compreensive list here

What is included in the package:

video metadata
channel metadata
playlist metadata
subscription metadata
featured channel metadata
comment metadata
search results

How to Install¶

The software is on PyPI, so you can download it via pip

#!pip install youtube-data-api

How to get an API key¶

A quick guide: https://developers.google.com/youtube/v3/getting-started ¶

You need a Google Account to access the Google API Console, request an API key, and register your application. You can use your GMail account for this if you have one.
Create a project in the Google Developers Console and obtain authorization credentials so your application can submit API requests.
After creating your project, make sure the YouTube Data API is one of the services that your application is registered to use.

a. Go to the API Console and select the project that you just registered.

b. Visit the Enabled APIs page. In the list of APIs, make sure the status is ON for the YouTube Data API v3. You do not need to enable OAuth 2.0 since there are no methods in the package that require it.

# call some libraries
import os
import datetime
import pandas as pd

#Import YouTubeDataAPI
from youtube_api import YouTubeDataAPI
from youtube_api.youtube_api_utils import *
from dotenv import load_dotenv

# load keys from  environmental var
load_dotenv() # .env file in cwd
api_key = os.environ.get("YT_KEY")

# create a client 
# this is what we call: instantiate the class
yt = YouTubeDataAPI(api_key)
print(yt)

<youtube_api.youtube_api.YouTubeDataAPI object at 0x284754350>

Starting with a channel name and getting some basic metadata¶

Let's start with the LastWeekTonight channel

https://www.youtube.com/user/LastWeekTonight

First we need to get the channel id

channel_id = yt.get_channel_id_from_user('LastWeekTonight')
print(channel_id)

UC3XTzVzaHQEd30rQbuvCtTQ

Channel metadata¶

# collect metadata
yt.get_channel_metadata(channel_id)

{'channel_id': 'UC3XTzVzaHQEd30rQbuvCtTQ',
 'title': 'LastWeekTonight',
 'account_creation_date': 1395178899.0,
 'keywords': None,
 'description': 'Breaking news on a weekly basis. Sundays at 11PM - only on HBO.\nSubscribe to the Last Week Tonight channel for the latest videos from John Oliver and the LWT team.',
 'view_count': '4027985075',
 'video_count': '672',
 'subscription_count': '9620000',
 'playlist_id_likes': '',
 'playlist_id_uploads': 'UU3XTzVzaHQEd30rQbuvCtTQ',
 'topic_ids': 'https://en.wikipedia.org/wiki/Politics|https://en.wikipedia.org/wiki/Entertainment|https://en.wikipedia.org/wiki/Society|https://en.wikipedia.org/wiki/Television_program',
 'country': None,
 'collection_date': datetime.datetime(2024, 11, 4, 20, 14, 55, 356998)}

Subscriptions of the channel.¶

pd.DataFrame(yt.get_subscriptions(channel_id))

List of videos of the channel¶

You first need to convert the channel_id into a playlist id to get all the videos ever posted by a channel using a function from the youtube_api_utils in the package. Then you can get the video ids, and collect metadata, comments, among many others.

from youtube_api.youtube_api_utils import *
playlist_id = get_upload_playlist_id(channel_id)
print(playlist_id)

UU3XTzVzaHQEd30rQbuvCtTQ

## Get video ids
videos = yt.get_videos_from_playlist_id(playlist_id)
df = pd.DataFrame(videos)
print(df)

        video_id                channel_id  publish_date  \
0    tWZAbKU-JzE  UC3XTzVzaHQEd30rQbuvCtTQ  1.730723e+09   
1    t5lVEMpfRDI  UC3XTzVzaHQEd30rQbuvCtTQ  1.730711e+09   
2    P6grAoS-muM  UC3XTzVzaHQEd30rQbuvCtTQ  1.730387e+09   
3    esymd1F1cRY  UC3XTzVzaHQEd30rQbuvCtTQ  1.730099e+09   
4    qXiEGPWVjGU  UC3XTzVzaHQEd30rQbuvCtTQ  1.729494e+09   
..           ...                       ...           ...   
667  Dh9munYYoqQ  UC3XTzVzaHQEd30rQbuvCtTQ  1.398670e+09   
668  k8lJ85pfb_E  UC3XTzVzaHQEd30rQbuvCtTQ  1.398669e+09   
669  WHCQndalv94  UC3XTzVzaHQEd30rQbuvCtTQ  1.398663e+09   
670  8q7esuODnQI  UC3XTzVzaHQEd30rQbuvCtTQ  1.395379e+09   
671  gdQCtWlhx90  UC3XTzVzaHQEd30rQbuvCtTQ  1.395379e+09   

               collection_date  
0   2024-11-04 20:15:09.063314  
1   2024-11-04 20:15:09.063342  
2   2024-11-04 20:15:09.063362  
3   2024-11-04 20:15:09.063381  
4   2024-11-04 20:15:09.063400  
..                         ...  
667 2024-11-04 20:15:10.515060  
668 2024-11-04 20:15:10.515078  
669 2024-11-04 20:15:10.515103  
670 2024-11-04 20:15:10.515179  
671 2024-11-04 20:15:10.515216  

[672 rows x 4 columns]

Collect video metadata¶

# id for videos as a list
df.video_id.tolist()

['tWZAbKU-JzE',
 't5lVEMpfRDI',
 'P6grAoS-muM',
 'esymd1F1cRY',
 'qXiEGPWVjGU',
 'wDSlvMp5knk',
 't-ZXW-KLt7A',
 'yI15h4s6ppM',
 'lDTNFxfOHiM',
 'nD2p1vHuLDw',
 'njk_Po2K4EE',
 'hm42viKN_9Q',
 'bIUXSR0yRmE',
 'VP3gcfls0No',
 'ViaSEoqPR9U',
 'X-b00wd7YIs',
 'TkMopKZgGKw',
 'VSn3c7twkw8',
 'fnOOV6vcAv0',
 'bA0x7w_tbZ4',
 'I5bJX-TFd7k',
 'ObXfs3Unxhg',
 'Ch9--kBhVOs',
 'NsFTuKb3eqQ',
 'BMpkSex3I7E',
 '9PfTIFMVhSQ',
 'Fiw7KGfupyM',
 'HwezhqnDZoo',
 'CFASOVzjfAY',
 '9JwEBiZMWrM',
 '9Mm57Mts9Ao',
 '9shpDiix8b4',
 '2ExOsj2-w10',
 '2s-mc4Ro9Bs',
 '6STdru43q0A',
 'CkK3W0lOKcc',
 'QAbDif3Rq0M',
 'E8ygQ2wEwJw',
 'lG_rsKe9YVc',
 'j3w8-d_fnqE',
 'pJiUPYNfGyI',
 'hq2s7RMRsgs',
 '_hIOdiYYSnc',
 'vlZtWCKLA9Y',
 'pDpjyf-6oI8',
 'q1bb2ZljRtc',
 'yD3M3DYGQIk',
 'rBxhqRqdP-8',
 'iQNr5dnK7Y4',
 'f0Qxe1kBwKY',
 'frhBpkwTU84',
 'gJcybtVuC6k',
 'YuysiivHH0g',
 'd9ueNLAwFU0',
 'VW-KWUrV7yU',
 'Ud5V7MOXS9k',
 'Wn5sK9B5Gwg',
 'RKtjZexepc0',
 'dl8rM5rKv0U',
 'OjYL7oTeCNU',
 'H09I2x4EOYI',
 'N-qsKNRSb4g',
 'BWQOiQ7FQyY',
 'ENuNwywzsco',
 'E6frBeVCMpI',
 'FLkg5nPG5qg',
 'BUfJuc9rxhQ',
 '3ikJEZggcp8',
 '2REhjrGp8FQ',
 '2Y3nNEqVXsM',
 '57bhj1O9s4Y',
 '0aEZwp0QYrA',
 '18IT4V0295E',
 'WLMdPgWVCfs',
 '-YypArYDcjA',
 '3UCqtnr-pF8',
 'wKobMz1bbo8',
 'sGChECfK71g',
 'rwF6FPgJTyk',
 'sTA-T1u7zAg',
 'tcMvRC9mczc',
 'qHKu1zMriyc',
 'sngVjVPleTk',
 'nv3UExlRO8k',
 'r7-AbyFrbiw',
 'ihjo83ny6fA',
 'mGrytEhEWGE',
 'PN8QufaS0Ug',
 'j_krJ0K83BM',
 'NpWH7EB9bg0',
 'ZYjJ2-UsbSQ',
 'XeZVGB6uOL4',
 'DWyuSC21LTM',
 'GSK-sCDBdK0',
 'Lkz3s6QVKPQ',
 'JfvWnFbClp0',
 'U34RWZ4IjvE',
 'DBT4Naaqo1s',
 'MESS6iefA1E',
 '7iivkoW98Qs',
 'AFdwGYHqN04',
 '5vdVQnvYjjY',
 '5XtkBngQHn4',
 '4wvnSStEK9M',
 '5DgEVxqtJOI',
 '-98oRTd6MsU',
 'j41q1GutH4M',
 'u2ii0DCREzA',
 'vNKsfXfyRrk',
 'j8DxdibHibU',
 'nhXAmbgTSyI',
 '1gUP_43J7wY',
 '7TNAyvx-ymI',
 'NqK3_n6pdDY',
 'LCXdbtlY9WM',
 'axsgzg3RyF0',
 '5juExgl1OLY',
 'qcIHauGxOTU',
 'vZcQPZq8hkQ',
 'z3-Jc2fPjS8',
 'xyhyOgDKiFs',
 's698Ee33J7c',
 'i0QeW9iTHPA',
 'ufwK63Ovzjs',
 'kkfX1mpsMKk',
 'T9DJj6tKdM4',
 'jS9zYYM7x40',
 'n7ldLYSEiwk',
 'SoKWfZF9tik',
 'Y3dGKVyCqxs',
 'b8oKcBu_OYs',
 'NnoGHlSEiJM',
 'Ti_Gwa_TFhQ',
 'UoGvlEtNQmI',
 'SmbQ-g492wc',
 'ULghbKUykQ0',
 'ULz9AcPnIQI',
 'T3c9m7DYruI',
 'MuoDWUDMgtI',
 'HDByNV39GrA',
 'AEt1C8rev3I',
 'CzHd9FdUHjg',
 '9tex6qPDNeQ',
 'CMq8aMZJmTA',
 '6am_LubDVCQ',
 '2d9uagP9oeo',
 '2lU8l5oxJ8w',
 '63dp_pacFZ0',
 'tkAqwHiAR-g',
 'NP8IWBOmc_g',
 'gYwqpx6lp_s',
 'QCRySbsLKiA',
 'qW7CGTK-1vA',
 'pkVQzk9qGHM',
 'JTwaXIUepdk',
 'oDJWD7021HI',
 'x1yD7FzXtec',
 'xVDfkeiQx3M',
 'z3YtQleycpg',
 'tzBcGN-cpHk',
 'x71eHoVzV7g',
 'uRKIxgsi0pU',
 'iHHkJKtYYnc',
 'nD2oRKAnKB8',
 'r5U_74z_Nxs',
 'rKVqK6sKuSs',
 'iOzLGthFmHE',
 'jgY60jynMO4',
 'n5C139Y3hTo',
 'SgTQDp1jwBw',
 'b1_CNmOy914',
 'eElaGUsVIHs',
 'X372Jg1d2G0',
 'F9-p2zVI8yw',
 'QgND0iPRrlo',
 'JiZp1EyoEYM',
 'OSwXWxMCMPY',
 'C6lbW6JUYM8',
 'InSiKiGsoeQ',
 'LdA46CTeaYo',
 'Hqz1LFs2JA4',
 'GA9ZidJHiWI',
 'EEwVKqHcUfk',
 '4Vn4yksR4LI',
 '8FqEJf6l5IQ',
 '0GkhtkoBhf4',
 '1psG_N9Wzfk',
 '3EsaIt9TAEQ',
 '-FGcescG_yQ',
 '-_HnshzZfFM',
 '0Jjl8P8AXdw',
 'yb9fMmDod40',
 'MI78WOW_u-Q',
 '93F616Cmzr8',
 'Io0yuH1CiA0',
 'KiOc61C4518',
 '42xZB80sZaI',
 '9Eo7ioe5Xfo',
 'vfpWf5pzLdM',
 'tLrn3A7HxIs',
 'wdlT2d79N6s',
 'hmt816o8R7c',
 'RaMRa8E_Le0',
 'SeyOlaHz0aU',
 'h6Em04qAR6M',
 'l9GtiuB4D2o',
 'iGpyTMdAfWE',
 '_Z_CNYuNxz0',
 'PCe2c4rHtmg',
 'MJ8cH0i9OGI',
 'FuWMVAMLNnw',
 'KLhdWhGLrd0',
 'FVPWL6A6ZM4',
 'HJAcmSbtURo',
 '6fChicI6n9g',
 'CMMalyy0Qlw',
 'GEQXe4YWFxw',
 'ARslZU4_uME',
 '36A8v5hsvok',
 '3UL9AaxA9pY',
 '6ihHEQNLkeQ',
 'aPtCYRuJMKk',
 'i1KnSyXjE5Q',
 'zRdhoYqCAQg',
 'PHKICYdzRW0',
 'bVIsnOfNfCo',
 'lvXq2cq7yGQ',
 'SOn3wba8c-Y',
 'WKaLePvHsco',
 'aFsfJYWpqII',
 'VOhHKYRXXhs',
 'zN2_0WC7UfU',
 'DNVwnkgTzbY',
 'jVIYbgVks7E',
 'f4RlTpgGOps',
 'Q8oCilY4szc',
 'G-dJMqpGvjM',
 'pLPpl2ISKTg',
 '7CkZTHQJ0RI',
 'PbzW9qcVBJ0',
 '_m4JMTixTTM',
 'GE-VJrdHMug',
 'Vjc782yvwAk',
 '3v6y2pY1pZ0',
 'Eo3zORUGCbM',
 'AJ2keSJzYyY',
 'Tn7egDQ9lPg',
 'p4QGOHahiVM',
 'pJ9PKQbkJv8',
 '6eH2BItdo0M',
 'FwHMDjc7qJ8',
 'AiOUojVd6xQ',
 'Za45bT41sXg',
 'lzsZP9o7SlI',
 '82QYlbiawJI',
 '18PL6enCwh8',
 'sy5VQvDGKd4',
 'o7zazuy_UfI',
 '41vETgarh_8',
 'qrizmAo17Os',
 '_uSZwErdH3I',
 'Bd2bbHoVQSM',
 'wJDk-czsivk',
 'M81-GM0mTc4',
 'Sqa8Zo2XWc4',
 'a546lxxJIhE',
 's3gUpyEI_rQ',
 'HkvQywg_uZA',
 'UMqLDhl8PXw',
 'KWterDbJKjY',
 'Y0LA7Ff2hgs',
 'xQLqIWbc9VM',
 'Ns8NvPPHX5Y',
 'kCOnGjvYKI0',
 'eJPLiT1kCSM',
 'uySgklnlX3Y',
 'DNy6F7ZwX8I',
 '3YNku5FKWjw',
 '6p8zAbFKpW0',
 'x2hw_ghPcQs',
 'pQcFCFZIuZI',
 'jtIZZs-GAOA',
 'MBo4GViDxzc',
 '6RxqNv6bEug',
 'jtxew5XUVbQ',
 'L4qmDnYli2E',
 'jXf04bhcjbg',
 'KgwqQGvYt0g',
 'AEa3sK1iZxc',
 'jDdYFhzVCDM',
 'C-YRSqaPtMg',
 'FtdVglihDok',
 'MalsOLSFvX0',
 '-v0XiUQlRLw',
 'Hk011WMM7t0',
 'obCNQ0xksZ4',
 'wqn3gR1WTcA',
 'phieTCxQRLA',
 'RMpCGD7b_H4',
 '-_Y7uqqEFnY',
 'kpYYdCzTpps',
 '-gd8yUptg0Q',
 'EICp1vGlh_U',
 'xX5IV9n223M',
 '8Kfx2fANELo',
 'Gk8dUXRpoy8',
 'qBpiXcyB7wU',
 'liptMbjF3EE',
 '9Y18-07g39g',
 '0nqJvjUNlRA',
 'l5jtFqWq5iU',
 '9W74aeuqsiU',
 'bl-ABuxeWrE',
 'EN9OdruH_qM',
 '27FpoRiStgk',
 'NvpKES_kcYg',
 'dykZyuWci3g',
 'WqD-ATqw3js',
 'uaCaIhfETsM',
 'Ezv8sdTLxKo',
 '_-0J49_9lwc',
 'A51mJjFyG_w',
 'aSZ-hogD8mg',
 'oFetFqrVBNc',
 'zv8ZPFOxJEc',
 '6fiRDJLjL94',
 '29lXsOYBaow',
 'abuQk-6M5R4',
 'sIi_QS1tdFM',
 'vTF-Kz_7L0c',
 'Uf1c0tEGfrU',
 'gPHgRp70H8o',
 'Of3fbIgSqeU',
 'GzFG0Cdh8D8',
 '2xlol-SNQRU',
 'yq_E3HquRJY',
 'Fiu9GSOmt8E',
 'XMGxxRRtmHc',
 'jm9YKT0dItk',
 'WYdi1bL6s10',
 'IhO1FcjDMV4',
 '_v-U3K1sw9U',
 'S5_4wPW6jJQ',
 'qIUb3bjh42Y',
 'EzlCOg-37hI',
 'cMz_sTgoydQ',
 'LyC855KdBKo',
 'sE63HmOYGps',
 'IuVo4fnpLC8',
 'xtdU5RPDZqI',
 '7g0Jh4h5E1E',
 'AytDzZ2ecCc',
 'zeBjxv4oqGU',
 'pkpfFuiZkcs',
 'rBu0BRTx2x8',
 '3ZRE6uVMDAo',
 '1f2iawp0y5Y',
 'hsxukOPEdgg',
 '17oCQakzIl8',
 '0b_eHBZLM6U',
 'R652nwUcJRA',
 'MuxnH0VAkAM',
 'jZjmlJPJgug',
 'Wf4cea5oObY',
 'l-nEHkgm_Gk',
 'z4gBMw64aqk',
 'IoL8g0W9gAQ',
 '7rl4c-jr7g0',
 'dRFbwjwQ4VE',
 '6s4Bx7mzNkM',
 'UnSILVWDKL8',
 'ElIf2DBrWzU',
 '_066dEkycr4',
 'v_kak7kAdNw',
 'c09m5f7Gnic',
 'qVIXUhZ2AWs',
 '7Z2XRg3dy9k',
 'kxatzHnl7Q8',
 '_TfCgeYHiBE',
 'xa0oY7LQmtg',
 '1aheRpmurAo',
 'UN8bJb8biZU',
 'svEuG_ekNT0',
 '2u_pZ-SgACk',
 'qMGn9T37eR8',
 '1Nqa4XKkXp4',
 'SE_ccFHjL_w',
 'Nuzi7LlSDVo',
 'tXqnRMU1fTs',
 '3y1QA6OeAcQ',
 'TATSAHJKRd8',
 '-9QYu8LtH2E',
 'AjqaNQ018zU',
 'dXyO_MC9g3k',
 'd9m7d07k22A',
 'Bchx0mS7XOY',
 'zxT8CM8XntA',
 'bCBYJZ6QbUI',
 '-tIdzNlExrw',
 'hnoMsftQPY8',
 'JDcro7dPqpA',
 '0lTczPEG8iI',
 'f4fVdf4pNEc',
 'YMBj_tU7HRU',
 '-qCKR6wy94U',
 'jCC8fPQOaxU',
 'm8UQ4O7UiDs',
 'Yq7Eh6JTKIg',
 'FO0iG_P0P6M',
 '_h1ooyyFkF0',
 'WhMGcp9xIhY',
 'HaBQfSAVt0s',
 'CdDBi0DheMw',
 'MdHmp5EX5bE',
 'TB_wx0dAPU0',
 'ximgPmJ9A5s',
 '5HS2TstPfW4',
 'ygVX1z6tDGI',
 'UpdMYOtAmKY',
 'ViDPIyiszoo',
 'FsZ3p9gOkpY',
 'opi8X9hQ7q8',
 'OjPYmEZxACM',
 'NpPyLcQ2vdI',
 '2nXYbGmF3_Q',
 'etkd57lPfPU',
 'Fmh4RdIwswE',
 'ET_b78GSBUs',
 'dHiAls8loz4',
 'dFnN2toxFaY',
 'AJm8PeWkiEU',
 '8-hahRWhFvg',
 'OubM8bD9kck',
 'mOVPStnVgvU',
 'nG2pEffLEJo',
 'hWQiXv0sn9Y',
 'IYfgvS0FA7U',
 'mXQuto1fMp4',
 '5xnZ_CeTqyM',
 'RKjk0ECXjiQ',
 '4NNpkv3Us1I',
 '9fB0GBwJ2QA',
 'rs2RlZQVXBU',
 'g6iDZspbRMg',
 'LEcbagW4O-s',
 'LdhQzXHYLZ4',
 'QCjk_NPsIqU',
 'wrpeEitIEpA',
 'seGgZp-XYdM',
 '1ZAPwfrtAFY',
 '8bl19RoR7lc',
 'pf1t7cs9dkc',
 'mPjgRKW_Jmk',
 'J5b_-TZwQ0I',
 '_ECYMvjU52E',
 'ScmJvmzDcG0',
 '00wQYmvfhn4',
 '1ZNZY-gd3K0',
 'ZwY2E0hjGuU',
 'TrS0uNBuG9c',
 'NnW5EjwtE2U',
 'WyGq6cjcc3Q',
 '5cBV8KFFasY',
 'GvtNyOzGogc',
 '7VG_s2PCH_c',
 'aw6RsUhw1Q8',
 'fyVz5vgqBhE',
 '5scez5dqtAc',
 'FVFdsl29s_Q',
 'yw_nqzVfxFQ',
 'qI5y-_sqJT0',
 '92vuuZt7wak',
 'wD8AwgO0AQI',
 'hkZir1L7fSY',
 'A-4dIImaodQ',
 'BcR_Wg42dv8',
 'ySTQk6updjQ',
 '-Z668Qc0P4Q',
 'Ifi9M7DRazI',
 'bLY45o6rHm0',
 'YEGpriv2TAc',
 '0utzB6oDan0',
 'xecEV4dSAXE',
 'ekoETowzmAo',
 'cBUeipXFisQ',
 '-rSDUsMwakI',
 's6MwGeOm8iI',
 'Cy-O4myeUzg',
 'e0bMfS-_pjM',
 'o8yiYCHMAlM',
 '5pdPrQFjo2o',
 'k3O01EfM5fU',
 'KEbFtMgGhPY',
 'zaD84DTGULo',
 'h1Lfd1aB9YI',
 '_kZsOISarzg',
 '8l2Y6Z-maAU',
 'apumpVGBpP8',
 'XzCgVltuzEk',
 'l_htSPGAY7I',
 '4U2eDJnwz_s',
 '7-LPcVo7gC0',
 'bq2_wSsDwkQ',
 'BUCnjlTfXDw',
 '32n4h0kn-88',
 'zNdkrtfZP8I',
 'voxfqkrO5ww',
 'cQtYlkwR8lM',
 'IQwMCQFgQgo',
 'nh0ac5HUpDU',
 'BgyqAD5Z6_A',
 'iAgKHSNqxa8',
 'gvZSpET11ZY',
 'hxUAntt1z2c',
 'NcA_j23HuDU',
 '_S2G8jhhUHg',
 'A-XlyB_QQYs',
 '0Rnq1NpHdmw',
 'o5E7cG54VoA',
 'Tt-mpuR_QHQ',
 'GUizvEjR-0U',
 'aRrDsbUdY_k',
 'Ylomy1Aw9Hk',
 'fNS4lecOaAc',
 'vU8dCYocuyI',
 'dNV7COWz8ME',
 'zsjZ2r9Ygzw',
 '3saU5racsGE',
 'DnpO_RTSNmQ',
 'XebG4TO_xss',
 'DRauXXz6t0Y',
 'rHFOwlMCdto',
 'DgOgdGpWqzQ',
 'f0X-8tSgiuA',
 'pxM3tvHowaM',
 'GjatG8QFoOk',
 '_tyszHg96KI',
 'Mq785nJ0FXQ',
 'gJtYRxH5G2k',
 '5d3nASKtGas',
 '0V5ckcTSYu8',
 'jYusNNldesc',
 'NGY6DqB1HX8',
 'umqvYhb3wf4',
 'USkEzLuzmZ4',
 'CQ2noSR1qdY',
 'ACwenVzN2oU',
 '5d667Bb_iYA',
 '7y1xJAVZxXg',
 'L0jQz6jqQS0',
 '4Z4j2CrJRn4',
 'pDVmldTurqk',
 'i8xwLWb0lLY',
 'xcwJt4bcnXs',
 'Pk2oW4SDDxY',
 'EjpJqcM3X94',
 'U2WlQZf9zSg',
 'hmoAX9f6MOc',
 'PuNIwYsz7PI',
 'zmeF2rzsZSU',
 'IS5mwymTIJU',
 'qr6ar3xJL_Q',
 'W_gRZcI1lto',
 'X9wHzt6gBgI',
 'zIhKAQX5izw',
 'J6lyURyVz7k',
 'VdLf4fihP78',
 'UC_gXD5OE88',
 '3bxcc3SM_KA',
 'Nn_Zln_4pA8',
 'yzGzB-yYKcc',
 'fesi92d1p68',
 'XEVlyP4_11M',
 'uiN_-AEhTpk',
 'kXYXuXX48m8',
 '0UjpmT5noto',
 'pX8BXH3SJn0',
 'br0NW9ufUUw',
 'CesHr99ezWE',
 'Wpzvaqypav8',
 'poL7l-Uk3I8',
 '2sWRXr2Yu9g',
 '6UsHHOCH4q8',
 'l8QNDRbjong',
 'YQZ2UeOTO3I',
 '3FCioWz7aps',
 '2bbskco60g4',
 'fPloDzu_wcI',
 'P1EtSBxm0S4',
 'eAFnby2184o',
 'X8Buy2X0kFo',
 '9PK-netuhHA',
 'l9qA8c-E_oA',
 'x3Md3O_XfoI',
 'aIMgfBZrrZ8',
 'boI4D1FlIVs',
 'XXCbffp7jLM',
 'MepXBJjsNxs',
 'izUzqUrhbh0',
 'QplQL5eAxlY',
 'fJ9prhPV2PI',
 'tug71xZL7yc',
 'eKEwL-10s7E',
 'DeQqe0oj5Ls',
 '_8m8cQI4DgM',
 '3kEpZWGgJks',
 'ResvfWhi3k8',
 'K4NRJoCNHIs',
 'hkYzuHMcP64',
 'oDPCmmZifE8',
 'xM8qVuc32Rc',
 '-YkLPxQp_y0',
 'xAnw2atT628',
 'HNPRad65-Kg',
 'P8pjd1QEA0c',
 'InSJBzZr414',
 'knbw0gJHHBk',
 'PsB1e-1BB4Y',
 'fYWtbMb8Fhw',
 'KUdHIatS36A',
 'PDylgzybWAw',
 'E_F5GxCwizc',
 '44fCfJQV7yQ',
 'iFaRkscKn_Y',
 'b436uUuf_VI',
 '1Y1ya-yF35g',
 'HKMNKS-9ugY',
 '_Pz3syET3DY',
 'LfgSEwjAeno',
 'rrawNvcF64g',
 'dH573B1bkHI',
 '6clJRsPyuhc',
 'zSQCH1qyIDo',
 'G2W41pvvZs0',
 'QJkiWwMKwSo',
 'WA0wKeokWUU',
 'AJKfs4ZnbNE',
 'hkjkQ-wCZ5A',
 'OPV3D7f3bHY',
 'T8y5EXFMD4s',
 '4Otsft59HuA',
 '3lKYPp2Kp6s',
 'DlJEt2KU33I',
 'NUkjfd95bDM',
 'c3IaKVmkXuk',
 'fpbOEoRrHyU',
 'fDTX_mwj0_8',
 'w2jtOFfiF80',
 'r-ERajkMXw0',
 'EZdH94R6XwQ',
 'XV5w8OFrm6U',
 'IAR3cb1V_ss',
 'j6IZ2TroruU',
 '8YQ_HGvrHEU',
 'We1IvUe6KLo',
 'cjuGCJJUGsg',
 'LxQq8HTtb4A',
 'UkBvsCMxrNU',
 '6GLuuqrNqxQ',
 'Kye2oX-b39E',
 '75bWsqwwOgQ',
 'nJ24vcyJxDs',
 'W6JhcjbWEwg',
 'Bml8KwCmob8',
 'yCY6vYGEN3Q',
 '9vxT1uAhFxA',
 'mxyKJisUgJo',
 'Dh9munYYoqQ',
 'k8lJ85pfb_E',
 'WHCQndalv94',
 '8q7esuODnQI',
 'gdQCtWlhx90']

#grab metadata
video_meta = yt.get_video_metadata(df.video_id.tolist()[:5])

#visualize
pd.DataFrame(video_meta)

## Collect Comments
ids = df.video_id.tolist()[:5]

ids

['tWZAbKU-JzE', 't5lVEMpfRDI', 'P6grAoS-muM', 'esymd1F1cRY', 'qXiEGPWVjGU']

# loop
list_comments = []

for video_id in ids:
    comments = yt.get_video_comments(video_id, max_results=10)
    list_comments.append(pd.DataFrame(comments))

# concat
df = pd.concat(list_comments)
df.head()

Search¶

The youtube API also allows you to search for most popular videos using queries. This is very cool!

df = pd.DataFrame(yt.search(q='urnas fraude', max_results=10))
df.keys()
df[["channel_title", "video_title"]]

Some cool research using the Youtube API:

Lei et al, Estimating the Ideology of Political YouTube Videos
Brown et al, Echo Chambers, Rabbit Holes, and Algorithmic Bias: How YouTube Recommends Content to Real Users

!jupyter nbconvert _week-08_apis.ipynb --to html --template classic

[NbConvertApp] Converting notebook _week-08_apis.ipynb to html
[NbConvertApp] Writing 460171 bytes to _week-08_apis.html

	subscription_title	subscription_channel_id	subscription_kind	subscription_publish_date	collection_date
0	trueblood	UCPnlBOg4_NU9wdhRN-vzECQ	youtube#channel	1.395357e+09	2024-11-04 20:14:57.711685
1	GameofThrones	UCQzdMyuz0Lf4zo4uGcEujFw	youtube#channel	1.395357e+09	2024-11-04 20:14:57.711709
2	HBO	UCVTQuK2CaWaTgSsoNkn5AiQ	youtube#channel	1.395357e+09	2024-11-04 20:14:57.711728
3	HBOBoxing	UCWPQB43yGKEum3eW0P9N_nQ	youtube#channel	1.395357e+09	2024-11-04 20:14:57.711758
4	Cinemax	UCYbinjMxWwjRpp4WqgDqEDA	youtube#channel	1.424812e+09	2024-11-04 20:14:57.711776
5	HBODocs	UCbKo3HsaBOPhdRpgzqtRnqA	youtube#channel	1.395357e+09	2024-11-04 20:14:57.711794
6	HBOLatino	UCeKum6mhlVAjUFIW15mVBPg	youtube#channel	1.395357e+09	2024-11-04 20:14:57.711811
7	OfficialAmySedaris	UCicerXLHzJaKYHm1IwvTn8A	youtube#channel	1.461561e+09	2024-11-04 20:14:57.711828
8	Real Time with Bill Maher	UCy6kyFxaMqGtpE3pQTflK8A	youtube#channel	1.418342e+09	2024-11-04 20:14:57.711845

	video_id	channel_title	channel_id	video_publish_date	video_title	video_description	video_category	video_view_count	video_comment_count	video_like_count	video_dislike_count	video_thumbnail	collection_date
0	tWZAbKU-JzE	LastWeekTonight	UC3XTzVzaHQEd30rQbuvCtTQ	1.730723e+09	Election 2024: Last Week Tonight with John Oli...	A quick message ahead of Tuesday’s election. A...	24	2516789	11420	128113	None	https://i.ytimg.com/vi/tWZAbKU-JzE/hqdefault.jpg	2024-11-04 20:15:15.912716
1	t5lVEMpfRDI	LastWeekTonight	UC3XTzVzaHQEd30rQbuvCtTQ	1.730711e+09	S11 E28: Trump’s Businesses & Election 2024: 1...	John Oliver addresses undecided voters about K...	24	238712	857	9948	None	https://i.ytimg.com/vi/t5lVEMpfRDI/hqdefault.jpg	2024-11-04 20:15:15.912791
2	P6grAoS-muM	LastWeekTonight	UC3XTzVzaHQEd30rQbuvCtTQ	1.730387e+09	Lee Greenwood: Last Week Tonight with John Oli...	John Oliver discusses Lee Greenwood – the man ...	24	2406003	7340	88523	None	https://i.ytimg.com/vi/P6grAoS-muM/hqdefault.jpg	2024-11-04 20:15:15.912855
3	esymd1F1cRY	LastWeekTonight	UC3XTzVzaHQEd30rQbuvCtTQ	1.730099e+09	S11 E27: Mass Deportations & Lee Greenwood: 10...	John Oliver discusses Donald Trump’s plans to ...	24	396453	1403	14441	None	https://i.ytimg.com/vi/esymd1F1cRY/hqdefault.jpg	2024-11-04 20:15:15.912907
4	qXiEGPWVjGU	LastWeekTonight	UC3XTzVzaHQEd30rQbuvCtTQ	1.729494e+09	S6 E10: Lethal Injections, William Barr & Aust...	Season 6, episode 10. May 5th, 2019. John Oliv...	24	114653	183	2616	None	https://i.ytimg.com/vi/qXiEGPWVjGU/hqdefault.jpg	2024-11-04 20:15:15.912956

	video_id	commenter_channel_url	commenter_channel_id	commenter_channel_display_name	comment_id	comment_publish_date	text	commenter_rating	comment_parent_id	collection_date
0	tWZAbKU-JzE	http://www.youtube.com/@Alphadog9964	UCvIZU-3Fy2eoe_yiqcjXGzw	@Alphadog9964	UgzHwxVjKcrS-vkTjU14AaABAg	1.730787e+09	Thank you John, that was so well said.	none	None	2024-11-04 20:15:23.866875
1	tWZAbKU-JzE	http://www.youtube.com/@gleichg	UCLTfptuyoqzTZUHVpQs6_5A	@gleichg	Ugxs55-qoH2LCsoyeDt4AaABAg	1.730787e+09	I get the Arabic/Palestinian concern. But Je...	none	None	2024-11-04 20:15:23.866915
2	tWZAbKU-JzE	http://www.youtube.com/@matthewdonnelly3683	UCFKFfXJTCoJLDL8y4cEbDKg	@matthewdonnelly3683	Ugzz_WezClYsI7MLng14AaABAg	1.730787e+09	Why is this election close? I think all the me...	none	None	2024-11-04 20:15:23.866945
3	tWZAbKU-JzE	http://www.youtube.com/@unknownuser3000	UCkJ4J8IaUVn1IQHOwhb81eA	@unknownuser3000	UgxgSqd2fcxT1e_hsAV4AaABAg	1.730787e+09	Congratulations to Kamala in advance!	none	None	2024-11-04 20:15:23.866975
4	tWZAbKU-JzE	http://www.youtube.com/@segothgalont23	UCXDo9vjcBPJID1eJTRMeeJw	@segothgalont23	UgxNtNm94LyoOPLC5-14AaABAg	1.730787e+09	Well said.	none	None	2024-11-04 20:15:23.867001

	channel_title	video_title
0	Jornalismo TV Cultura	Possibilidade de fraude nas urnas eletrônicas ...
1	Canal Nostalgia	URNA ELETRÔNICA / Dá pra Hackear?
2	Band Jornalismo	Relatório não aponta fraude nas urnas
3	TV Brasil	Denúncias de fraude em urnas serão registradas...
4	Band Jornalismo	Auditorias descartam fraude em urnas eletrônicas
5	UOL	Fraude na urna eletrônica é improvável e nunca...
6	RedeTV	Eleitores reclamam de fraude nas urnas durante...
7	CNN Brasil	Pediram para eu adulterar urna com código-font...
8	Jornalismo TV Cultura	Ricardo Abramovay acredita que acusação de fra...
9	BBC News Brasil	Entenda 4 alegações falsas sobre fraude nas urnas

	type	difficulty	category	question	correct_answer	incorrect_answers
0	multiple	medium	General Knowledge	What is the highest number of Michelin stars a...	Three	[Four, Five, Six]
1	multiple	medium	General Knowledge	When was Nintendo founded?	September 23rd, 1889	[October 19th, 1891, March 4th, 1887, December...
2	multiple	easy	General Knowledge	Which of the following blood component forms a...	Platelets	[Red blood cells, White blood cells, Blood pla...
3	multiple	medium	General Knowledge	On average, Americans consume 100 pounds of wh...	Chocolate	[Potatoes, Donuts, Cocaine]
4	multiple	medium	General Knowledge	What is the romanized Japanese word for "...	Daigaku	[Toshokan, Jimusho, Shokudou]
5	multiple	medium	General Knowledge	Which river flows through the Scottish city of...	Clyde	[Tay, Dee, Tweed]
6	multiple	easy	General Knowledge	What machine element is located in the center ...	Bearings	[Axles, Gears, Belts]
7	multiple	easy	General Knowledge	What is the official language of Brazil?	Portugese	[Brazilian, Spanish, English]
8	multiple	medium	General Knowledge	Which Italian automobile manufacturer gained m...	Fiat	[Maserati, Alfa Romeo, Ferrari]
9	multiple	easy	General Knowledge	What zodiac sign is represented by a pair of s...	Libra	[Aries, Capricorn, Sagittarius]

PPOL 5203 Data Science I: Foundations Collecting Digital Data - API Tiago Ventura

Learning Goals¶

APIs 101¶

API Use-Cases¶

APIs Components¶

Requests to APIs¶

Example 1: Open Trivia API¶

Querying an API: Step-by-Step¶

Step 1: Documentation, Endpoints and Query.¶

Step 2: Use requests.get(querystring) to call the API¶

Step 3: Examine the response¶

Step 4: Extract your data.¶

Processing JSONs¶

Exploring API Filters¶

Filter with the full API cal¶

Or using dictionaries¶

Practice:¶

Example 2: Yelp API.¶

Authentication with an API¶

Acquiring credentials with Yelp Fusion API¶

How to save the API keys/token?¶

Querying the API¶

Step 0: Load your API Keys¶

Step 1: Look at the API documentation and endpoints, and construct a query of interest¶

Step 2: Use requests.get(endpoint) to call the API¶

Step 3: Examine the response¶

Step 4: Extract your data and save it.¶

Approach 1: Convert all to dataframe and clean it later¶

Approach 2: write a function to collect the information you need¶

Save the json¶

Practice¶

Example 3 : YouTube API¶

What kind of data can you get from the Youtube API?¶

How to Install¶

How to get an API key¶

A quick guide: https://developers.google.com/youtube/v3/getting-started¶

Starting with a channel name and getting some basic metadata¶

Channel metadata¶

Subscriptions of the channel.¶

List of videos of the channel¶

Collect video metadata¶

Search¶

PPOL 5203 Data Science I: Foundations

Collecting Digital Data - API

Tiago Ventura

Step 2: Use `requests.get(querystring)` to call the API¶

A quick guide: https://developers.google.com/youtube/v3/getting-started ¶