{ "cells": [ { "cell_type": "markdown", "id": "4e380a95", "metadata": {}, "source": [ "

PPOL 5203 Data Science I: Foundations

\n", " Working with Databases

\n", "Tiago Ventura

\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "dd4eed89", "metadata": {}, "source": [ "## Learning Goals\n", "\n", "This notebook will cover the basics of working with databases. \n", "\n", "To do so, we will be working with a database called `country.sqlite` which you can download [here](https://www.dropbox.com/scl/fi/j6x5iykeudjyujp263sfz/country.sqlite?rlkey=g4ztum39u6pi74sp1f56rreir&dl=0). This database was developed by Dr. Eric Dunford for a previous iteration of this class. It contains a set of country-level information over years, and spread across different tables. \n" ] }, { "cell_type": "markdown", "id": "4a6ce269", "metadata": {}, "source": [ "### Using SQL through Pandas. \n", "\n", "As we saw in the slides, we can use SQL in Python and build a connection with database systems through pandas. Here are te steps: \n", "\n", "- Establish a connection between your local python environment and the database stored locally\n", "\n", "- Write a SQL query\n", "\n", "- Execute the query\n", "\n", "- Pull the result and work with the result in that language\n", "\n", "Let's see this in practice" ] }, { "cell_type": "code", "execution_count": 1, "id": "f727ab86", "metadata": {}, "outputs": [], "source": [ "## import pandas and sqlite3\n", "import pandas as pd\n", "import sqlite3" ] }, { "cell_type": "code", "execution_count": 2, "id": "807debda", "metadata": {}, "outputs": [], "source": [ "# Establish a connection with the database\n", "conn = sqlite3.connect(\"country.sqlite\")" ] }, { "cell_type": "code", "execution_count": 3, "id": "d01e50af", "metadata": {}, "outputs": [], "source": [ "# write a query\n", "query = \"select * from econ\"" ] }, { "cell_type": "code", "execution_count": 5, "id": "95109eb7", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select * from econ'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# just text, wuery\n", "query" ] }, { "cell_type": "code", "execution_count": 13, "id": "00ca005a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearln_gdppcln_pop
0Algeria615.01997.07.38523717.212929
1Algeria615.01998.07.37053417.227837
2Algeria615.01999.07.36581617.241908
3Algeria615.02000.07.47137317.255405
4Algeria615.02001.07.45752717.268419
..................
910Zimbabwe552.02012.06.87540116.504094
911Zimbabwe552.02013.06.93380116.527188
912Zimbabwe552.02014.06.93838616.550636
913Zimbabwe552.02015.06.94062516.574092
914Zimbabwe552.02016.06.93641716.597453
\n", "

915 rows × 5 columns

\n", "
" ], "text/plain": [ " country ccode year ln_gdppc ln_pop\n", "0 Algeria 615.0 1997.0 7.385237 17.212929\n", "1 Algeria 615.0 1998.0 7.370534 17.227837\n", "2 Algeria 615.0 1999.0 7.365816 17.241908\n", "3 Algeria 615.0 2000.0 7.471373 17.255405\n", "4 Algeria 615.0 2001.0 7.457527 17.268419\n", ".. ... ... ... ... ...\n", "910 Zimbabwe 552.0 2012.0 6.875401 16.504094\n", "911 Zimbabwe 552.0 2013.0 6.933801 16.527188\n", "912 Zimbabwe 552.0 2014.0 6.938386 16.550636\n", "913 Zimbabwe 552.0 2015.0 6.940625 16.574092\n", "914 Zimbabwe 552.0 2016.0 6.936417 16.597453\n", "\n", "[915 rows x 5 columns]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Use the connection with the read_sql() method\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "bc878747", "metadata": {}, "source": [ "As noticed before, the output is always a table. From now on, I will show you some different SQL commands and its applications " ] }, { "cell_type": "markdown", "id": "b43d3ced", "metadata": {}, "source": [ "#### listing all tables of your connection" ] }, { "cell_type": "code", "execution_count": 6, "id": "e27858ac", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
name
0regime
1life
2econ
3conflict
4continents
\n", "
" ], "text/plain": [ " name\n", "0 regime\n", "1 life\n", "2 econ\n", "3 conflict\n", "4 continents" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.read_sql(\"select name from sqlite_master where type='table'\",conn)" ] }, { "cell_type": "markdown", "id": "a6166fa4", "metadata": {}, "source": [ "#### Head top of the data" ] }, { "cell_type": "code", "execution_count": 7, "id": "fdff6e4a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearinfant_mortlife_exp
0Algeria615.01997.035.569.323
1Algeria615.01998.034.969.745
2Algeria615.01999.034.470.183
3Algeria615.02000.033.970.640
4Algeria615.02001.033.371.116
\n", "
" ], "text/plain": [ " country ccode year infant_mort life_exp\n", "0 Algeria 615.0 1997.0 35.5 69.323\n", "1 Algeria 615.0 1998.0 34.9 69.745\n", "2 Algeria 615.0 1999.0 34.4 70.183\n", "3 Algeria 615.0 2000.0 33.9 70.640\n", "4 Algeria 615.0 2001.0 33.3 71.116" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# build query\n", "query = \"\"\" \n", " select * \n", " from life \n", " limit 5 \n", " \"\"\"\n", "\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "f308212e", "metadata": {}, "source": [ "#### Select specific variables " ] }, { "cell_type": "code", "execution_count": 8, "id": "19f1771c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryyearpolity
0Algeria1997.0-3.0
1Algeria1998.0-3.0
2Algeria1999.0-3.0
3Algeria2000.0-3.0
4Algeria2001.0-3.0
\n", "
" ], "text/plain": [ " country year polity\n", "0 Algeria 1997.0 -3.0\n", "1 Algeria 1998.0 -3.0\n", "2 Algeria 1999.0 -3.0\n", "3 Algeria 2000.0 -3.0\n", "4 Algeria 2001.0 -3.0" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# build query\n", "query = \"\"\"select country, year, polity \n", " from regime \n", " limit 5\"\"\"\n", "\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "fd3ed594", "metadata": {}, "source": [ "## Row-wise manipulations" ] }, { "cell_type": "markdown", "id": "a1d6f9f8", "metadata": {}, "source": [ "#### arranging" ] }, { "cell_type": "code", "execution_count": 11, "id": "7be2c411", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearriots
0Algeria6151997.00
1Algeria6151997.00
2Algeria6151997.00
3Algeria6151997.00
4Algeria6151997.00
\n", "
" ], "text/plain": [ " country ccode year riots\n", "0 Algeria 615 1997.0 0\n", "1 Algeria 615 1997.0 0\n", "2 Algeria 615 1997.0 0\n", "3 Algeria 615 1997.0 0\n", "4 Algeria 615 1997.0 0" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# build query\n", "query = \"\"\"select country, ccode, year, riots \n", " from conflict \n", " order by riots\"\"\"\n", "\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "8675d695", "metadata": {}, "source": [ "#### Arranging (descending order)" ] }, { "cell_type": "code", "execution_count": 12, "id": "09c5f416", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearriots
0Egypt6512013.0124
1Burundi5162015.0120
2Egypt6512014.0120
3Tunisia6162018.0105
4Kenya5012007.0102
\n", "
" ], "text/plain": [ " country ccode year riots\n", "0 Egypt 651 2013.0 124\n", "1 Burundi 516 2015.0 120\n", "2 Egypt 651 2014.0 120\n", "3 Tunisia 616 2018.0 105\n", "4 Kenya 501 2007.0 102" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# build query\n", "\n", "query = \"\"\"select country, ccode, year, riots \n", " from conflict \n", " order by riots desc\"\"\"\n", "\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "e1c13bc7", "metadata": {}, "source": [ "#### Filtering " ] }, { "cell_type": "code", "execution_count": 13, "id": "bcec6116", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearriots
0Nigeria4751997.03
1Nigeria4751997.03
2Nigeria4751997.04
3Nigeria4751997.01
4Nigeria4751997.04
\n", "
" ], "text/plain": [ " country ccode year riots\n", "0 Nigeria 475 1997.0 3\n", "1 Nigeria 475 1997.0 3\n", "2 Nigeria 475 1997.0 4\n", "3 Nigeria 475 1997.0 1\n", "4 Nigeria 475 1997.0 4" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"\"\"select country, ccode, year, riots \n", " from conflict \n", " where country = 'Nigeria' \"\"\"\n", "\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "89d118d7", "metadata": {}, "source": [ "#### Filtering multiple conditions (and)" ] }, { "cell_type": "code", "execution_count": 16, "id": "50d24940", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearriots
0Nigeria4752000.02
1Nigeria4752000.01
2Nigeria4752000.06
3Nigeria4752000.02
4Nigeria4752000.04
\n", "
" ], "text/plain": [ " country ccode year riots\n", "0 Nigeria 475 2000.0 2\n", "1 Nigeria 475 2000.0 1\n", "2 Nigeria 475 2000.0 6\n", "3 Nigeria 475 2000.0 2\n", "4 Nigeria 475 2000.0 4" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query= \"\"\"\n", "select country, ccode, year, riots \n", " from conflict \n", " where country = 'Nigeria' and year >= 2000 \n", "\"\"\"\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "b377e79a", "metadata": {}, "source": [ "#### Filtering with OR" ] }, { "cell_type": "code", "execution_count": 17, "id": "c1f2df77", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearriotsprotests
0Algeria6152001.0231
1Algeria6152001.0329
2Algeria6152001.0262
3Algeria6152011.08219
4Algeria6152011.01223
\n", "
" ], "text/plain": [ " country ccode year riots protests\n", "0 Algeria 615 2001.0 23 1\n", "1 Algeria 615 2001.0 32 9\n", "2 Algeria 615 2001.0 26 2\n", "3 Algeria 615 2011.0 82 19\n", "4 Algeria 615 2011.0 12 23" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select country, ccode, year, riots, protests \\\n", " from conflict \\\n", " where riots > 20 or protests > 20 \"\n", "\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "b6cd797b", "metadata": {}, "source": [ "## Column-Wise Operations" ] }, { "cell_type": "markdown", "id": "9af7141b", "metadata": {}, "source": [ "#### Renaming variables" ] }, { "cell_type": "code", "execution_count": 18, "id": "8661fbf2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
COUNTRY_NAMEYEARPOLITY
0Algeria1997.0-3.0
1Algeria1998.0-3.0
2Algeria1999.0-3.0
3Algeria2000.0-3.0
4Algeria2001.0-3.0
5Algeria2002.0-3.0
6Algeria2003.0-3.0
7Algeria2004.02.0
8Algeria2005.02.0
9Algeria2006.02.0
10Algeria2007.02.0
11Algeria2008.02.0
12Algeria2009.02.0
13Algeria2010.02.0
14Algeria2011.02.0
15Algeria2012.02.0
16Algeria2013.02.0
17Algeria2014.02.0
18Algeria2015.02.0
19Algeria2016.02.0
\n", "
" ], "text/plain": [ " COUNTRY_NAME YEAR POLITY\n", "0 Algeria 1997.0 -3.0\n", "1 Algeria 1998.0 -3.0\n", "2 Algeria 1999.0 -3.0\n", "3 Algeria 2000.0 -3.0\n", "4 Algeria 2001.0 -3.0\n", "5 Algeria 2002.0 -3.0\n", "6 Algeria 2003.0 -3.0\n", "7 Algeria 2004.0 2.0\n", "8 Algeria 2005.0 2.0\n", "9 Algeria 2006.0 2.0\n", "10 Algeria 2007.0 2.0\n", "11 Algeria 2008.0 2.0\n", "12 Algeria 2009.0 2.0\n", "13 Algeria 2010.0 2.0\n", "14 Algeria 2011.0 2.0\n", "15 Algeria 2012.0 2.0\n", "16 Algeria 2013.0 2.0\n", "17 Algeria 2014.0 2.0\n", "18 Algeria 2015.0 2.0\n", "19 Algeria 2016.0 2.0" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select \\\n", " country as COUNTRY_NAME, \\\n", " year as YEAR, \\\n", " polity as POLITY \\\n", " from regime \\\n", " limit 20\"\n", "\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "3e9339a5", "metadata": {}, "source": [ "#### Mutating " ] }, { "cell_type": "code", "execution_count": 19, "id": "3144e992", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearriotsprotestsdisplays
0Egypt6512011.055270325
1Algeria6152019.00302302
2Tunisia6162019.019257276
3Egypt6512013.0100175275
4Ethiopia5302015.06253259
\n", "
" ], "text/plain": [ " country ccode year riots protests displays\n", "0 Egypt 651 2011.0 55 270 325\n", "1 Algeria 615 2019.0 0 302 302\n", "2 Tunisia 616 2019.0 19 257 276\n", "3 Egypt 651 2013.0 100 175 275\n", "4 Ethiopia 530 2015.0 6 253 259" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select country, ccode, year, riots, protests, \\\n", " riots + protests as displays \\\n", " from conflict \\\n", " order by displays desc\"\n", "\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "34220bc8", "metadata": {}, "source": [ "#### Mutating with ifelse logic " ] }, { "cell_type": "code", "execution_count": 20, "id": "320f5eef", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearprotestsprotest_onset
0Algeria6151997.00No Protest
1Algeria6151997.00No Protest
2Algeria6151997.00No Protest
3Algeria6151997.00No Protest
4Algeria6151997.00No Protest
..................
8699Zimbabwe5522019.01Protest
8700Zimbabwe5522019.05Protest
8701Zimbabwe5522019.010Protest
8702Zimbabwe5522019.06Protest
8703Zimbabwe5522019.04Protest
\n", "

8704 rows × 5 columns

\n", "
" ], "text/plain": [ " country ccode year protests protest_onset\n", "0 Algeria 615 1997.0 0 No Protest\n", "1 Algeria 615 1997.0 0 No Protest\n", "2 Algeria 615 1997.0 0 No Protest\n", "3 Algeria 615 1997.0 0 No Protest\n", "4 Algeria 615 1997.0 0 No Protest\n", "... ... ... ... ... ...\n", "8699 Zimbabwe 552 2019.0 1 Protest\n", "8700 Zimbabwe 552 2019.0 5 Protest\n", "8701 Zimbabwe 552 2019.0 10 Protest\n", "8702 Zimbabwe 552 2019.0 6 Protest\n", "8703 Zimbabwe 552 2019.0 4 Protest\n", "\n", "[8704 rows x 5 columns]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select country, ccode, year, protests, \\\n", " (case when protests > 0 then 'Protest' \\\n", " else 'No Protest' end) as protest_onset \\\n", " from conflict \"\n", "\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "70974e31", "metadata": {}, "source": [ "#### Mutating the variable type" ] }, { "cell_type": "code", "execution_count": 21, "id": "2070e934", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearriotsprotests
0Egypt6512013124.0123
1Burundi5162015120.083
2Egypt6512014120.075
3Tunisia6162018105.053
4Kenya5012007102.017
\n", "
" ], "text/plain": [ " country ccode year riots protests\n", "0 Egypt 651 2013 124.0 123\n", "1 Burundi 516 2015 120.0 83\n", "2 Egypt 651 2014 120.0 75\n", "3 Tunisia 616 2018 105.0 53\n", "4 Kenya 501 2007 102.0 17" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select country, ccode, \\\n", " cast(year as integer) as year, \\\n", " cast(riots as float) as riots, \\\n", " protests \\\n", " from conflict \\\n", " order by riots desc \\\n", " limit 5\"\n", "\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "02358b53", "metadata": {}, "source": [ "## Summarization and Group by" ] }, { "cell_type": "markdown", "id": "f7c7f2c1", "metadata": {}, "source": [ "#### Distinct Values " ] }, { "cell_type": "code", "execution_count": 23, "id": "da20d234", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
country
0Algeria
1Angola
2Benin
3Burkina Faso
4Botswana
\n", "
" ], "text/plain": [ " country\n", "0 Algeria\n", "1 Angola\n", "2 Benin\n", "3 Burkina Faso\n", "4 Botswana" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select \\\n", " distinct country \\\n", " from econ \"\n", "\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "6c1a55e8", "metadata": {}, "source": [ "#### Counting N observations" ] }, { "cell_type": "code", "execution_count": 24, "id": "f68f3fbd", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
n_obs
08704
\n", "
" ], "text/plain": [ " n_obs\n", "0 8704" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select \\\n", " count(*) as n_obs \\\n", " from conflict \"\n", "\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "362daab3", "metadata": {}, "source": [ "#### Counting N distinct" ] }, { "cell_type": "code", "execution_count": 25, "id": "230e766e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
n_countries
049
\n", "
" ], "text/plain": [ " n_countries\n", "0 49" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select \\\n", "count(distinct country) as n_countries \\\n", "from conflict \"\n", "\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "496e868a", "metadata": {}, "source": [ "#### Numerical Summaries" ] }, { "cell_type": "code", "execution_count": 26, "id": "0558c9d5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ave_riotstotal_riotsmin_riotsmax_riots
02.46213720124
\n", "
" ], "text/plain": [ " ave_riots total_riots min_riots max_riots\n", "0 2.46 21372 0 124" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select \\\n", " round(avg(riots),2) as ave_riots, \\\n", " sum(riots) as total_riots, \\\n", " min(riots) as min_riots, \\\n", " max(riots) as max_riots \\\n", " from conflict \"\n", "\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "20ea9de0", "metadata": {}, "source": [ "#### GROUP BY + Summarization" ] }, { "cell_type": "code", "execution_count": 28, "id": "d056deff", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryyearriotsprotests
0Algeria1997.000
1Algeria1998.001
2Algeria1999.000
3Algeria2000.002
4Algeria2001.09517
\n", "
" ], "text/plain": [ " country year riots protests\n", "0 Algeria 1997.0 0 0\n", "1 Algeria 1998.0 0 1\n", "2 Algeria 1999.0 0 0\n", "3 Algeria 2000.0 0 2\n", "4 Algeria 2001.0 95 17" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select \\\n", " country, \\\n", " year, \\\n", " sum(riots) as riots, \\\n", " sum(protests) as protests \\\n", "from conflict \\\n", "group by country, year \"\n", "\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "4d526722", "metadata": {}, "source": [ "#### GROUP BY + Counting " ] }, { "cell_type": "code", "execution_count": 30, "id": "5d41a331", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryn
0Algeria272
1Angola193
2Benin88
3Botswana48
4Burkina Faso146
\n", "
" ], "text/plain": [ " country n\n", "0 Algeria 272\n", "1 Angola 193\n", "2 Benin 88\n", "3 Botswana 48\n", "4 Burkina Faso 146" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"select country, count(*) as n \\\n", " from conflict \\\n", " group by country \"\n", "\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "f01d604b", "metadata": {}, "source": [ "### Subqueries\n", "\n", "Subqueries allows us to combine several different operations across multiple tables" ] }, { "cell_type": "code", "execution_count": 32, "id": "c12aa4a6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryyeartypen_events
0Algeria1997.0bombings11
1Algeria1997.0bombings0
2Algeria1997.0bombings2
3Algeria1997.0bombings0
4Algeria1997.0bombings1
\n", "
" ], "text/plain": [ " country year type n_events\n", "0 Algeria 1997.0 bombings 11\n", "1 Algeria 1997.0 bombings 0\n", "2 Algeria 1997.0 bombings 2\n", "3 Algeria 1997.0 bombings 0\n", "4 Algeria 1997.0 bombings 1" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \" WITH algeria_a as ( \\\n", " SELECT \\\n", " country,\\\n", " year,\\\n", " 'bombings' as type,\\\n", " explosions_remote_violence as n_events\\\n", " FROM conflict\\\n", " WHERE country = 'Algeria'),\\\n", " algeria_b as (\\\n", " SELECT \\\n", " country,\\\n", " year,\\\n", " 'vac' as type,\\\n", " violence_against_civilians as n_events\\\n", " FROM conflict\\\n", " WHERE country = 'Algeria')\\\n", " select * from algeria_a\\\n", " UNION ALL\\\n", " select * from algeria_b\\\n", "\"\n", "#query database\n", "pd.read_sql(query,conn).head()" ] }, { "cell_type": "markdown", "id": "e8d47560", "metadata": {}, "source": [ "## Joins" ] }, { "cell_type": "markdown", "id": "97946358", "metadata": {}, "source": [ "### INNER Joining Two Table" ] }, { "cell_type": "code", "execution_count": 34, "id": "a330a9fd", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearinfant_mortlife_expln_gdppc
0Algeria615.01997.035.569.3237.385237
1Algeria615.01998.034.969.7457.370534
2Algeria615.01999.034.470.1837.365816
3Algeria615.02000.033.970.6407.471373
4Algeria615.02001.033.371.1167.457527
.....................
910Zimbabwe552.02012.045.755.0326.875401
911Zimbabwe552.02013.042.856.8976.933801
912Zimbabwe552.02014.040.458.4106.938386
913Zimbabwe552.02015.038.559.5346.940625
914Zimbabwe552.02016.036.360.2946.936417
\n", "

915 rows × 6 columns

\n", "
" ], "text/plain": [ " country ccode year infant_mort life_exp ln_gdppc\n", "0 Algeria 615.0 1997.0 35.5 69.323 7.385237\n", "1 Algeria 615.0 1998.0 34.9 69.745 7.370534\n", "2 Algeria 615.0 1999.0 34.4 70.183 7.365816\n", "3 Algeria 615.0 2000.0 33.9 70.640 7.471373\n", "4 Algeria 615.0 2001.0 33.3 71.116 7.457527\n", ".. ... ... ... ... ... ...\n", "910 Zimbabwe 552.0 2012.0 45.7 55.032 6.875401\n", "911 Zimbabwe 552.0 2013.0 42.8 56.897 6.933801\n", "912 Zimbabwe 552.0 2014.0 40.4 58.410 6.938386\n", "913 Zimbabwe 552.0 2015.0 38.5 59.534 6.940625\n", "914 Zimbabwe 552.0 2016.0 36.3 60.294 6.936417\n", "\n", "[915 rows x 6 columns]" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"\"\"select \n", " life.*,\n", " econ.ln_gdppc\n", "from life\n", "inner join econ on (life.ccode = econ.ccode and life.year=econ.year)\n", "\"\"\"\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "300f4988", "metadata": {}, "source": [ "#### using aliases when joining" ] }, { "cell_type": "code", "execution_count": 35, "id": "e294aebb", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearinfant_mortlife_expln_gdppc
0Algeria615.01997.035.569.3237.385237
1Algeria615.01998.034.969.7457.370534
2Algeria615.01999.034.470.1837.365816
3Algeria615.02000.033.970.6407.471373
4Algeria615.02001.033.371.1167.457527
.....................
910Zimbabwe552.02012.045.755.0326.875401
911Zimbabwe552.02013.042.856.8976.933801
912Zimbabwe552.02014.040.458.4106.938386
913Zimbabwe552.02015.038.559.5346.940625
914Zimbabwe552.02016.036.360.2946.936417
\n", "

915 rows × 6 columns

\n", "
" ], "text/plain": [ " country ccode year infant_mort life_exp ln_gdppc\n", "0 Algeria 615.0 1997.0 35.5 69.323 7.385237\n", "1 Algeria 615.0 1998.0 34.9 69.745 7.370534\n", "2 Algeria 615.0 1999.0 34.4 70.183 7.365816\n", "3 Algeria 615.0 2000.0 33.9 70.640 7.471373\n", "4 Algeria 615.0 2001.0 33.3 71.116 7.457527\n", ".. ... ... ... ... ... ...\n", "910 Zimbabwe 552.0 2012.0 45.7 55.032 6.875401\n", "911 Zimbabwe 552.0 2013.0 42.8 56.897 6.933801\n", "912 Zimbabwe 552.0 2014.0 40.4 58.410 6.938386\n", "913 Zimbabwe 552.0 2015.0 38.5 59.534 6.940625\n", "914 Zimbabwe 552.0 2016.0 36.3 60.294 6.936417\n", "\n", "[915 rows x 6 columns]" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"\"\"select \n", " a.*,\n", " b.ln_gdppc\n", "from life a\n", "inner join econ b on (a.ccode = b.ccode and a.year=b.year)\n", "\"\"\"\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "eab409de", "metadata": {}, "source": [ "#### LEFT Joining Two Table" ] }, { "cell_type": "code", "execution_count": 36, "id": "19d0849a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearinfant_mortlife_expln_gdppc
0Algeria615.01997.035.569.3237.385237
1Algeria615.01998.034.969.7457.370534
2Algeria615.01999.034.470.1837.365816
3Algeria615.02000.033.970.6407.471373
4Algeria615.02001.033.371.1167.457527
.....................
910Zimbabwe552.02012.045.755.0326.875401
911Zimbabwe552.02013.042.856.8976.933801
912Zimbabwe552.02014.040.458.4106.938386
913Zimbabwe552.02015.038.559.5346.940625
914Zimbabwe552.02016.036.360.2946.936417
\n", "

915 rows × 6 columns

\n", "
" ], "text/plain": [ " country ccode year infant_mort life_exp ln_gdppc\n", "0 Algeria 615.0 1997.0 35.5 69.323 7.385237\n", "1 Algeria 615.0 1998.0 34.9 69.745 7.370534\n", "2 Algeria 615.0 1999.0 34.4 70.183 7.365816\n", "3 Algeria 615.0 2000.0 33.9 70.640 7.471373\n", "4 Algeria 615.0 2001.0 33.3 71.116 7.457527\n", ".. ... ... ... ... ... ...\n", "910 Zimbabwe 552.0 2012.0 45.7 55.032 6.875401\n", "911 Zimbabwe 552.0 2013.0 42.8 56.897 6.933801\n", "912 Zimbabwe 552.0 2014.0 40.4 58.410 6.938386\n", "913 Zimbabwe 552.0 2015.0 38.5 59.534 6.940625\n", "914 Zimbabwe 552.0 2016.0 36.3 60.294 6.936417\n", "\n", "[915 rows x 6 columns]" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"\"\"select \n", " a.*,\n", " b.ln_gdppc\n", "from life a\n", "left join econ b on (a.ccode = b.ccode and a.year=b.year)\n", "\"\"\"\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "markdown", "id": "248d05a0", "metadata": {}, "source": [ "#### Joining on more than one table" ] }, { "cell_type": "code", "execution_count": 37, "id": "38ad81a6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryccodeyearinfant_mortlife_expln_gdppcregime_type
0Algeria615.01997.035.569.3237.385237competitive authoritarian
1Algeria615.01998.034.969.7457.370534competitive authoritarian
2Algeria615.01999.034.470.1837.365816competitive authoritarian
3Algeria615.02000.033.970.6407.471373competitive authoritarian
4Algeria615.02001.033.371.1167.457527competitive authoritarian
........................
910Zimbabwe552.02012.045.755.0326.875401competitive authoritarian
911Zimbabwe552.02013.042.856.8976.933801competitive authoritarian
912Zimbabwe552.02014.040.458.4106.938386competitive authoritarian
913Zimbabwe552.02015.038.559.5346.940625competitive authoritarian
914Zimbabwe552.02016.036.360.2946.936417competitive authoritarian
\n", "

915 rows × 7 columns

\n", "
" ], "text/plain": [ " country ccode year infant_mort life_exp ln_gdppc \\\n", "0 Algeria 615.0 1997.0 35.5 69.323 7.385237 \n", "1 Algeria 615.0 1998.0 34.9 69.745 7.370534 \n", "2 Algeria 615.0 1999.0 34.4 70.183 7.365816 \n", "3 Algeria 615.0 2000.0 33.9 70.640 7.471373 \n", "4 Algeria 615.0 2001.0 33.3 71.116 7.457527 \n", ".. ... ... ... ... ... ... \n", "910 Zimbabwe 552.0 2012.0 45.7 55.032 6.875401 \n", "911 Zimbabwe 552.0 2013.0 42.8 56.897 6.933801 \n", "912 Zimbabwe 552.0 2014.0 40.4 58.410 6.938386 \n", "913 Zimbabwe 552.0 2015.0 38.5 59.534 6.940625 \n", "914 Zimbabwe 552.0 2016.0 36.3 60.294 6.936417 \n", "\n", " regime_type \n", "0 competitive authoritarian \n", "1 competitive authoritarian \n", "2 competitive authoritarian \n", "3 competitive authoritarian \n", "4 competitive authoritarian \n", ".. ... \n", "910 competitive authoritarian \n", "911 competitive authoritarian \n", "912 competitive authoritarian \n", "913 competitive authoritarian \n", "914 competitive authoritarian \n", "\n", "[915 rows x 7 columns]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "query = \"\"\"select \n", " a.*,\n", " b.ln_gdppc,\n", " c.regime_type\n", "from life a\n", "left join econ b on (a.ccode = b.ccode and a.year=b.year)\n", "left join regime c on (a.ccode = c.ccode and a.year=c.year)\n", "\"\"\"\n", "#query database\n", "pd.read_sql(query,conn)" ] }, { "cell_type": "code", "execution_count": 38, "id": "da7f4d5b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[NbConvertApp] Converting notebook _week_13_sql.ipynb to html\n", "[NbConvertApp] Writing 347002 bytes to _week_13_sql.html\n" ] } ], "source": [ "!jupyter nbconvert _week_13_sql.ipynb --to html --template classic" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 5 }