import pandas as pd
import numpy as np
from sqlalchemy import create_engine
import sqlite3

conn = sqlite3.connect("../../lodes.db")

Reading SQL tables using Pandas#

Now that we’ve created our engine to connect to the database, we can use the read_sql function in pandas to write SQL queries and get tables out as DataFrames.

df = pd.read_sql("SELECT * FROM ca_wac_2015",conn)

Here, pd.read_sql() outputs the table that the SQL query that we wrote as a string would return. In this case, it’s simply the ca_wac_2015 table. Of course, you can include more complicated queries, such as joins, if you’d like.

Let’s look at the data to make sure we got what we wanted.

df.head()
w_geocode c000 ca01 ca02 ca03 ce01 ce02 ce03 cns01 cns02 ... cfa03 cfa04 cfa05 cfs01 cfs02 cfs03 cfs04 cfs05 createdate over100
0 060014001001007 30 2 16 12 4 2 24 0 0 ... 0 0 0 0 0 0 0 0 20170919 0
1 060014001001008 4 0 1 3 0 0 4 0 0 ... 0 0 0 0 0 0 0 0 20170919 0
2 060014001001011 3 2 1 0 0 3 0 0 0 ... 0 0 0 0 0 0 0 0 20170919 0
3 060014001001017 11 3 3 5 2 2 7 0 0 ... 0 0 0 0 0 0 0 0 20170919 0
4 060014001001024 10 3 3 4 7 1 2 0 0 ... 0 0 0 0 0 0 0 0 20170919 0

5 rows × 54 columns

We can use more complicated SQL queries as well, bringing in portions of tables. This is more often what you’ll want to do, as tables in SQL can get quite large, and you may want data that has already been joined together.

query = """
SELECT w_geocode, c000, cbsaname, ctyname 
FROM ca_wac_2015 a
JOIN ca_xwalk b
ON a.w_geocode = b.tabblk2010;
"""

df = pd.read_sql(query,conn)

Here, we are first constructing a query (using the triple quotation marks to create the string over multiple lines for readability), then putting that query into the read_sql() function, which outputs the result of the query as a Data Frame. Let’s take a quick look at it.

df.head()
w_geocode c000 cbsaname ctyname
0 060014001001007 30 San Francisco-Oakland-Hayward, CA Alameda County, CA
1 060014001001008 4 San Francisco-Oakland-Hayward, CA Alameda County, CA
2 060014001001011 3 San Francisco-Oakland-Hayward, CA Alameda County, CA
3 060014001001017 11 San Francisco-Oakland-Hayward, CA Alameda County, CA
4 060014001001024 10 San Francisco-Oakland-Hayward, CA Alameda County, CA