{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Data\n",
    "\n",
    "In this book, we will use a set of public datasets from the Longitudinal Employer Household Dynamic (LEHD) data provided by the United States Census Bureau. In particular, we will use the LEHD Origin-Destination Employment Statistics (LODES) data. These data are based on tabulated administrative data and give information about workplaces and residences of workers at the census block level. There are four main types of data that we will use.\n",
    "- **Workplace Area Characteristics (WAC):** Census block level. Job totals for workplaces in the census block.\n",
    "- **Residence Area Characteristics (RAC):** Census block level. Job totals for residences in the census block.\n",
    "- **Origin-Destination (OD):** Origin census block - Destination census block pair level. \n",
    "- **Crosswalk (xwalk):** Census block level. Contains all census blocks within that state, and contains information about that census block (e.g. city, county)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Workplace Area Characteristics (WAC) and Residence Area Characteristics (RAC)\n",
    "\n",
    "The WAC and RAC data generally look something like the following:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "tags": [
     "hide_input"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>w_geocode</th>\n",
       "      <th>C000</th>\n",
       "      <th>CA01</th>\n",
       "      <th>CA02</th>\n",
       "      <th>CA03</th>\n",
       "      <th>CE01</th>\n",
       "      <th>CE02</th>\n",
       "      <th>CE03</th>\n",
       "      <th>CNS01</th>\n",
       "      <th>CNS02</th>\n",
       "      <th>...</th>\n",
       "      <th>CFA02</th>\n",
       "      <th>CFA03</th>\n",
       "      <th>CFA04</th>\n",
       "      <th>CFA05</th>\n",
       "      <th>CFS01</th>\n",
       "      <th>CFS02</th>\n",
       "      <th>CFS03</th>\n",
       "      <th>CFS04</th>\n",
       "      <th>CFS05</th>\n",
       "      <th>createdate</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>240010001001023</td>\n",
       "      <td>8</td>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>240010001001025</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>240010001001054</td>\n",
       "      <td>10</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>5</td>\n",
       "      <td>7</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>240010001001113</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>240010001002061</td>\n",
       "      <td>8</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>7</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 53 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         w_geocode  C000  CA01  CA02  CA03  CE01  CE02  CE03  CNS01  CNS02  \\\n",
       "0  240010001001023     8     3     4     1     4     4     0      0      0   \n",
       "1  240010001001025     1     0     1     0     0     1     0      0      0   \n",
       "2  240010001001054    10     2     3     5     7     3     0      0      0   \n",
       "3  240010001001113     2     0     2     0     0     1     1      0      0   \n",
       "4  240010001002061     8     4     4     0     7     1     0      0      0   \n",
       "\n",
       "   ...  CFA02  CFA03  CFA04  CFA05  CFS01  CFS02  CFS03  CFS04  CFS05  \\\n",
       "0  ...      0      0      0      0      0      0      0      0      0   \n",
       "1  ...      0      0      0      0      0      0      0      0      0   \n",
       "2  ...      0      0      0      0      0      0      0      0      0   \n",
       "3  ...      0      0      0      0      0      0      0      0      0   \n",
       "4  ...      0      0      0      0      0      0      0      0      0   \n",
       "\n",
       "   createdate  \n",
       "0    20190826  \n",
       "1    20190826  \n",
       "2    20190826  \n",
       "3    20190826  \n",
       "4    20190826  \n",
       "\n",
       "[5 rows x 53 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd \n",
    "URL = 'https://lehd.ces.census.gov/data/lodes/LODES7/md/wac/md_wac_S000_JT00_2015.csv.gz'\n",
    "pd.read_csv(URL, compression='gzip').head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, each of the rows represents a **census block** (this particular table contains data from Maryland). The `w_geocode` indicates the **block code**, serving as the unique identifier for the census block, and the `C000` variable represents the total number of jobs in that census block. The rest of the variable break down the number of jobs by various categories. For example, `CA01`, `CA02`, and `CA03` break down the jobs by age group:\n",
    "- `CA01`: Number of jobs for workers age 29 or younger\n",
    "- `CA02`: Number of jobs for workers age 30 to 54\n",
    "- `CA03`: Number of jobs for workers age 55 or older\n",
    "\n",
    "So, the sum of those columns should be equal to the value in `C000`. \n",
    "\n",
    "The same applies for the RAC data, except instead of the jobs in that census block, it shows the residences in the census block. So, the `C000` column in the RAC data represents all workers who lived in that census block. The `CA01`, `CA02`, and `CA03` variables represent the number of workers within each age group that lived in that census block. \n",
    "\n",
    "Note that for both of these datasets, the unit of observations is the census block."
   ]
  },
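  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The age-group identity described above can be checked directly. The following is a minimal sketch that retypes the five WAC rows displayed earlier into a small DataFrame instead of downloading the file again. The names `wac_sample` and `age_total` are illustrative and are not used elsewhere in the book.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# First five rows of the Maryland WAC table shown above (selected columns only).\n",
    "wac_sample = pd.DataFrame({\n",
    "    'w_geocode': ['240010001001023', '240010001001025', '240010001001054',\n",
    "                  '240010001001113', '240010001002061'],\n",
    "    'C000': [8, 1, 10, 2, 8],\n",
    "    'CA01': [3, 0, 2, 0, 4],\n",
    "    'CA02': [4, 1, 3, 2, 4],\n",
    "    'CA03': [1, 0, 5, 0, 0],\n",
    "})\n",
    "\n",
    "# Sum the three age-group columns row by row.\n",
    "age_total = wac_sample[['CA01', 'CA02', 'CA03']].sum(axis=1)\n",
    "\n",
    "# True when the age groups add up to the total job count in every block.\n",
    "print((age_total == wac_sample['C000']).all())\n",
    "```\n",
    "\n",
    "On the full file, the same comparison can be run on the DataFrame returned by `pd.read_csv`, and any rows where it fails can be inspected with a boolean mask."
   ]
  },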
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Origin-Destination\n",
    "\n",
    "The Origin-Destination file looks like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "tags": [
     "hide_input"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>w_geocode</th>\n",
       "      <th>h_geocode</th>\n",
       "      <th>S000</th>\n",
       "      <th>SA01</th>\n",
       "      <th>SA02</th>\n",
       "      <th>SA03</th>\n",
       "      <th>SE01</th>\n",
       "      <th>SE02</th>\n",
       "      <th>SE03</th>\n",
       "      <th>SI01</th>\n",
       "      <th>SI02</th>\n",
       "      <th>SI03</th>\n",
       "      <th>createdate</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>240010001001023</td>\n",
       "      <td>240010001002184</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>240010001001023</td>\n",
       "      <td>240010001003108</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>240010001001023</td>\n",
       "      <td>240010002003023</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>240010001001023</td>\n",
       "      <td>240010022001060</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>240010001001023</td>\n",
       "      <td>240430107002095</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>20190826</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         w_geocode        h_geocode  S000  SA01  SA02  SA03  SE01  SE02  SE03  \\\n",
       "0  240010001001023  240010001002184     1     0     1     0     0     1     0   \n",
       "1  240010001001023  240010001003108     1     0     1     0     0     1     0   \n",
       "2  240010001001023  240010002003023     1     0     0     1     1     0     0   \n",
       "3  240010001001023  240010022001060     1     0     1     0     0     1     0   \n",
       "4  240010001001023  240430107002095     1     1     0     0     1     0     0   \n",
       "\n",
       "   SI01  SI02  SI03  createdate  \n",
       "0     0     1     0    20190826  \n",
       "1     0     1     0    20190826  \n",
       "2     0     1     0    20190826  \n",
       "3     0     1     0    20190826  \n",
       "4     0     1     0    20190826  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "URL = 'https://lehd.ces.census.gov/data/lodes/LODES7/md/od/md_od_main_JT00_2015.csv.gz'\n",
    "pd.read_csv(URL, compression='gzip').head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, each of the rows represents a `w_geocode`-`h_geocode` pair. That is, each row is a pair of census blocks for which there was at least one person who worked in the `w_geocode` census block and lived in the `h_geocode` census block. The `S000` variable represents how many people lived in the `h_geocode` census block and worked in the `w_geocode` census block."
   ]
  },
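  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each geocode is a 15-digit block identifier whose first five digits are the state and county FIPS codes, so OD rows can be rolled up to larger geographies. The following is a minimal sketch that uses only the five rows displayed above, retyped by hand with the geocodes stored as strings. The names `od_sample`, `h_county`, and `flows` are illustrative.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# First five rows of the Maryland OD table shown above (selected columns only).\n",
    "od_sample = pd.DataFrame({\n",
    "    'w_geocode': ['240010001001023'] * 5,\n",
    "    'h_geocode': ['240010001002184', '240010001003108', '240010002003023',\n",
    "                  '240010022001060', '240430107002095'],\n",
    "    'S000': [1, 1, 1, 1, 1],\n",
    "})\n",
    "\n",
    "# The first five characters of a block code are the state + county FIPS code.\n",
    "od_sample['h_county'] = od_sample['h_geocode'].str[:5]\n",
    "\n",
    "# Total S000 for this workplace block, split by the county of residence.\n",
    "flows = od_sample.groupby(['w_geocode', 'h_county'])['S000'].sum()\n",
    "print(flows)\n",
    "```\n",
    "\n",
    "In these five rows, four workers live in county `24001` and one lives in county `24043`. When reading the real files, passing `dtype={'w_geocode': str, 'h_geocode': str}` to `pd.read_csv` keeps the codes as strings, which also preserves the leading zero for states whose FIPS code is below 10."
   ]
  },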
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Crosswalk"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "tags": [
     "hide_input"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>tabblk2010</th>\n",
       "      <th>st</th>\n",
       "      <th>stusps</th>\n",
       "      <th>stname</th>\n",
       "      <th>cty</th>\n",
       "      <th>ctyname</th>\n",
       "      <th>trct</th>\n",
       "      <th>trctname</th>\n",
       "      <th>bgrp</th>\n",
       "      <th>bgrpname</th>\n",
       "      <th>...</th>\n",
       "      <th>stanrcname</th>\n",
       "      <th>necta</th>\n",
       "      <th>nectaname</th>\n",
       "      <th>mil</th>\n",
       "      <th>milname</th>\n",
       "      <th>stwib</th>\n",
       "      <th>stwibname</th>\n",
       "      <th>blklatdd</th>\n",
       "      <th>blklondd</th>\n",
       "      <th>createdate</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>240037312011001</td>\n",
       "      <td>24</td>\n",
       "      <td>MD</td>\n",
       "      <td>Maryland</td>\n",
       "      <td>24003</td>\n",
       "      <td>Anne Arundel County, MD</td>\n",
       "      <td>24003731201</td>\n",
       "      <td>7312.01 (Anne Arundel, MD)</td>\n",
       "      <td>240037312011</td>\n",
       "      <td>1 (Tract 7312.01, Anne Arundel, MD)</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>99999</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>24001001</td>\n",
       "      <td>01 Anne Arundel WIA</td>\n",
       "      <td>39.086213</td>\n",
       "      <td>-76.536457</td>\n",
       "      <td>20211018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>240037012001003</td>\n",
       "      <td>24</td>\n",
       "      <td>MD</td>\n",
       "      <td>Maryland</td>\n",
       "      <td>24003</td>\n",
       "      <td>Anne Arundel County, MD</td>\n",
       "      <td>24003701200</td>\n",
       "      <td>7012 (Anne Arundel, MD)</td>\n",
       "      <td>240037012001</td>\n",
       "      <td>1 (Tract 7012, Anne Arundel, MD)</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>99999</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>24001001</td>\n",
       "      <td>01 Anne Arundel WIA</td>\n",
       "      <td>38.926495</td>\n",
       "      <td>-76.537151</td>\n",
       "      <td>20211018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>240037025001034</td>\n",
       "      <td>24</td>\n",
       "      <td>MD</td>\n",
       "      <td>Maryland</td>\n",
       "      <td>24003</td>\n",
       "      <td>Anne Arundel County, MD</td>\n",
       "      <td>24003702500</td>\n",
       "      <td>7025 (Anne Arundel, MD)</td>\n",
       "      <td>240037025001</td>\n",
       "      <td>1 (Tract 7025, Anne Arundel, MD)</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>99999</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>24001001</td>\n",
       "      <td>01 Anne Arundel WIA</td>\n",
       "      <td>38.951701</td>\n",
       "      <td>-76.550784</td>\n",
       "      <td>20211018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>240037027022009</td>\n",
       "      <td>24</td>\n",
       "      <td>MD</td>\n",
       "      <td>Maryland</td>\n",
       "      <td>24003</td>\n",
       "      <td>Anne Arundel County, MD</td>\n",
       "      <td>24003702702</td>\n",
       "      <td>7027.02 (Anne Arundel, MD)</td>\n",
       "      <td>240037027022</td>\n",
       "      <td>2 (Tract 7027.02, Anne Arundel, MD)</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>99999</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>24001001</td>\n",
       "      <td>01 Anne Arundel WIA</td>\n",
       "      <td>39.011417</td>\n",
       "      <td>-76.527626</td>\n",
       "      <td>20211018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>240037025004020</td>\n",
       "      <td>24</td>\n",
       "      <td>MD</td>\n",
       "      <td>Maryland</td>\n",
       "      <td>24003</td>\n",
       "      <td>Anne Arundel County, MD</td>\n",
       "      <td>24003702500</td>\n",
       "      <td>7025 (Anne Arundel, MD)</td>\n",
       "      <td>240037025004</td>\n",
       "      <td>4 (Tract 7025, Anne Arundel, MD)</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>99999</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>24001001</td>\n",
       "      <td>01 Anne Arundel WIA</td>\n",
       "      <td>38.947590</td>\n",
       "      <td>-76.538524</td>\n",
       "      <td>20211018</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 43 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        tabblk2010  st stusps    stname    cty                  ctyname  \\\n",
       "0  240037312011001  24     MD  Maryland  24003  Anne Arundel County, MD   \n",
       "1  240037012001003  24     MD  Maryland  24003  Anne Arundel County, MD   \n",
       "2  240037025001034  24     MD  Maryland  24003  Anne Arundel County, MD   \n",
       "3  240037027022009  24     MD  Maryland  24003  Anne Arundel County, MD   \n",
       "4  240037025004020  24     MD  Maryland  24003  Anne Arundel County, MD   \n",
       "\n",
       "          trct                    trctname          bgrp  \\\n",
       "0  24003731201  7312.01 (Anne Arundel, MD)  240037312011   \n",
       "1  24003701200     7012 (Anne Arundel, MD)  240037012001   \n",
       "2  24003702500     7025 (Anne Arundel, MD)  240037025001   \n",
       "3  24003702702  7027.02 (Anne Arundel, MD)  240037027022   \n",
       "4  24003702500     7025 (Anne Arundel, MD)  240037025004   \n",
       "\n",
       "                              bgrpname  ...  stanrcname  necta  nectaname  \\\n",
       "0  1 (Tract 7312.01, Anne Arundel, MD)  ...         NaN  99999        NaN   \n",
       "1     1 (Tract 7012, Anne Arundel, MD)  ...         NaN  99999        NaN   \n",
       "2     1 (Tract 7025, Anne Arundel, MD)  ...         NaN  99999        NaN   \n",
       "3  2 (Tract 7027.02, Anne Arundel, MD)  ...         NaN  99999        NaN   \n",
       "4     4 (Tract 7025, Anne Arundel, MD)  ...         NaN  99999        NaN   \n",
       "\n",
       "   mil  milname     stwib            stwibname   blklatdd   blklondd  \\\n",
       "0  NaN      NaN  24001001  01 Anne Arundel WIA  39.086213 -76.536457   \n",
       "1  NaN      NaN  24001001  01 Anne Arundel WIA  38.926495 -76.537151   \n",
       "2  NaN      NaN  24001001  01 Anne Arundel WIA  38.951701 -76.550784   \n",
       "3  NaN      NaN  24001001  01 Anne Arundel WIA  39.011417 -76.527626   \n",
       "4  NaN      NaN  24001001  01 Anne Arundel WIA  38.947590 -76.538524   \n",
       "\n",
       "  createdate  \n",
       "0   20211018  \n",
       "1   20211018  \n",
       "2   20211018  \n",
       "3   20211018  \n",
       "4   20211018  \n",
       "\n",
       "[5 rows x 43 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "URL = 'https://lehd.ces.census.gov/data/lodes/LODES7/md/md_xwalk.csv.gz'\n",
    "pd.read_csv(URL, compression='gzip').head()"
   ]
  },
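  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the crosswalk rows displayed above, the county, tract, and block group codes are all prefixes of the block code in `tabblk2010`. The following is a minimal sketch that checks this on those five rows, retyped by hand with every code stored as a string. The names `xwalk_sample` and `checks` are illustrative.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# First five rows of the Maryland crosswalk shown above (selected columns only).\n",
    "xwalk_sample = pd.DataFrame({\n",
    "    'tabblk2010': ['240037312011001', '240037012001003', '240037025001034',\n",
    "                   '240037027022009', '240037025004020'],\n",
    "    'cty': ['24003'] * 5,\n",
    "    'trct': ['24003731201', '24003701200', '24003702500',\n",
    "             '24003702702', '24003702500'],\n",
    "    'bgrp': ['240037312011', '240037012001', '240037025001',\n",
    "             '240037027022', '240037025004'],\n",
    "})\n",
    "\n",
    "# Compare each larger geography's code with the matching prefix of the block code.\n",
    "block = xwalk_sample['tabblk2010']\n",
    "checks = pd.DataFrame({\n",
    "    'county_matches': block.str[:5] == xwalk_sample['cty'],\n",
    "    'tract_matches': block.str[:11] == xwalk_sample['trct'],\n",
    "    'block_group_matches': block.str[:12] == xwalk_sample['bgrp'],\n",
    "})\n",
    "\n",
    "# One True/False value per check, True when it holds for every row.\n",
    "print(checks.all())\n",
    "```\n",
    "\n",
    "Because `tabblk2010` holds the same block code as `w_geocode` and `h_geocode`, it can serve as the key when joining the crosswalk to the WAC, RAC, or OD tables."
   ]
  },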
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For more information about the datasets used in the examples, please refer to the data documentation provided [at this link](https://lehd.ces.census.gov/data/lodes/LODES7/LODESTechDoc7.4.pdf). "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}