Using Conditional Statements
Contents
Using Conditional Statements#
Suppose we want to look at a subset of the data. We can use conditional statements to do this.
SELECT *
FROM ca_wac_2015
WHERE c000 < 100
LIMIT 10;
w_geocode | c000 | ca01 | ca02 | ca03 | ce01 | ce02 | ce03 | cns01 | cns02 | cns03 | cns04 | cns05 | cns06 | cns07 | cns08 | cns09 | cns10 | cns11 | cns12 | cns13 | cns14 | cns15 | cns16 | cns17 | cns18 | cns19 | cns20 | cr01 | cr02 | cr03 | cr04 | cr05 | cr07 | ct01 | ct02 | cd01 | cd02 | cd03 | cd04 | cs01 | cs02 | cfa01 | cfa02 | cfa03 | cfa04 | cfa05 | cfs01 | cfs02 | cfs03 | cfs04 | cfs05 | createdate | over100 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
060014001001007 | 30 | 2 | 16 | 12 | 4 | 2 | 24 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 25 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 25 | 1 | 0 | 3 | 1 | 0 | 27 | 3 | 3 | 6 | 10 | 9 | 7 | 23 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
060014001001008 | 4 | 0 | 1 | 3 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 | 4 | 0 | 0 | 2 | 1 | 1 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
060014001001011 | 3 | 2 | 1 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 3 | 0 | 0 | 0 | 0 | 1 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
060014001001017 | 11 | 3 | 3 | 5 | 2 | 2 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 | 1 | 0 | 0 | 11 | 0 | 0 | 0 | 3 | 5 | 10 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
060014001001024 | 10 | 3 | 3 | 4 | 7 | 1 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 6 | 0 | 4 | 0 | 0 | 5 | 0 | 1 | 10 | 0 | 0 | 2 | 3 | 2 | 2 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
060014001001026 | 3 | 0 | 2 | 1 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 2 | 1 | 0 | 0 | 2 | 1 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
060014001001027 | 13 | 3 | 3 | 7 | 4 | 5 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 1 | 1 | 0 | 5 | 3 | 0 | 9 | 0 | 0 | 4 | 0 | 0 | 10 | 3 | 1 | 4 | 2 | 3 | 6 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
060014001001032 | 13 | 2 | 4 | 7 | 3 | 2 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 6 | 0 | 0 | 7 | 0 | 0 | 12 | 1 | 0 | 4 | 3 | 4 | 7 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
060014001001033 | 2 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
060014001001034 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20170919 | 0 |
Using a query like the one above can be useful for finding if there are any data entry errors or missing values. Since it’s not possible to have an age less 0, if there are any rows with negative age, this is likely an error or the method used to code missing values (e.g. -1
).
We can also use more complicated conditional statements.
SELECT count(*)
FROM ca_wac_2015
WHERE (c000 > 50) AND (c000 < 100);
count(*) |
---|
17877 |
This subsets to rows in which c000
is greater than 50 and c000
is less than 100. That is, this subsets to census blocks with between 50 and 100 total jobs. Using OR
works in the same way.
SELECT count(*)
FROM ca_wac_2015
WHERE (c000 <= 50) OR (c000 >= 100);
count(*) |
---|
225585 |
This subsets to rows in which c000
is less than or equal to 50 or c000
is greater than or equal to 100. This query should, in other words, capture the rest of the rows.
Common Comparison Operators#
Though there are some more complicated comparison operators (if you’re curious, feel free to look up what LIKE
and IN
do), these should cover most of what you want to do.
=
: equal to!=
or “<>
”: not equal to<
: less than<=
: less-than-or-equal-to>
: greater than>=
: greater-than-or-equal-toIS NULL
andIS NOT NULL
: The signifier of a row in a column not having a value is a special keyword:NULL
. To check forNULL
, you useIS NULL
orIS NOT NULL
, rather than “=” or “!=”. For example, to count the number of rows withNULL
values forc000
we might use the following:SELECT count(*) FROM ca_wac_2015 WHERE c000 IS NULL;
Creating Variables#
Suppose we want to create a new column in the table that acts as a “flag” for which rows fit a certain condition, so that you can use them later. We can do this using the ALTER TABLE
statement.
ALTER TABLE ca_wac_2015 ADD over100 BOOL;
UPDATE ca_wac_2015 SET over100 = 0;
UPDATE ca_wac_2015 SET over100 = 1 WHERE c000 > 100;
Let’s break this down line by line. First, we use ALTER TABLE
, then specify the table we want to alter. In this case, we want to alter the ca_wac_2015
table. Then, we ADD
a new column, over100
. We designate this as a BOOL
for boolean (that is, a TRUE/FALSE value) column.
After we create this new column, we need to fill it with the appropriate values. First, we’re going to set everything in the column to be 0 (or False). To do this, we use UPDATE
, specify the appropriate table, then use SET over100 = 0
. Then, we replace the value with 1 (or True) if the value in c000
for that row is above 100. We again use UPDATE
in a similar manner, except we add a WHERE
clause, so that it only set the value to TRUE if a certain condition is met – in this case, that c000 > 100
.
Checkpoint: Using Conditional Statements#
Answer the questions below, making sure to write out the queries used to answer the questions.
How many census blocks contain more than 200 jobs?
How many census blocks contain residences of fewer than 25 workers?
How many census blocks contain workplaces with more than 10 workers with a Bachelor’s degree or higher?