Filtering
Welcome
- Welcome to the Software Carpentry screencast on Databases. This
screencast is on filtering results from a databse.
Recap: SELECT
- As we’ve seen in a previous screencast, we can use the SELECT statement to
pull out columns from a table, and add computed values to results. - One of the most powerful features of a database is the ability to filter
through your data to find the records that match certain criteria.
Filtering results: WHERE
- Let’s say we wanted to see all of the Experiments before the year 1990.
- To do this we’d write a SELECT query with all of the columns from the
Experiments table,
SELECT *
FROM Experiment
and we use the WHERE command to specify our filter conditions
...
WHERE
- We put the expression, ExperimentDate < ’1990-01-01′.
SELECT *
FROM Experiment
WHERE ExperimentDate < '1900-01-01'
- This specifies that we want only experiment info with an experiment date
before January 1st, 1990. - You can think about how this query works as the database inspecting each
row from the Experiments table, and checking it against the condition in
the WHERE clause. If the condition holds true, then that row is included
in the results, otherwise it is filtered out. - We can use other comparison operators in our filtering conditions. For
example, we could ask for all of the Experiments in table that took 3 or
more hours. To do this we change the WHERE condition to be Hours >= 3
SELECT *
FROM Experiment
WHERE Hours >= 3
- We can use most of the familiar comparision operators: equal to, not equal
to, greater than, less than, and so on.
Filtering: logical operators
- We can make our WHERE conditions even more sophisticated by using logical
operators like AND and OR to combine conditions. - For instance, if we want to find all of the experiments run by mlom that
took more than three hours, we can modify our query. It already filters
for experiments that take more than 3 hours. We can add AND and then the
other condition that needs to be met, that the loginID is mlom:
...
WHERE Hours >= 3 AND LoginID = 'mlom'
- When using an AND operator, each record returned now must satisfy two
conditions: in this case, the Hours spent has to be greater than 3, and
the loginID must be mlom. - If we wanted experiments either by mlom or by best, we would write,
...
WHERE Hours >= 3 AND (LoginID = 'mlom' OR LoginID = 'best')
- The parentheses ensure that the OR-clause is evaluated first, so it is
true whether the loginID is mlom or best, and the entire WHERE condition
is true if the OR-clause is true AND the hours spent are greater than 3. - Another way to write this OR-clause is to use the IN operator.
- With this operator, we can write a condition that is true if a field –
the loginID in this case — contains anything from a list of values:
WHERE Hours >= 3 AND LoginID IN ('mlom', 'best')
- This is exactly the same as writing out the OR clause, but it’s shorter to
write, and easier to read.
Conclusion:
- In this screencast we’ve shown that you can select data from a table and
then filter those rows based on TRUE and FALSE conditions using a WHERE
clause.

For
ExperimentDate < ’1990-01-01′.
I wonder how the ordering here works, since the numbers are being treated as a string.