Up: Lectures

Databases

January 10th, 2012 Leave a comment Go to comments

This lecture is a quick introduction to databases and SQL, the language used to get information out of databases.

Requires: Nothing

Introduces: Databases

Motivating questions:

  1. My data is in table format across several related tables.  How can I explore and analyze it?
  2. I have a large amount of data that I want to access efficiently, how do I do that?

Lectures (using the SQLite database engine):

  1. Introduction
    • A database is a way to store and manipulate information that is arranged as tables.
    • You interact with a databases by posing queries (in SQL).
    • There are six basic operations which you can use to extract or transform your data.
  2. Selecting Data
    • The SQL SELECT command is used to extract data from the database.
    • You can list column names to extract from a table, or use * to mean all columns.
    • Along with column names, you can supply basic calculations on results.
    • The DISTINCT keyword removes duplicates from the results.
    • Exercises
  3. Filtering
    • The WHERE clause lets you filter rows that match a boolean expression.
    • The IN operator selects rows that match a list of values.
    • Exercises
  4. Sorting
    • The ORDER BY clause lets you sort the results.
    • The DESC or ASC keyword specifies the direction of the sort.
    • Exercises
  5. Aggregation
    • The SUM, MAX, MIN, and AVG functions combine values across multiple rows in the results.  This is called aggregation.
    • The GROUP BY clause lets you combine together rows that match certain conditions.
    • You can also filter (WHERE) and sort (ORDER BY) aggregated rows.
    • Exercises
  6. Joining Tables
    • The JOIN..ON clause lets you combine data from several tables into a single result.
    • Exercises
  7. Missing Data
    • Missing data can be represented by a special value in a database: NULL.
    • Filtering, joining, and aggregating data with NULL values takes special attention.
    • Exercises
  8. Nested Queries
    • Queries can operate on the results of other queries by nesting one query in another.
    • This allows you break down complex queries into simpler queries.
    • Queries can be nested in the FROM or the WHERE clauses.
    • Exercises

Exercises:

Lectures (using Microsoft Access as the database engine)

  1. Selecting Data with Microsoft Access
    • Select columns by adding names to query, or use * to retrieve all of the columns.
    • Set “Distinct Values” to “true” to eliminate duplicates.
    • Use expressions to generate calculated columns.
  2. Filtering Data with Microsoft Access
    • The criteria row lets you filter rows that match a boolean expression.
    • The IN operator selects rows that match a list of values.
  3. Sorting Data with Microsoft Access
    • The Sort option for each column lets you sort the results.
    • Sorting by columns that aren’t displayed.
  4. Aggregating Data with Microsoft Access
    • Totals let you count values and calculate sums, averages, minimum and maximum values
    • Group By aggregates the data into groups so that you can calculate Totals for each group
  5. Combining Data with Microsoft Access
    • Joins let you combine data from multiple tables
  6. Handling Missing Data with Microsoft Access
    • Missing data can be represented by a special value in a database: NULL.
    • Filtering, joining, and aggregating data with NULL values takes special attention.
  7. Nested Queries with Microsoft Access
    • Queries can operate on the results of other queries by nesting one query in another.
    • This allows you break down complex queries into simpler queries.

  1. No comments yet.