Queries
Queries are used to retrieve data from the database.
We can choose which columns we want to see, which rows we want, and which aggregate calculations we want.
Queries are written with a SELECT statement:
SELECT [ALL | DISTINCT] column_list
[INTO [new_table_name]]
[FROM table_name [, table_name2 [..., table_name16]]
[INNER, LEFT OUTER, RIGHT OUTER JOIN]
[WHERE clause]
[GROUP BY clause]
[HAVING clause]
[ORDER BY clause]
[COMPUTE clause]
However, we won’t always need all those pieces. Let’s look at each clause individually.
WHERE Clause
We can use a WHERE clause to return only certain records. Think of it as a filter:
SELECT
FirstName
, LastName
FROM STUDENT
WHERE City = 'Edmonton'
-- only return rows where the City column is exactly the value "Edmonton"
Search Criteria Operators
There are many useful keywords and symbols we can include in our WHERE clauses.
| Operator | Usage |
|---|---|
= | looks for exact match |
<> or != | looks for something that does NOT match |
< | less than |
<= | less than or equal |
> | greater than |
>= | greater than or equal |
AND | both expressions must be true |
OR | either expression must be true |
BETWEEN | returns all records between 2 values (inclusive) |
IN | matches within a list of values |
NOT BETWEEN, NOT IN, etc | the NOT operator negates other operations |
LIKE | pattern matching |
UNION
The UNION operation lets you combine the data retrieved by multiple SELECT statements. It’s basically just gluing together the results of 2 or more queries, combining them into one output.
SELECT ...
UNION [ALL]
SELECT ...
Aggregate Functions
Aggregate functions are functions that aggregate a bunch of individual values into a single value.
Generally, the syntax is: aggregate_function_name([ ALL | DISTINCT] expression). For example:
AVGreturns the average of numeric values (ignoring anyNULLvalues)SUMreturns the sum of a column containing numeric valuesMINandMAXreturn the minimum and maximum values from a column of numeric, date, or character valuesCOUNTreturns the number of non-null values OR the number of records that match theWHEREcriteriaALLandDISTINCTcan be in a query or with aCOUNTfunction.ALLis the default and usually not explicitly declared.- When
DISTINCTis used with theCOUNTfunction it only counts the unique values.
- When
The GROUP BY clause is used with aggregate functions to provide subtotals. For example:
SELECT CourseID, AVG(Mark) AS AverageMark
FROM Registration
GROUP BY CourseID
-- what this means:
-- "for each course, select CourseID and the average mark"
HAVING Clause
HAVING is like the WHERE clause, except it applies its criteria after GROUP BY. For example:
SELECT CourseID, AVG(Mark) as AverageMark
FROM Registration
WHERE AVG(Mark) > 80
-- this line will fail: I can't check the AVG(Mark) until after I group-by,
-- i.e. I need to know the average mark per CourseID
GROUP BY CourseID
Instead:
SELECT CourseID, AVG(Mark) as AverageMark
FROM Registration
GROUP BY CourseID
HAVING AVG(Mark) > 80
ORDER BY
The ORDER BY clause sorts one or more columns, in ASCending or DESCending order (ASC is the default).
SELECT FirstName, LastName
FROM Student
ORDER BY FirstName ASC, LastName DESC
String Functions
Some useful functions include:
LEN(column | expression)returns the # of characters in a string or expressionLEFT(column | expression, length)returnslengthnumber of characters, starting at the leftRIGHT(column | expression, length)returnslengthnumber of characters, starting at the rightSUBSTRING(column | expression, start, length)returns a subset of characters from a string or expressionREVERSE(column | expression)returns the string in reverse orderUPPER(column | expression)returns the string in UPPERCASELOWER(column | expression)returns the string in lowercaseLTRIM(column | expression)trims any leading whitespaceRTRIM(column | expression)trims any trailing whitespace
Date Functions
GETDATE()returns the current system dateDATEADD(units, num, date)addsnumunitstodate(n may be negative)DATEDIFF(units, date1, date2)returns the number ofunitsfromdate1todate2DATENAME(units, date)returns string representation of theunitsofdateDATEPART(units, date)returns integer representation of theunitsofdateYEAR(date)functions the same asDATEPART(yy, date)MONTH(date)functions the same asDATEPART(mm, date)DAY(date)functions the same asDATEPART(dd, date)
where units represents some unit of time:
| Datepart | Abbreviation | Minimum | Maximum |
|---|---|---|---|
| Year | yy | 1753 | 9999 |
| Quarter | qq | 1 | 4 |
| Month | mm | 1 | 12 |
| Week | wk | 1 | 53 |
| Day of year | dy | 1 | 366 |
| Weekday | dw | 1 (Sun) | 7 (Sat) |
| Day | dd | 1 | 31 |
| Hour | hh | 0 | 23 |
| Minute | mi | 0 | 59 |
| Second | ss | 0 | 59 |
| Millisecond | ms | 0 | 999 |
JOINs
How do we connect data in one table to its related record(s) in another?
SELECT Field1, Field2, ...
FROM Table1
[INNER, LEFT OUTER, ...] JOIN Table2
ON Table1.JoinField = Table2.JoinField
-- the Join Field is generally the FK/PK that connects the tables, though technically we can JOIN by any field.
To see what that looks like with real fields/columns:
SELECT FirstName, LastName, Mark
FROM Student
INNER JOIN Registration ON Student.StudentID = Registration.StudentID
-- studentID is the FK
Types of Joins
INNER JOINreturns only records that exist in both tablesFULL OUTER JOINreturns all records that exist in either table (if we’reJOINing on our FK, which we always do in DMIT 1508, this type ofJOINisn’t used)Table1 LEFT JOIN Table2returns all records inTable1, regardless of whether they exist inTable2Table1 RIGHT JOIN Table2returns all tables inTable2, regardless of whether they exist inTable1
Selecting from 3+ tables
We can add additional JOIN statements:
SELECT Field1, Field2, ...
FROM Table1
INNER JOIN Table2
ON Table1.JoinField = Table2.JoinField
INNER JOIN Table3
ON Table.JoinField = Table3.JoinField
-- where "Table" is Table1 or Table2
INNER JOIN table4
ON Table.JoinField = table4.JoinField
-- where "Table" is Table1, Table2, or Table3
...
Subqueries
A subquery is a SELECT statement inside of another statement.
There are nested and correlated subqueries, and in this course we will focus on nested subqueries only, which are full and complete SELECT statements that could execute on their own.
They are especially useful for finding records that are in one table but not another: for example, staff who have never taught a course:
SELECT FirstName, LastName
FROM Staff
WHERE StaffID NOT IN (
-- the list of Staff who HAVE taught:
SELECT DISTINCT StaffID
FROM Offering
)
ANY, SOME, and ALL operators
What if want something other than an exact match?
WHERE StudentID IN (SELECT StudentID ... )
-- IN lets us compare to a bunch of values at once, unlike = which only compares to a single value
ANY or SOME compares against any of the values:
WHERE Grade > SOME (SELECT Grade ... )
-- this will be TRUE if Grade (in the outer query) is greater than at least one Grade in the inner query
ALL compares against all of the values:
WHERE Grade > ALL (SELECT Grade ...)
-- this will only be TRUE if Grade (in the outer query) is greater than every single Grade in the inner query