SQL - Overview and command cheat sheet

April 2nd 2016

Introduction to SQL and Databases

SQL (Structured Query Language)
SQL is a programming language that is designed to manage data, stored in relational databases. By using simple, declarative statements a programmer can access, add or modify data in tables.
Relational Databases
Relational databases structure information in one or more 'related' tables that consist of rows and columns
A column is a set of data values of a particular type (ie. a user id, age, name, etc.)
A row is a single record in a table (ie. A user)

In order to make the database 'fool-proof' and avoid pollution of the data, various checks and enforcements can and should be applied. When designing a database, you can for example specify the datatype of each column and require any input to meet that datatype:

  • Integer
  • Text (strings)
  • Date (YYYY-MM-DD)
  • REAL (Decimals
  • NULL (empty values)

Basic table manipulations

Notation Description Example
CREATE TABLE Clause that triggers the creation of a new table. When using this clause, you must name the table and specify the different columns and their respective datatypes as arguments CREATE TABLE table_name (
  column_1 datatype,
  column_2 datatype,
  column_3 datatype
);

ALTER TABLE ALTER TABLE is a statement that let's you make specific changes to the table. With this statement you can add, modify or remove column structures (or requirements) in the table. ALTER TABLE table ADD COLUMN column_name DATATYPE
INSERT INTO Clause that adds specified rows into the table. This clause takes as many arguments as there are columns in the table. INSERT INTO table (value_1, value_2, value_3);
UPDATE UPDATE is a clause that edits the row in a table. The update operator is used in combination with the keyword 'SET' and a condition (to select which rows to update) UPDATE table SET column = value WHERE condition;
DELETE FROM DELETE FROM is a statement that let's you delete one or more rows from the table. You must however specify which entries to delete with a conditional statement WHERE. DELETE FROM table WHERE column CONDITION

Basic SQL queries

SQL queries communicate with a database to retrieve information.

Notation Description Example
SELECT SELECT is a clause that indicates a query SELECT column FROM table;
FROM FROM clause indicates in which table(s) will be searched SELECT column FROM table;
DISTINCT SELECT DISTINCT returns unique values for specified columns SELECT DISTINCT column FROM table;
WHERE WHERE is a clause that indicates a filter on the result. Only the rows where the condition(s) return 'true' are retained. SELECT column FROM table WHERE (condition_column + condition);

Example: Select everything in a table called 'movies' where the entry has a rating of more than 8:
SELECT * FROM movies WHERE rating > 8;
LIKE LIKE is a special operator to compare similar values and identify patterns.

The LIKE statement is to be used in combination with the WHERE clause.
SELECT column FROM table WHERE column LIKE 'pattern';

Example: Select everything in a table called 'movies' where the name is related to 'Se_en'. The _ will be replaced by any possible character and return the entrie that match the pattern. For example 'Seven' and 'Se7en' are both valid movies.
SELECT * FROM movies WHERE name LIKE 'Se_en';

The '_' is a wildcard (more info below)
BETWEEN BETWEEN filters results, based on a specific range of alphanumerical characters

The BETWEEN statement is to be used in combination with the WHERE clause.
SELECT * FROM table WHERE column BETWEEN start_value AND end_value; Example: Select everything in a table called 'movies' where the release year is between 1990 and 2000.
SELECT * FROM movies WHERE year BETWEEN 1990 AND 2000;
AND AND is an operator that combines two or more conditions. All conditions must be 'true' for the entry to be retained in the result list
SELECT column FROM table WHERE condition_a AND condition_b;

Example: Select all movies of a specific genre within a specific time-period.
SELECT * FROM movies WHERE year BETWEEN 1990 AND 2000 AND genre = 'comedy';
OR OR is an operator that filters the result list on multiple conditions. All entries for which either condition returns 'true' is retained.
SELECT column FROM table WHERE condition_a OR condition_b;

Example: Select all movies of a specific genre OR a specific time-period.
SELECT * FROM movies WHERE year BETWEEN 1990 AND 2000 OR genre = 'comedy';
ORDER BY ORDER BY is a sorting operator that orders on an alpha-numerical order. You can chose either an ascending order (ASC) or descending order (DESC).
SELECT column FROM table ORDER BY order_column ASC;

Example: Return all the movies from highest to lowest rating.
SELECT * FROM movies ORDER BY rating DESC;
LIMIT LIMIT specifies the maximal number of rows to display in the result list.
SELECT column FROM table LIMIT 10;

Example: Select the top ten 'best rated' of movies.
SELECT * FROM movies ORDER BY rating DESC LIMIT 10;

Aggregate functions in SQL

Aggregate functions compute a single result from a set of input values (that are retrieved from a query). For example taking the average of all the values in a column.

Aggregate functions are very useful to transform the data in the exact format you want to use in the client-side application. By processing the raw data at the source, to the 'right' structure you can save a lot of bandwith and optimize the use of computing power.

Notation Description Example
COUNT() COUNT() returns the number of rows for which the value is not NULL SELECT COUNT(*) FROM table;
The returned value will be considered as a data-element for the SELECT statement.
GROUP BY() GROUP BY() arranges identical data in groups
The GROUP BY () command is used in the context of SELECT and is always used in combination with an aggregate function such as COUNT()
It's helpful to also SELECT the column you pass as an argument to GROUP BY ()!
SELECT parameter, COUNT(*) FROM table GROUP BY parameter;
A typical use would be to group products by their product categories on an e-commerce site, or group items by supplier, price etc.
SUM() SUM() adds all the values in a particular column SELECT SUM(parameter) FROM table;
MAX() MAX() returns the row (entry) with the highest value for a specific column. SELECT MAX(parameter) FROM table;
MIN() MIN() returns the row (entry) with the lowest value for a specific column. SELECT MIN(parameter) FROM table;
AVG() AVG() returns the average of all values in a specific column. SELECT AVG(parameter) FROM table;
ROUND() ROUND() rounds down the number of decimals as specified in the argument.
If no integer is specified in the argument, ROUND() will round the value up/down to the closest integer.
SELECT ROUND(INTGER) FROM table;

Order of operations in an SQL command

SQL SELECT statements can be composed of many different operations. Whereas most databases are relatively flexible about the order of statements, you can boost performance by writing the command in the order that the statements will be processed. This will reduce slow response times and increase efficiency.
For example the database must first retrieve the data before it can sort, group or make apply aggregate functions.

  1. FROM: Specifies the table to query from
  2. JOIN-TYPE: Specifies way the code will combine multiple tables
  3. ON: Keyword introducing the second table in the join condition
  4. WHERE: Introduces conditional statements, only the entries for which the condition returns 'true' will be added to the result list
  5. GROUP BY: Arrange identical data in groups
  6. HAVING: Conditional statement, similar to WHERE (but one that works with aggregate functions)
  7. SELECT: Clause that indicates a query
  8. DISTINCT: Specifies that only unique values should be considered
  9. ORDER BY: Sort elements in a specific alphanumerical order of a specific column
  10. TOP: Specify the number of records to return (Similar to the LIMIT clause)

Wildcards

Wildcards are special symbols that trigger specific behavior in the SQL language:

  • * : Select ALL
  • _ : The _ will be substituted by any possible character
  • % : Matches zero or more missing characters in a pattern
    • Example: A% : Matches all entries starting with the character 'A'
    • Example: %a : Matches all entries ending on 'a'
    • Example: %man% : Matches all entries starting and/or ending on 'man'