SQL expressions & operators provide a way to compute values from table data.
Operators provide ways to compute on or transform data; expressions allow for more complex logic in your queries. You can do things like conditional logic with
CASE statements, type casting, and more. We'll walk through some of the more common use cases here, but refer to the SQL functions docs to understand what else is possible. For most of these examples, let's assume we own a
my_table with a schema of
id int, val text.
Throughout the sections below, there are examples for how to use a few of these operators to check or compare values; most of those not covered here are covered are self-explanatory (e.g., arithmetic).
There are operators that can check if the operands are:
NOT can be combined with the named operators, e.g.,
NOT BETWEEN or
Arithmetic & output modifiers
You can also use operators to perform arithmetic operations or affect the output:
BETWEEN operator can be used to query for values between a specified criteria; it is inclusive. Let's say our table now has three rows:
In this example, we'll query for all rows with an
id between 2 and 3, meaning, we'll get back the last two rows:
id BETWEEN 2 AND 3;
Or, you could include the
NOT operator to do the inverse, which would give you only the first row:
id NOT BETWEEN 2 AND 3;
CASE statement can be viewed as a more powerful
IF statement. It allows you to define multiple conditions and their corresponding values. The syntax is as follows:
WHEN NULL IS NOT NULL THEN 'New val'
ELSE 'Other val'
WHEN NULL is demonstrating that you can use any expression in the
WHEN clause—instead of placing
NULL here, it'd be better to have some form of parameter binding at the client layer. The
ELSE clause is optional, but if you don't include it, the
CASE statement will return
NULL if none of the
WHEN conditions are met. In this example, the value
Other val will be inserted into the
val column for a new row with
So, altering this a bit, we could use parameter binding and control the four values at the client:
WHEN ?2 IS NOT NULL THEN ?3
CAST operation is particularly important when dealing with columns that store mixed types of data. The syntax follows
CAST(expression AS type) and converts the value of
type. For example, let's say our
val in a the table, which is a
TEXT column, stores the value
"1", and the
id column is of type
SELECT CAST(id as TEXT) FROM my_table;
SELECT CAST(val as INTEGER) FROM my_table;
The first query would return
"1" as a string, and the second query would return
1 as an integer.
The aggregate concatenation function (
group_concat()) is only for a group of data, such as concatenating all values in a column into a single string. If you wanted to concatenate each value in
my_table to have a custom string output, the double pipe (
||) can be used. It's not a function per se but is often useful in tandem with the JSON functions to further customize response data.
To combine two or more strings, use
|| between each value:
val || ' is #' || id
This will return transformed values for each row:
Bobby Tables is #1 and
Molly Tables is #2.
COLLATE operator can be used to specify a set of rules for comparing characters in a string. Collations determine how data is sorted and compared in a database, affecting operations like
ORDER BY, and G
ROUP BY. The syntax is
column_name COLLATE collation_name, and there are three options:
BINARY: The default collation. It compares string data using C's
memcmp()function, which results in a case-sensitive comparison based on the binary representations of the characters.
NOCASE: A case-insensitive comparison, which is useful when you want to treat
RTRIM: Trims trailing spaces before comparison, which is useful when comparing strings where the presence of trailing spaces should be ignored.
Let's change up our table data a bit to show how this works in more detail. We'll add rows that have the same set of
val values but with different casing, a duplicative
bobby tables, and the with the final
molly tables row having a few trailing spaces:
Now, let's run a few queries with different collations:
SELECT * FROM my_table ORDER BY val COLLATE NOCASE;
SELECT COUNT(*), val FROM my_table GROUP BY val COLLATE BINARY;
SELECT * FROM my_table WHERE val COLLATE NOCASE = 'bobby tables';
SELECT * FROM my_table WHERE val COLLATE RTRIM = 'molly tables';
Here are the results for each with the ordering or grouping:
The ordering moves the lowercase
bobby tablesup to a lower index:
id val 1 Bobby Tables 3 bobby tables 4 bobby tables 2 Molly Tables 5 molly tables 6 molly tables
The grouping sees duplicative
bobby tablesvalues, but all others are unique (e.g.,
molly tableswith trailing spaces are seen as separate values).
count val 1 Bobby Tables 1 Molly Tables 2 bobby tables 1 molly tables 1 molly tables
NOCASEis specified, the query will return all rows with
bobby tablesregardless of casing:
id val 1 Bobby Tables 3 bobby tables 4 bobby tables
RTRIMis specified, it will trim the trailing whitespace on
6and also return the row with
id val 5 molly tables 6 molly tables
If you want a query to only include unique data, the
DISTINCT clause will remove duplicate values from the return set. Take, for example, the desire to only include unique values in the
val column, which currently has two duplicate rows with
This will only return the two distinct
vals instead of one
Bobby Tables and two
EXISTS operator can be used to check if a subquery returns any rows. This is useful for checking if a row exists before inserting it, or for checking if a row exists before deleting it. For example, let's say we want to get the rows in
my_table that only exist in some
other_table with an
id column. The
other_table only contains a matching
id for the first row of
Thus, when we run the following:
other_table.id = my_table.id
It will return the first row of
my_table. There are other way to approach this with a
JOIN clause, but this is a simple example of how
EXISTS can be used. The
EXISTS take a boolean
0 value, so the subquery will only return
1 if the
id exists in
other_table. Similarly, using
NOT EXISTS will return the inverse.
Extracting JSON values
->> operators take a JSON string as their left operand and a path expression or object field label or array index as their right operand, returning the JSON of the subcomponent. Let's say our table has this data:
->, the result is returned as a JSON object or array, including any string values being returned with quotes:
val -> 'my_key'
This will give you the result:
To get the raw text value, the
->> operand extracts the JSON element as a text string, without the JSON formatting. For example, if we wanted to get the
my_key value from the
val column, we could do the following:
val ->> 'my_key'
GLOB can do Unix-like pattern matching but with case sensitivity and different wildcards. It can use
* to match any number of characters, including zero characters, or
? for exactly one character.
SELECT * FROM my_table WHERE val GLOB 'B*';
SELECT * FROM my_table WHERE val GLOB '[B,M]*';
SELECT * FROM my_table WHERE val GLOB '?o*';
In these examples, the first query would give us the first row since it finds row starting with
B. The second query would give us both rows since it looks in the range that includes
M. And the third query would also give us both rows because it matches any string that has
o as the second character.
IN operator is used to query for values in a specific set. For example,
WHERE id IN (1, 2, 3) gives all rows with
id in the set, which in our case, would be all rows. Let's put together a different example, where we check for rows with
val in a set that includes
val IN ('Bobby Tables');
This would give us the first row, alone.
LIKE operator, it can do case-insensitive pattern matching in string comparisons. For example, it can check if the
val column has a value that contains
bobby tables (i.e., doesn't worry about case):
val LIKE 'bobby tables';
% character is a wildcard. Perhaps we want to check if the
val column starts with
m, so we'd use
LIKE 'm%', giving us the row with
Molly Tables. Or, it could check for a substring anywhere within the value, such as
LIKE '%abl%', which would give us both rows since they both contain the word
Tables. A slightly more specific example there is with
%_abl%, which would only give us the rows where
val has any character followed by "abl" at the beginning.