SQL Specification

SQL Specification

The Tableland SQL specification.

Tableland understands a small subset of the standard SQL language. It does omit many features while at the same time adding a few features of its own. This document attempts to describe precisely what parts of the SQL language Tableland does and does not support. A list of supported data types is also provided. The SQL language supported by Tableland is a subset of the SQLite SQL language specification (and as such, we borrow heavily from their documentation with attribution), with additional constraints specific to Tableland operations.

This general SQL specification is broken down into two core sub-documents (which are embedded below). This specification is a living document, and as such, may be updated over time. Proposals for the addition of SQL language features and data types may be submitted by the Tableland community over time. These proposals will be evaluated for technical feasibility, utility to the community, and longer-term sustainability.

Statement Types

The core Tableland SQL parser accepts an SQL statement list which is a semicolon-separated list of statements. Each SQL statement in the statement list is an instance of one of the following specific statement types. All other standard SQL statement types are unavailable (at the moment). Each statement type is associated with a well-known SQL command (see following sections). In general, the entire Tableland SQL API can be summarized in seven command/statement types: CREATE TABLE, INSERT, UPDATE, DELETE, SELECT, GRANT, REVOKE.

The statement and data types provided here are part of the official minimal Tableland SQL specification. Additional functionality may be available in practice. However, it is not recommended that developers rely on SQL features outside of this minimal specification in the long-term.


The CREATE TABLE command is used to create a new table on Tableland. A CREATE TABLE command specifies the following attributes of the new table:

  • The name of the new table (table_name).
  • The name of each column in the table (column_name).
  • The declared type of each column in the table (data_type).
  • A default value or expression for each column in the table.
  • Optionally, a PRIMARY KEY for the table. Both single column and composite (multiple column) primary keys are supported.
  • A set of SQL constraints for the table. Tableland supports UNIQUE, NOT NULL, CHECK and PRIMARY KEY constraints (see previous bullet).
  • Optionally, a generated column constraint.


CREATE TABLE table_name ( [
  { column_name data_type [ column_constraint [,  ... ] ]
  | table_constraint }
	[, ...]
] );

where column_constraint has structure

[ CONSTRAINT constraint_name ]
  NULL |
  CHECK ( expression ) |
  DEFAULT default_expr |
  UNIQUE index_parameters |
  PRIMARY KEY index_parameters |

and table_constraint has structure

[ CONSTRAINT constraint_name ]
{ CHECK ( expression ) |
  UNIQUE ( column_name [, ... ] ) |
  PRIMARY KEY ( column_name [, ... ] )


Table Identifiers/Names

Every CREATE TABLE statement must specify a fully-qualified table identifier (id) as the name of the new table. The fully-qualified table identifier has the following structure:

table_id = {prefix}_{chain_id}_{token_id}

Where the id prefix is optional, and may include any characters from the regular expression ([A-Za-z0-9\_]+), but cannot start with a number. A prefix string may be up to 32 bytes in length. In practice, long names with spaces must be slug-ified with underscores. For example, “my amazing table" would become "my_amazing_table". The last two components of the table id, must be the chain id and the token id, which are numeric values separated by an underscore. For example, a valid table id without a prefix looks like _42_0, whereas a valid table id with a prefix might look like dogs_42_0.

It is not up to the caller to determine what token id to use in a CREATE TABLE statement. The token id is a monotonically-increasing numeric value which is provided by the smart contract that is processing the create statements. See the On-Chain API Specification for details on the smart contract calls used to generate CREATE TABLE statements in practice.

Table identifiers must be globally unique. The combination of chain id and monotonically increasing token id ensures this is the case in practice. As such, the addition of a user-defined prefix string is an aesthetic feature that most developers will find useful (but is not required). The maximum (slug-ified) prefix length is 32 bytes.

Tableland also supports quoted identifiers (for table names, column names, etc). This allows callers to use SQL Keywords (see next section) as part of identifiers, etc. There are some limitations to this, and it does not circumvent any other naming constraints.

Reserved Keywords

The SQL standard specifies a large number of keywords which may not be used as the names of tables, indices, columns, databases, or any other named object. The list of keywords is often so long that few people can remember them all. For most SQL code, your safest bet is to never use any English language word as the name of a user-defined object.

If you want to use a keyword as a name, you need to quote it. There are four ways of quoting keywords in SQLite:

  • 'keyword' — A keyword in single quotes is a string literal.
  • "keyword" — A keyword in double-quotes is an identifier.
  • [keyword] — A keyword enclosed in square brackets is an identifier. This is not standard SQL, it is included in Tableland for compatibility.
  • `keyword` — A keyword enclosed in grave accents (ASCII code 96) is an identifier. This is not standard SQL, it is included in Tableland for compatibility.

The list below shows all possible reserved keywords used by Tableland (or SQLite). Any identifier that is not on the following element list is not considered a keyword to the SQL parser in Tableland:


You can also find the most up to date list of keywords used by Tableland in the reference parser implementation. See Implementation
Table names that begin with sqlite, system or registry are also reserved for internal use. It is an error to attempt to create a table with a name that starts with these reserved names.

Column Definitions and Constraints

Every CREATE TABLE statement includes one or more column definitions, optionally followed by a list of table constraints. Each column definition consists of the name of the column, followed by the declared type of the column (see

), then one or more optional column constraints. Included in the definition of column constraints for the purposes of the previous statement is the DEFAULT clause, even though this is not really a constraint in the sense that it does not restrict the data that the table may contain. The other constraints, NOT NULL, CHECK, UNIQUE, and PRIMARY KEY constraints, impose restrictions on the table data.

The number of columns in a table is limited by the MaxColumns validator configuration parameters (defaults to 24). A single character fields in a table cannot store more than MaxTextLength bytes of data (defaults to 1024). The number of rows in a table is limited by the MaxRowCount validator configuration parameter (defaults to 100,000). This values are all configurable at the network-level, and may change in the future.
In practice, a CREATE TABLE statement must be sent as a single top-level statement (i.e., it must be provided in a statement list of length one).
Feature At Risk: FOREIGN KEY constraints of the form FOREIGN KEY(column_name) REFERENCES table_id(column_name) are currently not supported across Tableland tables. Instead, dynamic JOINs can be used to reference columns in remote tables. However, inclusion of FOREIGN KEY constraints are being considered for inclusion in the Tableland SQL specification with some specific limitations. In particular, key constraint actions would be restricted to SET NULL or SET DEFAULT (see the section called SQLite foreign key constraint actions at the link below). See SQLite Foreign Key

Column Defaults

The DEFAULT clause specifies a default value to use for the column if no value is explicitly provided by the user when doing an INSERT. If there is no explicit DEFAULT clause attached to a column definition, then the default value of the column is NULL. An explicit DEFAULT clause may specify that the default value is NULL, a string constant, a blob constant, a signed-number, or any constant expression enclosed in parentheses. For the purposes of the DEFAULT clause, an expression is considered constant if it contains no sub-queries, column, or table references, or string literals enclosed in double-quotes instead of single-quotes.

Each time a row is inserted into the table by an INSERT statement that does not provide explicit values for all table columns the values stored in the new row are determined by their default values, as follows:

  • If the default value of the column is a constant NULL, text, blob or signed-number value, then that value is used directly in the new row.
  • If the default value of a column is an expression in parentheses, then the expression is evaluated once for each row inserted and the results used in the new row.

Generated Columns

A column that includes a GENERATED ALWAYS AS clause is a generated column:

CREATE TABLE table_id (
    column_name data_type { GENERATED ALWAYS } AS (expression) { STORED | VIRTUAL }

Generated columns (also sometimes called "computed columns") are columns of a table whose values are a function of other columns in the same row. Generated columns can be read, but their values can not be directly written. The only way to change the value of a generated column is to modify the values of the other columns used to calculate the generated column.

The GENERATED ALWAYS keywords at the beginning of the constraint and the VIRTUAL or STORED keyword at the end are all optional. Only the AS keyword and the parenthesized expression are required. If the trailing VIRTUAL or STORED keyword is omitted, then VIRTUAL is the default.

The value of a VIRTUAL column is computed when read, whereas the value of a STORED column is computed when the row is written. STORED columns take up space in the database file, whereas VIRTUAL columns use more CPU cycles when being read.

Features and Limitations

  • Generated columns must also have a defined data type (just like all columns in Tableland). Tableland will attempt to transform the result of the generating expression into that data type using the same affinity rules as for ordinary columns.
  • Generated columns may have NOT NULL, CHECK, and UNIQUE constraints, just like ordinary columns.
  • The expression of a generated column can refer to any of the other declared columns in the table, including other generated columns, as long as the expression does not directly or indirectly refer back to itself.
  • Generated columns may not have a DEFAULT clause. The value of a generated column is always the value specified by the expression that follows the AS keyword.
  • Generated columns may not be used as part of the PRIMARY KEY.
  • The expression of a generated column may only reference constant literals and columns within the same row, and may only use scalar deterministic functions. The expression may not use subqueries, aggregate functions, etc.
  • The expression of a generated column may refer to other generated columns in the same row, but no generated column can depend upon itself, either directly or indirectly.
  • Every table must have at least one non-generated column.
  • The data type of the generated column is determined only by the declared data type from the column definition. The datatype of the GENERATED ALWAYS AS expression has no affect on the data type of the column data itself.

Primary Key

Each table in Tableland may have at most one PRIMARY KEY. If the keywords PRIMARY KEY are added to a column definition, then the primary key for the table consists of that single column. Or, if a PRIMARY KEY clause is specified as a separate table constraint, then the primary key of the table consists of the list of columns specified as part of the PRIMARY KEY clause. The PRIMARY KEY clause must contain only column names. An error is raised if more than one PRIMARY KEY clause appears in a CREATE TABLE statement. The PRIMARY KEY is optional.

Each row in a table with a primary key must have a unique combination of values in its primary key columns. If an INSERT or UPDATE statement attempts to modify the table content so that two or more rows have identical primary key values, that is a constraint violation. Related, the SQL standard is that a PRIMARY KEY should always be NOT NULL, so Tableland enforces this constraint.

Feature At Risk: It is not currently possible to ALTER TABLE after it has been created. As such, table structure in Tableland is considered immutable. The Tableland core development team is currently evaluating whether to allow ALTER TABLE only in the case of adding new columns.


The DELETE command removes records from the table identified by the table id.


DELETE FROM table_id [ WHERE condition ]


If the WHERE clause is not present, all records in the table are deleted. If a WHERE clause is supplied, then only those rows for which the WHERE clause boolean expression is true are deleted. Rows for which the expression is false or NULL are retained.


The INSERT command creates new rows in a table identified by the table id.


INSERT INTO table_id [ ( column_name [, ...] ) ] VALUES (
  { expression } [, ...]




An INSERT statement creates one or more new rows in an existing table. If the column_name list after table_name is omitted then the number of values inserted into each row must be the same as the number of columns in the table. In this case the result of evaluating the left-most expression from each term of the VALUES list is inserted into the left-most column of each new row, and so forth for each subsequent expression. If a column_name list is specified, then the number of values in each term of the VALUE list must match the number of specified columns. Each of the named columns of the new row is populated with the results of evaluating the corresponding VALUES expression. Table columns that do not appear in the column list are populated with the default column value (specified as part of the CREATE TABLE statement), or with NULL if no default value is specified.

The alternative INSERT ... DEFAULT VALUES statement inserts a single new row into the named table. Each column of the new row is populated with its default value, or with a NULL if no default value is specified as part of the column definition in the CREATE TABLE statement.


An UPDATE statement is used to modify a subset of the values stored in zero or more rows of the database table identified by the table id.


UPDATE table_name
    SET { column_name = { expression | DEFAULT } } [, ...]
    [ WHERE condition ];


If the UPDATE statement does not have a WHERE clause, all rows in the table are modified by the UPDATE. Otherwise, the UPDATE affects only those rows for which the WHERE clause boolean expression is true. It is not an error if the WHERE clause does not evaluate to true for any row in the table; this just means that the UPDATE statement affects zero rows.

The modifications made to each row affected by an UPDATE statement are determined by the list of assignments following the SET keyword. Each assignment specifies a column-name to the left of the equals sign and a scalar expression to the right. For each affected row, the named columns are set to the values found by evaluating the corresponding scalar expressions. If a single column-name appears more than once in the list of assignment expressions, all but the rightmost occurrence is ignored. Columns that do not appear in the list of assignments are left unmodified. The scalar expressions may refer to columns of the row being updated. In this case all scalar expressions are evaluated before any assignments are made.

An assignment in the SET clause can be a parenthesized list of column names on the left and a ROW value of the same size on the right. For example, consider the following two “styles” of UPDATE statements: UPDATE table_id SET (a,b)=(b,a); or UPDATE table_id SET a=b, b=a;.


The GRANT and REVOKE commands are used to define low-level access privileges for a table identified by table name and id.


    ON { [ TABLE ] table_name [, ...] }
    TO role [, ...]

    ON { [ TABLE ] table_name [, ...] }
    FROM role [, ...]


The GRANT command gives specific privileges on a table to one or more role. These privileges are added to those already granted, if any. By default, the creator of a table (as specified by a public ETH address) has all (valid) privileges on creation. The owner could, however, choose to revoke some of their own privileges for safety reasons.

Related, if a table is created with an access controller contract specified, or if an address with sufficient privileges updates a table’s access control rules to use a controller contract, then all command-based access control rules are ignored in favor of the controller contract access control. In other words, if a controller contract is set, GRANT/REVOKE is disabled. See On-Chain API Specification for further details on specifying and controlling access via a controller smart contract.

Currently, the only allowable privileges for granting are INSERT, UPDATE and DELETE. Note that SELECT privileges are not required at this time, as SELECT statements are not access controlled (all reads are allowed). See the SELECT section for further details.

Roles (role) in Tableland are defined by an Ethereum public-key based address. Any (hex string encoded) ETH address is a valid Tableland role, and as such, privileges can be granted to any valid ETH address. In practice, ETH address strings must be specified as string literals using single quotes (e.g., '0x181Ec6E8f49A1eEbcf8969e88189EA2EFC9108dD').

Only a table owner has permission to GRANT or REVOKE access privileges to other roles/accounts.

Conversely to the GRANT command, the REVOKE command removes specific, previously granted access privileges on a table from one or more roles. All role definitions and allowable privileges associated with granting privileges also apply to revoking them.


The SELECT statement is used to query the database. The result of a SELECT is zero or more rows of data where each row has a fixed number of columns. A SELECT statement does not make any changes to the database.


The SELECT statement is the work-house of the SQL query model, and as such, the available syntax is extremely complex. In practice, most SELECT statements are simple SELECT statements of the form:

    [ * | expression [, ...] ]
    [ FROM from_clause [, ...] ]
    [ WHERE where_clause ]
    [ GROUP BY [ expression [, ...] ]
    [ ORDER BY expression [ ASC | DESC ]
    [ LIMIT { count | ALL } ]
    [ OFFSET { number } ]

See the standalone sections on WHERE, FROM, and JOIN clauses for further details on query structure.


Generating the results of a simple SELECT statement is presented as a four step process in the description below:

  1. FROM clause processing: The input data for the simple SELECT is determined. The input data is either implicitly a single row with 0 columns (if there is no FROM clause) or is determined by the FROM clause.
  2. WHERE clause processing: The input data is filtered using the WHERE clause expression.
  3. Result set processing (GROUP BY and result expression processing): The set of result rows is computed by aggregating the data according to any GROUP BY clause and calculating the result set expressions for the rows of the filtered input dataset.
  4. DISTINCT/ALL keyword processing: If the query is a SELECT DISTINCT query (see further details below), duplicate rows are removed from the set of result rows.

There are two types of simple SELECT statement — aggregate and non-aggregate queries. A simple SELECT statement is an aggregate query if it contains either a GROUP BY clause or one or more aggregate functions in the result set. Otherwise, if a simple SELECT contains no aggregate functions or a GROUP BY clause, it is a non-aggregate query.

Once the input data from the FROM clause has been filtered by the WHERE clause expression (if any), the set of result rows for the simple SELECT are calculated. Exactly how this is done depends on whether the simple SELECT is an aggregate or non-aggregate query, and whether or not a GROUP BY clause was specified.

One of the ALL or DISTINCT keywords may follow the SELECT keyword in a simple SELECT statement. If the simple SELECT is a SELECT ALL, then the entire set of result rows are returned by the SELECT. If neither ALL or DISTINCT are present, then the behavior is as if ALL were specified. If the simple SELECT is a SELECT DISTINCT, then duplicate rows are removed from the set of result rows before it is returned. For the purposes of detecting duplicate rows, two NULL values are considered to be equal.

WHERE clause


WHERE condition


The SQL WHERE clause is an optional clause of the SELECT, DELETE, and/or UPDATE statements. It appears after the primary clauses of the corresponding statement. For example in a SELECT statement, the WHERE clause can be added after the FROM clause to filter rows returned by the query. Only rows for which the WHERE clause expression evaluates to true are included from the dataset before continuing. Rows are excluded from the result if the WHERE clause evaluates to either false or NULL.

When evaluating a SELECT statement with a WHERE clause, Tableland uses the following steps:

  1. Determine the table(s) in the FROM clause,
  2. Evaluate the conditions in the WHERE clause to determine the rows that meet the given condition,
  3. Generate the final result set based on the rows in the previous step, with columns subset to match the SELECT statement.
The search condition in the WHERE clause is made up of any number of comparisons (=, <, >, LIKE, IN, etc), combined using a range of logical operators (e.g., OR, AND, ALL, ANY, etc).

FROM clause


FROM { table_name [ * ] [ [ AS ] alias ] | ( sub_select ) [ AS ] alias }


The input data used by a simple SELECT query is a set of N rows each M columns wide. If the FROM clause is omitted from a simple SELECT statement, then the input data is implicitly a single row zero columns wide (i.e. N=1 and M=0).

If a FROM clause is specified, the data on which a simple SELECT query operates comes from the one or more tables or subqueries (SELECT statements in parentheses) specified following the FROM keyword. A subquery specified in the table or subquery clause following the FROM clause in a simple SELECT statement is handled as if it was a table containing the data returned by executing the subquery statement.

If there is only a single table or subquery in the FROM clause (a common case), then the input data used by the SELECT statement is the contents of the named table. If there is more than one table or subquery in FROM clause, then the contents of all tables and/or subqueries are joined into a single dataset for the simple SELECT statement to operate on. Exactly how the data is combined depends on the specific JOIN clause (i.e., the combination of JOIN operator and JOIN constraint) used to connect the tables or subqueries together.

When more than two tables are joined together as part of a FROM clause, the JOIN operations are processed in order from left to right. In other words, the FROM clause (A join-op-1 B join-op-2 C) is computed as ((A join-op-1 B) join-op-2 C).

JOIN clause


JOIN from_clause [ ON on_expression | USING ( column_name [, ...] ) ]


Only simple JOIN clauses are supported in Tableland. This means that all joins in Tableland are based on the cartesian product of the left and right-hand datasets. The columns of the cartesian product dataset are, in order, all the columns of the left-hand dataset followed by all the columns of the right-hand dataset. There is a row in the cartesian product dataset formed by combining each unique combination of a row from the left-hand and right-hand datasets. In other words, if the left-hand dataset consists of  rows of  columns, and the right-hand dataset of  rows of  columns, then the cartesian product is a dataset of  rows, each containing  columns.

If the JOIN operator does not include an ON or USING clause, then the result of the JOIN is simply the cartesian product of the left and right-hand datasets. If the JOIN operator does have ON or USING clauses, those are handled according to the following rules:

  • If there is an ON clause then the ON expression is evaluated for each row of the cartesian product as a boolean expression. Only rows for which the expression evaluates to true are included from the dataset.
  • If there is a USING clause then each of the column names specified must exist in the datasets to both the left and right of the JOIN operator. For each pair of named columns, the expression "lhs.X = rhs.X" is evaluated for each row of the cartesian product as a boolean expression. Only rows for which all such expressions evaluates to true are included from the result set. The column from the dataset on the left-hand side of the JOIN operator is considered to be on the left-hand side of the comparison operator (=) for the purposes of comparison.
  • For each pair of columns identified by a USING clause, the column from the right-hand dataset is omitted from the joined dataset. This is the only difference between a USING clause and its equivalent ON constraint.

Data Types

Tableland supports a small set of accepted column types in user-defined tables. The currently supported types are listed below and can be used to represent most, if not all, common SQL types:

Signed integer values, stored in 0, 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
Floating point values, stored as an 8-byte IEEE floating point number.
Text string, stored using the database encoding (UTF-8).
A blob of data, stored exactly as it was input. Useful for byte slices etc.
Any kind of data. No type checking is performed, no data coercion is done on insert.


When creating tables, every column definition must specify a data type for that column, and the data type must be one of the above types. No other data type names are allowed, though new types might be added in future versions of the Tableland SQL specification.

Content inserted into a column with a data type other than ANY must be either a NULL (assuming there is no NOT NULL constraint on the column) or the type specified. Tableland will attempt to coerce input data into the appropriate type using the usual affinity rules, as most SQL engines all do. However, if the value cannot be losslessly converted in the specified datatype, then an error will be raised.

Columns with data type ANY can accept any kind of data (except they will reject NULL values if they have a NOT NULL constraint, of course). No type coercion occurs for a column of type ANY.

Common Types

For users looking for more nuanced data types in tables, the following set of recommendations will help guide table schema design. Additionally, new types might be added in future versions of the Tableland SQL Specification, and users are able to make requests/suggestions via Tableland TIPs.


Tableland represents all character/text types using the single variable-length TEXT type. Although the type TEXT is not in any SQL standard, several other SQL database management systems have it as well. You can store any text/character-based data as TEXT. Additionally, more complex data types such as dates, timestamps, JSON strings, and more can be represented using TEXT (or in some cases BLOB).


Numeric types consist of integer and floating-point (real) numbers. In practice, two-, four-, and eight-byte integers are all represented by the INTEGER type, and their storage size depends on the magnitude of the value itself. Conversely, all float/real types are represented by the REAL type, and unlike with integer types, all floating point values are stored as an 8-byte IEEE floating point number.


Tableland does not have a separate data type to represent boolean values. Instead, Tableland users should represent true and false values using the integers 1 (true) and 0 (false).


Tableland does not have a storage class set aside for storing dates and/or times. Instead, users of Tableland can store dates and times as TEXT, REAL, or INTEGER values:

  • TEXT as ISO-8601 strings.
  • REAL as Julian day numbers.
  • INTEGER as Unix Time (number of seconds since (or before) 1970-01-01 00:00:00 UTC).

Applications can choose to store dates and times in any of these formats and freely convert between formats using the built-in date and time functions. Tableland currently supports the six date and time functions provided by the SQLite database engine.

Feature At Risk: Note that these date and time function have not yet been formalized into the Tableland SQL language specification. You are welcome to use them for now, but they should be considered unstable features.


JSON data types are for storing JSON (JavaScript Object Notation) data, as specified in RFC 7159. In practice, Tableland stores JSON as ordinary TEXT. Users are able to manipulate JSON data using a number of functions that make working with JSON data much easier. For example, the json(X) function verifies that its argument X is a valid JSON string and returns a minified version of that JSON string (with all unnecessary whitespace removed). If X is not a well-formed JSON string, then this routine throws an error. The JSON manipulation functions supported by Tableland is derived from SQLite, which in turn is generally compatible (in terms of syntax) with PostgresQL. Tableland currently supports the 15 scalar functions and operators and two aggregate SQL functions for JSON data provided by the SQLite database engine.

Feature At Risk: Note that these JSON scalar functions, operators, and aggregate functions have not yet been formalized into the Tableland SQL language specification. You are welcome to use them for now, but they should be considered unstable features.

← Previous