Indices

Indices are one of the reasons why RDBMSes are fast at retrieving data. An index is built from the data in user-specified columns when rows are inserted into the database, and is consulted when data is selected or retrieved, which in most cases avoids a full table scan during read operations. Indices make insertions (and other operations that modify indexed columns) slightly slower, but can make data extraction and joins orders of magnitude faster. An index can span multiple columns, and could even include all the columns of a table (although such an index would be of limited use). A DBMS usually allows more than one index to be defined per table (the maximum number might be constrained).
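As a rough illustration of the difference an index makes to a read, the sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN to compare how the same query would be executed before and after an index is created. SQLite stands in for the server here, and the people table and people_name index are made-up names; other RDBMSes report their plans differently.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the RDBMS.
# The "people" table and "people_name" index are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(i, f"name{i}", f"city{i % 10}") for i in range(1000)],
)


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM people WHERE name = 'name42'"

# No index on "name" yet: the planner has to scan the table.
plan_without_index = plan(query)

# After creating the index, the same query can search through it.
conn.execute("CREATE INDEX people_name ON people (name)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The first plan should describe a scan of the whole table and the second a search through people_name; the exact wording depends on the SQLite version.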

Keys (or unique indices) can be seen as "a stronger kind of index". A key is an index which is also constrained to be unique: having two rows with the same key in a table is forbidden, for any key defined on the table. The interpretation the various databases give to this concept varies, however. For some (including MySQL), a key is merely an alias for "index". For others, indices can be used to enforce constraints but have no impact on data organization while keys do. For further information on your server's concept of keys, consult its manual.

One key is special and is called the "primary key". The data is usually stored in such a way that read operations involving only the primary key are even faster than operations involving other keys or indices. The primary key is also usually very slow to update, and is not allowed to contain NULL values.

Unique indices are the means SQL provides to avoid duplicate rows: define one that spans all the columns whose combination you wish to keep unique, which may even be all of them. Multiple constraints can be expressed by defining multiple unique indices.
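A minimal sketch of a unique index spanning two columns, again using Python's sqlite3 module as a stand-in for the server. The streets table and the streets_unique index are hypothetical names chosen for the example.

```python
import sqlite3

# Assumption: in-memory SQLite; "streets" and "streets_unique" are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE streets (area TEXT, street TEXT)")

# The pair (area, street) must be unique; each column alone may repeat.
conn.execute("CREATE UNIQUE INDEX streets_unique ON streets (area, street)")

conn.execute("INSERT INTO streets VALUES ('north', 'Main St')")
# Same street name in a different area: accepted.
conn.execute("INSERT INTO streets VALUES ('south', 'Main St')")

try:
    # Exact duplicate of the first row: rejected by the unique index.
    conn.execute("INSERT INTO streets VALUES ('north', 'Main St')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM streets").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```

Only the combination of the two columns is constrained, which is why the second insert succeeds while the third fails.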

The syntax to create an index varies from RDBMS to RDBMS. However, there are two main syntaxes we'll explain here. Consult your server's documentation for details on the syntax it supports.

MySQL Syntax

MySQL places index and key definitions inside the table creation statement. The basic syntax is:

CREATE TABLE name ( declaration [, declaration ...])

where a declaration is either a column declaration, a key declaration or an index declaration. For column declarations, see the Creating Tables page.

For primary keys, unique indices and plain indices the syntax is, respectively:

PRIMARY KEY (column [,column...])
UNIQUE INDEX index_name (column [,column...])
INDEX index_name (column [,column...])

The names for indices (unique or not) must be unique in a table (no pun intended).

This is the definition of the "areas" table in the sample database:

CREATE TABLE areas (
  id tinyint NOT NULL auto_increment,
  name char(20) NOT NULL,
  PRIMARY KEY (id),
  UNIQUE INDEX name (name)
)

There are two constraints: the area id must be unique, as must the area name. Joins are made on the primary key for efficiency purposes.

Postgres Syntax

With PostgreSQL and other databases, indices are seen not as part of a table definition but as "external" entities attached to a table. They are created by a CREATE INDEX statement, whose basic syntax is

CREATE [UNIQUE] INDEX index_name ON table (column [, column ...])

Primary keys are defined using the same syntax as MySQL.

With PostgreSQL, the definition above would be:

CREATE SEQUENCE areas_seq;

-- PostgreSQL has no tinyint type; smallint is its smallest integer type
CREATE TABLE areas (
  id smallint NOT NULL DEFAULT NEXTVAL('areas_seq'),
  name char(20) NOT NULL,
  PRIMARY KEY (id)
);

CREATE UNIQUE INDEX unique_area ON areas (name);

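Notice that the two styles overlap: MySQL (3.22 and later) also supports the CREATE INDEX statement, while PostgreSQL accepts PRIMARY KEY and UNIQUE constraints (though not plain INDEX declarations) inside CREATE TABLE.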
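The idea of an index as an "external" entity attached to a table can be sketched with Python's sqlite3 module, since SQLite also accepts the CREATE [UNIQUE] INDEX form. This is an illustration under assumptions, not the sample database itself: INTEGER PRIMARY KEY replaces the sequence and auto_increment column, and the DROP INDEX statement used at the end is not covered above and its syntax differs between servers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: SQLite has no sequences and no MySQL-style auto_increment
# keyword; INTEGER PRIMARY KEY assigns ids automatically and stands in.
conn.execute("CREATE TABLE areas (id INTEGER PRIMARY KEY, name char(20) NOT NULL)")


def index_names():
    """List the names of the indices currently attached to 'areas'."""
    # PRAGMA index_list rows start with (seq, name, unique, ...).
    return [row[1] for row in conn.execute("PRAGMA index_list('areas')")]


# The index is created separately from the table...
conn.execute("CREATE UNIQUE INDEX unique_area ON areas (name)")
indexes_before = index_names()

conn.execute("INSERT INTO areas (name) VALUES ('North')")
try:
    conn.execute("INSERT INTO areas (name) VALUES ('North')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# ...and can be removed again without touching the table definition.
conn.execute("DROP INDEX unique_area")
indexes_after = index_names()

# With the unique index gone, the duplicate name is accepted.
conn.execute("INSERT INTO areas (name) VALUES ('North')")
row_count = conn.execute("SELECT COUNT(*) FROM areas").fetchone()[0]

print(indexes_before, duplicate_rejected, indexes_after, row_count)
```

The uniqueness constraint exists only for as long as the index does, while the table and its primary key are unaffected by creating or dropping it.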

Single-Column Primary Keys

If your table has a primary key spanning a single column, you can declare it simply by appending the "PRIMARY KEY" keyword to the column definition:

CREATE TABLE areas (
       id tinyint NOT NULL auto_increment PRIMARY KEY,
       ...
)

Notice that in most cases the PRIMARY KEY clause implies the NOT NULL clause.
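The inline form and the table-level form can be compared with a short sketch using Python's sqlite3 module, which accepts both. SQLite and its PRAGMA table_info command are assumptions of this example, as is the area_links table, which exists only to show a key that spans two columns and therefore needs the table-level form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-column key declared inline on the column definition.
conn.execute(
    "CREATE TABLE areas (id INTEGER NOT NULL PRIMARY KEY, name char(20) NOT NULL)"
)

# Hypothetical two-column key: this needs the table-level PRIMARY KEY clause.
conn.execute(
    "CREATE TABLE area_links (from_id INTEGER, to_id INTEGER, "
    "PRIMARY KEY (from_id, to_id))"
)


def key_columns(table):
    """Return the primary key column names of a table, in key order."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk);
    # pk is 0 for non-key columns, otherwise the 1-based position in the key.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    ordered = sorted(rows, key=lambda r: r[5])
    return [name for _, name, _, _, _, pk in ordered if pk]


print(key_columns("areas"))       # ['id']
print(key_columns("area_links"))  # ['from_id', 'to_id']
```

Both declarations end up recorded as the table's primary key; the inline form is just shorthand for the single-column case.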