Considering Nulls and Indexes

It is common practice to allow columns for which the corresponding attribute value may sometimes not be available to contain a null. Null is not a value as such, but represents the absence of a value. Columns that may contain nulls are sometimes called nullable columns.

Instead of allowing a column to be nullable, it is possible to make the column mandatory and to define one or more special constants to be used in cases where the value is not available. For example, one or both of the following special constants could be used in a column that has a character data type.

'UN' Value Unknown
'NA' Attribute Not Applicable

Of course, you have to ensure that the special constants chosen fall outside the domain of valid values for the attribute.

The primary reason for using special constants rather than nulls is so that searches for missing values can be performed via an index. However, there are other factors that should be considered as well in each case. These are the considerations.

Table Size

Nulls take very little storage space, and none at all if they occur at the end of a row. Special constants increase the size of the table, and this may cause extra I/O to be required for table scans.

Index Size

If only a small number of rows are expected to have values, allowing the column to be nullable can greatly reduce the size of an index on the column because null keys are not indexed (except in cluster key indexes).

Searches for Available Values

The optimizer can evaluate IS NOT NULL predicates via an efficient fast full index scan. If special constants are used instead of nulls, a less efficient index range scan access path must be used to search for the other values.

Searches for Missing Values

Because null keys are not indexed, searches for nulls typically require a full table scan. Special constants are indexed and so enable searches for missing values to be performed via an index. In most cases the indexed access path is preferable to the full table scan.

Outer Joins

Special constants eliminate the need for outer joins. Outer joins constrain the optimizer's choice of join orders, whereas otherwise a full range of query plans is possible. Commonly the query plans enabled by the use of special constants outperform their outer join alternatives significantly.

Elegance

Some argue that the use of special constants rather than nulls leads to more readily understood code, and prevents the confusion often experienced by users (not to mention developers and DBAs) in relation to nulls and outer joins.

Inelegance

Conversely, others find the use of special constants inelegant, for example when using a -1 to represent an unknown quantity, or when using an epoch such as 1-JAN-1900 to represent an unknown date.

Copyright Ixora Pty Ltd Send Email Home