From:Steve Adams
Date:13-Dec-2000 00:21
Subject:   CBO default assumptions without statistics

You can probably rely on CBO doing a reasonable job without statistics, as long as your schema is well indexed. However, if your instance optimizer_mode is set to RULE, you can maintain statistics without any risk of getting CBO where you don't want it. That would be a safer approach in the medium term. Long term you should look to migrating to CBO.

Thanks Steve for the detailed explanation. Our database has RULE based optimizer setup. But when ever I give a hint FIRST_ROWS in queries which are slow, I get excellent response. Our tables are not analyzed. Can we really rely on this performance in this scenario. The database is growing very fast, but we have a 10 minutes outage every night, so I think the library cache problem will not be there.

The details are version specific, but you can work out many of the defaults from 10053 trace files. For example, here are the assumptions for a small table at 8.1.6.

Table stats    Table:  A   Alias:  A
  TOTAL ::  (NOT ANALYZED)    CDN: 2854  NBLKS:  141  TABLE_SCAN_CST: 22 AVG_ROW_LEN:  100
Note that Oracle has examined the segment header block to get the exact number of blocks below the high-water mark (NBLKS) even though there are no statistics. The average row length is assumed to be 100 bytes. The number of rows in the table (CDN) has been estimated as 141 * 2024 / 100. Where 2024 is the database block size less the cache header size (kcbdtl).

One twist here is that the segment header block is only read in this way for the first query against the table, and the assumed cardinality is then cached in the library cache definition of the table as the number of rows. Oracle will continue to generate execution plans based on this assumed cardinality, even if the table high-water mark changes significantly. That of course can bias the optimizer to full table scans inappropriately. If this happens, you can force Oracle to invalidate the cached table definition by granting and then revoking a permission on the table to some user.

Here are the default index statistics for a not-null column index on the same table. Note that the optimizer assumes a 1 level index B*-tree with 25 leaf blocks regardless of the actual size of the index. The other assumptions are also invariant. Needless to say this biases the optimizer towards index based plans for large tables.

  INDEX#: 3725  COL#: 1 
    TOTAL ::  LVLS: 0   #LB: 25  #DK: 100  LB/K: 1  DB/K: 1  CLUF: 800
There are also defaults for predicate selectivity in the absence of column statistics. Here we can see a default assumption of 89 distinct values, which equates to a selectivity of 0.011 for equality predicates.
Column:       COL2  Col#: 1      Table:  A   Alias:  A
    NO STATISTICS  (using defaults)
    NDV: 89        NULLS: 0         DENS: 1.1212e-002
Other default selectivities of interest are 0.05 for an unbounded range or inequality predicate, and 0.0025 for a bounded range. Default selectivities also apply for predicates using bind variables even when statistics are available. For example, an equality predicate for a bind variable against a non-unique column has a default selectivity of 0.009, and a LIKE predicate for a bind variable against a character column has a default selectivity of 0.25.

Of course, this is all very much version specific and subject to change without warning, so it would be most unwise to rely on any particular optimizer behaviour surviving to the next version!

When I put a hint FIRST_ROWS to a SQL I guess Oracle uses cost based optimizer. If the tables are not analyzed and the statistic columns are all NULL, what figures does Oracle use while defining the execution plan?