Questions and Answers

Buffer Cache


Which table is being scanned

6 June 1999

We had a situation where one of our programs was running forever, so it must have been doing full table scans. We took a guess and rebuilt the indexes, which fixed the problem. However, is there some way of finding out from the V$ tables the table name that is having a full scan performed against it?
? If a full table scan is being performed, you can query V$SESSION_WAIT for the P1 and P2 values of the db file scattered read waits. These represent the file and block numbers being read. You can then query DBA_EXTENTS to find out which segment is being scanned.
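For instance, while the scan is still running, a sketch along these lines can name the segment (the substitution variables are placeholders for the values just observed; this needs a live Oracle session to run):

```sql
-- See which file and block a session is currently reading in multiblock mode.
select sid, p1 file_id, p2 block_id
from v$session_wait
where event = 'db file scattered read';

-- Then look up the segment containing that block.
select owner, segment_name, segment_type
from dba_extents
where file_id = &file_id
and &block_id between block_id and block_id + blocks - 1;
```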


20 August 1999

I'm confused: One chapter in the Oracle 8 Tuning guide appears to say that the maximum number of db_block_lru_latches is 6 times the number of CPUs, while another seems to recommend only 2 times number of CPUs, and it then goes on to recommend setting it to less than the number of CPUs! So, which is it?
? The CPU count related limit is release specific:

7.3 ... 2 * cpu_count
8.0 ... 6 * cpu_count
8.1 .. 12 * cpu_count
There is also a limit of db_block_buffers / 50.

On a system with only a few CPUs the number of latches needed will be driven by your use of the KEEP and RECYCLE buffer pools, and the number of db_writer_processes. In such cases you may need several times more latches than you have CPUs.

On systems with lots of CPUs and lots of activity, the number of latches will be driven by the need to prevent latch contention. In such cases a setting equal to half the CPU count should be good enough.



28 September 1999

I've read that _small_table_threshold is the parameter which decides whether a table is a small table or not, and that it also decides the number of blocks to be kept in most recently used (MRU) end of the cache in case of full table scans. I want to know the value of _small_table_threshold value by default. The documentation I have says that the default is 5 blocks. But when I queried X$KSPPCV for the current value, it shows a value of 1280.
? This is the third time I have got this question in 24 hours! The default is not 5 blocks; it is 2% of db_block_buffers with a minimum of 4 blocks. The effect is exactly as you describe. Small tables are cached at the MRU end of the cache by default. The statistic is incremented for all logical small table scans, even against fully cached tables. It is also incremented for scans of small tables for which the NOCACHE keyword has been specified. It is not however incremented for large tables for which the CACHE keyword has been specified.

This parameter is session modifiable and system modifiable with deferred semantics. If changed dynamically, the parameter name needs to be enclosed in quotes to protect the leading underscore, as follows.

alter session set "_small_table_threshold" = 100;
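To confirm the current setting, the hidden parameter can be inspected as SYS with a query along these lines (the X$KSPPI / X$KSPPCV column names are as in Oracle 8):

```sql
-- As SYS: show the current value of the hidden parameter.
select i.ksppinm name, v.ksppstvl value
from x$ksppi i, x$ksppcv v
where i.indx = v.indx
and i.ksppinm = '_small_table_threshold';
```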


buffer busy waits

25 October 1999

I have the following V$WAITSTAT statistics on a heavy inserting instance.
CLASS                   COUNT       TIME
------------------ ---------- ----------
data block             350262     231895
segment header           5701      16139
I increased the number of database writers from 1 to 4, but it did not help much. The key tables already have 10 freelists.
? To make progress on this issue, you need to catch the p3 value of the buffer busy waits event, preferably by enabling event 10046, level 8, in an affected process and examining the trace file. Our trace_waits.sql script may help.

If p3 is 0, it indicates that a process waited for a consistent read buffer because the buffer was in READ state. If p3 is 1014, the process needed the buffer in current mode but found it in READ state. In either case, this should be addressed by ensuring that the segment concerned is in the KEEP buffer pool. The segment can be identified by looking up the extent containing the relevant file (p1) and block (p2) numbers in DBA_EXTENTS.

Otherwise, p3 is likely to be 1012. If you are doing discrete transactions, then you are receiving due recompense for your sins. Otherwise, find the segment name and type, and if it is a table either increase process freelists if it is insert intensive, or reduce row density per block otherwise. If it is an index, consider making it a reverse key index if the inserted keys are monotonically increasing.

If it is none of the above, then you have an unusual situation, and you may need to get a performance specialist to look at it. :-)


Full table scans and multiblock reads

27 October 1999

As per the tip on your web site, I decided to test the actual db_file_multiblock_read_count being used by Oracle. I got the actual value of the parameter, but what surprised me was the occurrences of the db file sequential read event in the trace output.

After looking at the output, I realized that Oracle had to skip a block during the read. In other words, Oracle only needed one block, so it used sequential read, then it skipped one block, then again it read one block by sequential read.

Could you explain why Oracle skips blocks and uses sequential reads?

? The skipped blocks are those that were already in cache. These blocks cannot be read again, because the copy in cache may be newer than the one on disk. If only a single block is being read, then the wait is a db file sequential read wait.

Note that it is most efficient to do a full scan on a table that is fully cached, or failing that, one that has no cached blocks at all. Occasional cached blocks just get in the way and break up the multiblock reads. Therefore tables that are routinely accessed by full scans should go into either the KEEP or RECYCLE buffer pools, not the default one. If you do have a large active table that sometimes needs to be accessed by a full scan, then it is best to do it in parallel even if you would not normally want to do so on other grounds. This is because parallel table scans do direct reads, and so the multiblock reads are not fragmented by the presence of cached blocks. However, this should only be done for large tables, because it places some load on DBWn, which has to checkpoint the blocks for the object prior to the direct read operation.


Full table scans

1 November 1999

Is it true that the buffer cache isn't used for full table scans? I have increased the db_block_buffers from 200 to 30000, and the cache hit rate has only gone up from 50% to 53%.
? Table scans do distort the cache hit rate. If done in parallel, they use direct I/O, which means that the physical reads are counted, but there are NO logical reads (consistent gets + db block gets). Understandably, this distorts the normal cache hit rate calculations.

Even serial table scans, which are performed through the cache by default, distort the cache hit rate significantly, because they get very few cache hits if any.

To get a better indication of whether there is room for improvement in the cache hit rate, you should attempt to calculate the cache hit rate exclusive of full table scans (and similar operations). Please see our tip on Calculating the Cache Hit and Miss Rates.


CACHE and cache_size_threshold

1 November 1999

Can a poor cache hit rate be due to setting the CACHE keyword on a large table?
? No, when big tables are scanned, the blocks are read in the MRU end of the cache if the CACHE directive has been specified, but ONLY until such time as the number of blocks specified by the cache_size_threshold parameter have been read. Thereafter, further blocks will be read into the LRU end, regardless of the CACHE directive.

The cache_size_threshold parameter defaults to 10% of db_block_buffers. If you use the CACHE directive, it is wise to check routinely that those tables have not grown beyond this threshold.
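A routine check might look like this sketch (DBA_TABLES.CACHE is blank padded, hence the LTRIM; the BLOCKS column is only populated for analyzed tables, and the 10% figure assumes cache_size_threshold has been left at its default):

```sql
-- Tables marked CACHE that have grown beyond 10% of db_block_buffers.
select t.owner, t.table_name, t.blocks
from dba_tables t
where ltrim(t.cache) = 'Y'
and t.blocks > (select to_number(p.value) / 10
                  from v$parameter p
                 where p.name = 'db_block_buffers');
```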


Table scan and index reads

3 November 1999

Is there a way to find out the number of blocks read through full table scans and number of blocks read through index reads - for each tablespace or datafile?
? Try taking the ratio of blocks read (PHYBLKRD) to read operations (PHYRDS) from V$FILESTAT. If all reads against the datafile have been full table scans, then the ratio will be close to the multiblock read count. If there have been no full table scans, then the ratio will be exactly 1. Any number in between implies a mix of table scans and single block reads. However, note that the multiblock reads for table scans are not always equal in size to the multiblock read count parameter, so care is needed when interpreting the ratio if it has an intermediate value.
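For example, a query of this form shows the ratio for each datafile (the PHYRDS filter avoids division by zero for unread files):

```sql
-- Average blocks per read operation for each datafile.
select d.file_name, s.phyrds, s.phyblkrd,
       round(s.phyblkrd / s.phyrds, 2) blocks_per_read
from v$filestat s, dba_data_files d
where d.file_id = s.file#
and s.phyrds > 0
order by blocks_per_read desc;
```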

cache buffers chains latch contention

12 November 1999

I have two long-running processes and they are competing on the cache buffers chains latches. Is there any way to reduce the contention or solve the problem, as they are running very slowly? This is Oracle
? Have a look in V$LATCH_CHILDREN to see whether the distribution of sleeps is even. The chances are that it is not, and that the contention is a side effect of bad SQL statements accessing the same few blocks repeatedly and intensively in a deeply nested query. Look at the execution plans of the queries and see what you can do to improve them.

cache buffers chains latches

18 November 1999

I have about a 5% miss rate on the cache buffers chains latches that I am trying to understand. When I look in V$LATCH_CHILDREN, I see 1153 latches for 4608 buffers - about a 1 to 4 ratio. Then joining to X$BH, I find that many of these latches manage only 2 or 3 blocks, but some have as many as 9 blocks. How are child latches assigned to blocks?
? Well done with what you've managed tracing things so far. You should try to narrow down your join to X$BH to show you only the blocks on the child latches with the most sleeps, and then use DBA_EXTENTS to find to which segments they belong. The allocation of blocks to latches is based on a hash function computed from the FILE# and BLOCK#. In your case the latch contention is most likely caused by some bad SQL. Look for something doing a lot of buffer gets, and accessing the offending database segment in the middle of a deeply nested execution plan.
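The drill-down might be sketched as follows (X$BH is only visible to SYS, and the substitution variable is a placeholder for the address of one of your worst child latches):

```sql
-- Find the child latches with the most sleeps.
select addr, gets, misses, sleeps
from v$latch_children
where name = 'cache buffers chains'
order by sleeps desc;

-- Then, as SYS, list the buffers protected by one such latch.
select file#, dbablk, obj
from x$bh
where hladdr = '&latch_addr';
```

The file# and dbablk values can then be looked up in DBA_EXTENTS as usual.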


18 November 1999

What does the _db_block_cache_protect parameter do?
? It uses the mprotect() system call to disable access to the block buffers. Every legitimate access must then enable access for the buffer it needs and disable it again afterwards. This protects you against bugs or hackers corrupting the cache, but it is not really necessary because the log_buffer has similar protection by default. It also adds about a 30% overhead to system performance. It should only be used by Oracle when diagnosing memory corruption bugs.

Of course, this parameter does nothing on platforms that do not support memory protection, or that support process-based memory protection but have a thread-based Oracle implementation (NT).


cache buffers chains latches

19 November 1999

We have some contention for the cache buffers chains latches. There are 700 child latches, and the top 3 have about 16% of the sleeps. I can find the individual buffers covered by each of the child latches by joining V$LATCH_CHILDREN.ADDR to X$BH.HLADDR, but how do I find out which buffers are hot, and what would be the best course of action for hot buffers? Should the _db_block_hash_buckets value be increased?
? I presume that you are focusing your examination of the child latches on the few that suffer the greatest number of sleeps. If so, you know that at least one of the buffers protected by each latch is hot, by virtue of the sleeps. You can work out which ones by repeating the X$BH query a few times until you can eliminate the others by virtue of the fact that they are not always in the cache. Typically you will find that the hot blocks are index root and 1st level branch blocks. This is a symptom of intensive access of that index in concurrent nested loops joins. Your best course of action is to tune the SQL. Do not bother to change _db_block_hash_buckets unless you are unlucky enough to have more than one hot block hashing to the same latch.


20 November 1999

I have read that setting the _db_block_hash_buckets parameter to be equal to the number of db_block_buffers will ensure direct access to all data buffers, eliminating linear searches. Is this advisable, or is the default value of 0 good enough?
? Prior to Oracle 8.1, the default value for _db_block_hash_buckets is actually the least prime number greater than or equal to db_block_buffers / 4. There is rarely any benefit in adjusting this parameter. The only case is that in which there is an unlucky coincidence of hot blocks on a particular hash bucket, but that can also be resolved by a small change to db_block_buffers itself. However, if you have a lot of hot blocks, then a larger value for _db_block_hash_buckets is more likely to hash them to distinct latches. I once actually set this much higher than db_block_buffers itself, in order to get exactly that effect.

The suggestion that the proposed change would result in direct access is not correct.


Caching of index-organized tables

7 December 1999

When an IOT is read into memory, is it like an index or a table in terms of the LRU and aging out memory?
? An IOT is physically an index, and is cached accordingly. However, indexes can be accessed via fast full scans, in which case they are treated just like full table scans in regard to caching.

Free buffers

9 December 1999

I have read in one of the performance books that "If free buffers are not available (none with STATE=0) within the first 30 minutes of business hours, you probably need to set db_block_buffers higher." I have checked, and we have no free buffers within a minute of startup, but my buffer cache hit ratio is already better than 99%.
? The advice that free buffers should be available after a certain period is wrong prior to 8.1, and dubious for 8.1.


9 December 1999

I'm curious about the OBJ column in X$BH. I've been joining it to OBJ$.OBJ# to get objects in the buffer cache, but a large quantity of rows do not get a match. Is there an explanation for these rows?
SQL> select obj, count(*)
  2  from x$bh
  3  where not exists ( select null from obj$ where obj# = x$bh.obj )
  4  group by obj;

       OBJ   COUNT(*)
---------- ----------
     25295       1055
     25296       1711
     25297         41
4294967295         16
? Try joining X$BH.OBJ to OBJ$.DATAOBJ#. That should give you matches for most of the buffers, but it will give you multiple matches for clusters containing multiple tables. It is possible to have buffers that don't match a data layer object number too, for example after a truncate or drop, however, that is not your problem here. The data object numbers are managed using the _NEXT_OBJECT row in OBJ$ via recursive transactions. Therefore you would be most unlikely to catch a mismatch due to transactional semantics, however, that too is possible. The 0xffffffff object number is used for rollback segment header and data blocks, temporary segment blocks, and the compatibility (cache bootstrap) segment.
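For example, a join of this shape (run as SYS) resolves most buffers to their owning objects:

```sql
-- Match cached buffers to objects by data object number.
select u.name owner, o.name object, count(*) buffers
from x$bh b, obj$ o, user$ u
where o.dataobj# = b.obj
and u.user# = o.owner#
group by u.name, o.name
order by buffers desc;
```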

Current mode block gets during a query

23 December 1999

Why do I see current mode gets when just doing a query? I checked out trace file and that showed that it's at the time the first fetch is going on.
? At the start of a query, current mode block gets are needed for the segment header blocks of all segments involved in the execution plan. Thereafter, any current mode gets are probably against rollback segment header and data blocks needed for delayed block cleanout of old row-level locks in the blocks being queried.

Undo block reads

25 January 2000

Are rollback segment data blocks read into the database block buffers like those of any other segment?
? Rollback segment data blocks are not normally read into the buffer cache prior to being changed. They are merely "newed". That is, a free buffer is requested and is formatted as the target block. The redo record contains a change vector with an opcode representing this operation. Other than this, there is nothing special about the treatment of undo blocks in the cache.

Poor cache hit rate

5 February 2000

After increasing db_block_buffers to four times its original size, the buffer cache hit ratio is still about 55% and not improving. Any suggestions?
? I presume you've seen our tip on Calculating the Cache Hit and Miss Rates, and that it is not a calculation error. If so, then you probably have some rather poor SQL statements that need help, possibly with the addition of an index or two. Try our expensive_sql.sql script to find them. Also, why not take a full utlbstat / utlestat report and send it to me.
? Here is the report. It is winzipped.
? You are getting more than 90% of your data via full table scans. May I repeat my recommendation to use expensive_sql.sql to find the offending statements and work out how to fix them.

buffer busy waits and V$WAITSTAT

5 February 2000

What is the difference between buffer busy waits as reported in V$SYSTEM_EVENT and the information that is reported in V$WAITSTAT? Does the total count of waits from V$WAITSTAT define the buffer busy waits in more detail?
? That's right, V$WAITSTAT is just a breakdown by buffer class of the buffer busy waits shown in V$SYSTEM_EVENT.
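This can be confirmed by comparing the totals from the two views (allowing for the interval between the two queries):

```sql
select class, count, time
from v$waitstat;

select total_waits, time_waited
from v$system_event
where event = 'buffer busy waits';
```

The sum of the counts from the first query should closely match the total waits from the second.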

buffer busy waits

9 February 2000

We have some big tables that all users insert into and update. I noticed some waits on data block and segment header blocks. I could recreate the tables with more FREELISTS, but I can't find a block with an ITL count higher than 1 to get the highest number of concurrent transactions on the table. If I increase the number of FREELISTS, does it help even if it is not as high as the number of concurrent transactions? Is there a drawback in specifying many FREELISTS?
? I assume you've been doing some block dumps to look at the ITL count. Well done. That means that you have understood a lot of the issues here already. Increasing the number of FREELISTS will not hurt, but in this case it will probably not help either. There is a second reason for data block waits that you should consider, namely that of reference data that is not being cached in a KEEP buffer pool being simultaneously required by two sessions. You can identify these from the p3 parameter value of the waits, which you can get with our trace_waits.sql script. The segment header contention is possibly caused by having PCTFREE and PCTUSED too close together on some important tables.


9 February 2000

If I cache a table, does it take memory from db_block_buffers, or is it cached somewhere else?
? Despite the fact that the CACHE keyword makes it sound like the table will be kept in memory, that is not what it means. If you want that functionality, you need to set up a KEEP buffer pool and assign the table to that buffer pool.

The CACHE keyword just alters the retention of blocks read for full table scans. By default, the blocks from small tables are retained in cache subject to normal LRU semantics. But blocks from large tables are immediately reused. "Small" means less than or equal to 2% of db_block_buffers (or 4 blocks for a cache of less than 200 blocks). Whereas, if the CACHE keyword or hint has been used, then blocks from scans of large tables are retained in cache (not immediately reused) until the number of blocks read reaches the cache_size_threshold, which is 10% of db_block_buffers by default. Note however that blocks thus retained are not kept in cache indefinitely. They are subject to normal LRU semantics. The only difference is that otherwise, they would have been immediately reused.

? Is the CACHE keyword redundant in Oracle8, because of the introduction of the KEEP buffer pool?
? It is redundant in 8.1, and the cache_size_threshold parameter is obsolete, because of the radical changes in the way that the buffer cache is used. However, at 8.0 the CACHE keyword must still be used for large objects that are intended to be read into a KEEP buffer pool via full scans. Otherwise, each multiblock read will reuse the same set of buffers, and the segment will be neither cached nor kept, except for the last couple of blocks. Other than that, the CACHE keyword should no longer be used under Oracle8.


15 February 2000

X$BH.DBAFIL shows that I have around 1200 blocks in cache from my rollback segment datafile. At about the same time, however, X$BH.CLASS only shows 1 or 2 blocks in classes 9 and 10. These are the classes for rollback segment header and data blocks, right?
? No, the block class for undo blocks depends on the segment number. Under Oracle7, it is 7+2n for the header, and 8+2n for the data blocks, where n is the rollback segment number. Under Oracle8, it is 11+2n for the header, and 12+2n for the data blocks.
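For example, under Oracle8 a buffer of class 17 is the header block of rollback segment (17 - 11) / 2 = 3, and class 18 is a data block of the same segment. The distribution can be seen as SYS with:

```sql
-- Undo block classes in the cache (Oracle8 numbering).
-- Header blocks are class 11 + 2n, data blocks 12 + 2n,
-- where n is the rollback segment number.
select class, count(*) buffers
from x$bh
where class > 10
group by class;
```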

cache buffers chains latches

16 February 2000

By looking at V$LATCH_CHILDREN, I have found that less than 1% of the cache buffers chains latches account for 30% of the gets and 90% of the misses. Obviously, the operations on these latches are highly skewed. One latch is particularly bad. However, given that the hit rate on cache buffers chains latches is about 99.3%, should I be concerned about the highly skewed distribution of misses?
? If the sleeps represent a large proportion of all latch sleeps, and if you have significant latch free waits in V$SYSTEM_EVENT, then the problem should be investigated further, regardless of the hit ratio. Another reason why you should pursue this further is that most such distributions of gets and sleeps against these latches are in fact a symptom of bad SQL performing huge numbers of buffer gets. If you can find and fix that, you may get a much bigger performance boost than that which relates merely to avoiding the latch sleeps.

LRU list position

7 March 2000

Is it possible to get the position of a buffer on the LRU list?
? Yes, with difficulty, but it's worth the effort if you want to understand the cache behavior. For simplicity, it is probably best to set the db_block_lru_latches parameter to 1 to ensure that you only have a single LRU chain to look at. There are two ways to look at the order of the buffers. One way is to take an immediate dump of the BUFFERS structure at level 4, and then to "join" the two relevant sections of the trace file using awk or perl. The thing to note is that the replacement (LRU) list pointers actually point to a location 8 bytes into the buffer header, and so you need to allow an 8 byte offset in the join condition, otherwise nothing will match. The second approach, which is the one I actually use, is to spool the output of the following query (with appropriate casting on recent versions) to a file,
select set_ds, dirty_queue, addr, nxt_repl - 8, prv_repl - 8, obj, file#, dbablk from sys.x_$bh;
and sort on the first two fields to group the buffers in each working set and buffer list together. Then for each buffer list, the following awk code is used to follow the pointers to find the order of the blocks.
awk '{
    # Fields per line (after stripping the two leading sort fields):
    # addr, nxt_repl - 8, prv_repl - 8, obj, file#, dbablk
    addr[NR] = $1
    nxt[$1] = $2
    prv[NR] = $3
    obj[$1] = $4
    fil[$1] = $5
    blk[$1] = $6
}
END {
    # The head of the list is the buffer whose prv pointer does not match
    # the address of any buffer in the list. Walk the nxt pointers from
    # there, printing each buffer in list order.
    for (i = 1; i <= NR; i++)
        if (! nxt[prv[i]])
            for (p = addr[i]; fil[p]; p = nxt[p])
                printf "%4d %5d %4d\n", obj[p], fil[p], blk[p]
}'
Happy researching!

Buffer cache latches

24 March 2000

I am bit unsure of the difference between the cache buffer chains latch and the cache buffers lru chain latch. My understanding is that the latter one is used when a process requires a free buffer in the cache, but when is the cache buffer chains latch used and how?
? The key to understanding the difference is understanding the structures that they protect. The cache buffers lru chain latches protect the LRU chain pointers. The LRU chains are used to find free buffers, to move buffers to the MRU end when appropriate, and for various things associated with writing dirty blocks and checkpoints. The cache buffers chains latches protect the hash chains. The hash chains are used to access cached blocks via a hash function based on their file and block numbers.
Copyright Ixora Pty Ltd