From:Steve Adams
Date:27-Mar-2001 11:26
Subject:   Snapshot tool old - will increasing the buffer cache help?

Assuming that process B does not open its cursor until process A has completed, and that no further modifications to the data that it is reading are in progress, then your guess about delayed block cleanout failure is probably right. For that to be true however, there must be enough undo generation during the first running of process B to cause rollback segment extent reuse before process B can complete. Also given that a transaction is not marked with an upper bound commit SCN when delayed block cleanout fails with ORA-1555, for process B to succeed when run again, either there must be significantly less undo generation at that time, or the blocks in question must have been cleaned out by other short duration queries in the interim. This is not impossible, but I have a hunch that there is more to this than the simple delayed block cleanout failure case that you have suggested. Nevertheless, I'll proceed on the assumption that you are correct.

The threshold for commit cleanout is 10% of the buffers in the cache, which in your case is 6000 buffers. With each transaction only modifying 1000 rows, it is unlikely that you are exceeding that threshold unless there is a lot of index maintenance involved. Even then, you are unlikely to exceed that threshold because you have 16 concurrent sessions all modifying blocks. If your snapshot too old errors are indeed due to delayed block cleanout failure, it seems unlikely that an increase in db_block_buffers will enable commit cleanouts to succeed consistently, thus obviating the need for delayed block cleanout.

It could be that you are defeating commit cleanout with incremental checkpointing. If you have set db_block_max_dirty_target (or fast_start_io_target or log_checkpoint_interval under 8i) then DBWn could be writing the dirty blocks before their transactions commit. If so, they will not be cleaned out at commit, and subsequent delayed block cleanout would be necessary. If multiple sessions for process A can visit the same block, this effect could be exacerbated, because the placement of a buffer on the checkpoint queue depends on the time of its first modification. So if a session modifies a block that is already close to the head of the checkpoint queue because of an earlier modification, then it is easy to see how that block could be written before the transaction commits. Indeed, this could happen even for log switch checkpoints. In either case, a change in the size of the buffer cache will not make any difference.

Other than preventing extent reuse during the first running of process B (which is a sure-fire solution) your best bet if this is indeed a delayed block cleanout failure problem is to ensure cleanout in a separate step between processes A and B.

We have been getting "snapshot too old" error consistently on a process. The processing scenario is as follows:

  1. Process A runs and updates over 3 million rows which span over 175,000 data blocks. This process has been split into 16 sessions which run concurrently. The commit frequency is every 1000 rows.
  2. Process B runs next, single threaded, reading the rows that are updated in step 1. This process always fails with "snapshot too old" error and when rerun completes to success.

My guess is that we are faced with a delayed block cleanout situation because our buffer cache is only 60,000 blocks and we are updating over 175,000 blocks with over 3000 transactions. Also, the block size is 8k and we have 40 RBS with 2MB extent size in an LMT. Each RBS is created with 20 min extents and an optimal size of 40 MB.

Would increasing the buffer cache to 150,000 blocks alleviate this problem? If not, what would be the solution?