Archiver Tuning

Your first priority when tuning archiving is to ensure that LGWR never gets stuck on a log switch waiting for ARCn to finish archiving a log file. Your second priority is to minimize the impact that ARCn has on foreground processes when it is active. Of course, these two priorities are opposed to one another. Our preference is to tune ARCn to run quickly, and only slow it down if there is evidence that it is impacting foreground performance.

In addition to following the disk configuration principles outlined in our series of database creation tips, the key areas to tune are the size and number of buffers available for archiving, and the number of ARCn processes.

What size buffers?

ARCn reads buffers of _log_archive_buffer_size, specified in log blocks, from the online log file, and writes them to the archive destinations. Therefore, ARCn's performance is maximized, and its load on the I/O subsystem is minimized, if _log_archive_buffer_size is set to the maximum value possible under your operating system. If the operating system imposes no limit on this parameter, then it is best set to several times the maximum physical I/O size.
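As a hedged illustration only, the parameter could be raised in the init.ora as follows. The value shown is an assumption for a platform with a 512-byte log block size and a 64K maximum physical I/O size, not a recommendation; hidden (underscore) parameters should be changed only on advice from Oracle Support.

```
# init.ora sketch -- values are illustrative, not recommendations.
# With a 512-byte log block size, 128 log blocks = 64K per archival read.
_log_archive_buffer_size = 128
```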

How many buffers?

If multiple _log_archive_buffers are available and if asynchronous reads are possible from the online log files, then ARCn will use the aio_read() system call to read the redo into multiple buffers in parallel. If multiple log members are available, then parallel asynchronous reads will be addressed to distinct log file members in order to spread the disk I/O load. If hardware or software mirroring is used in preference to log file multiplexing, then similar load balancing is performed automatically by the operating system or hardware. Therefore, you should plan for the parallelism of archiving reads to be equal to the number of disks used for each log file, whether mirrored or multiplexed, and configure that number of _log_archive_buffers up to a maximum of three.

Note that multiple _arch_io_slaves cannot be used to simulate parallel asynchronous reads from the log file members. ARCn always performs this task itself and only uses the I/O slaves for writing.

To sustain asynchronous archiving writes if possible, two more _log_archive_buffers should be made available, over and above those required for parallel reading from the log files. This allows for maximum archival performance. However, if your system is so CPU bound that foreground performance is noticeably affected during archival, and if there is no risk of an archival backlog, then the number of buffers may be reduced to spread the CPU impact of archival over a longer period.
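Putting the two guidelines together, a sketch for a site with two-way multiplexed log files (so two parallel reads) plus the two extra buffers for asynchronous writes might be as follows. Again, this is an assumed configuration for illustration, not a recommendation, and underscore parameters warrant advice from Oracle Support.

```
# init.ora sketch -- illustrative only.
# 2 buffers for parallel reads from the two log file members,
# plus 2 buffers to keep asynchronous archival writes streaming.
_log_archive_buffers = 4
```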

How many processes?

As explained in Disk Configuration for Archived Log Files, it is normal to require multiple ARCn processes if heavy redo generation will be sustained, or if archiving to multiple destinations. From release 8.1, this is allowed for by setting the log_archive_max_processes parameter. Under earlier releases, it is necessary to schedule a regular job to run the ALTER SYSTEM ARCHIVE LOG ALL command. This command has very little impact if there is no archival backlog; it merely takes a brief shared lock on the CF enqueue. However, if an archival backlog is developing, this command effectively spawns an extra ARCn process to help catch up the backlog.
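Under a pre-8.1 release, the regular job could be as simple as a cron entry driving the command through a script along these lines. This is only a sketch: the 5-minute interval is an arbitrary assumption, and the connection details are placeholders to be adapted to your site's security arrangements.

```shell
#!/bin/sh
# archive_log_all.sh -- sketch only; adapt the connection details to your site.
# Cheap if there is no archival backlog; effectively spawns an extra
# archiver to catch up if a backlog is developing.
sqlplus -s "your_privileged_connect_string" <<'EOF'
alter system archive log all;
exit
EOF
```

A crontab entry such as `0,5,10,15,20,25,30,35,40,45,50,55 * * * * /usr/local/bin/archive_log_all.sh` would then run it every five minutes (interval illustrative).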

The ability to run multiple ARCn processes is your most important insurance against LGWR getting stuck behind archival backlogs. However, for this strategy to be effective, it is essential that the disks for the online files and archive destinations be configured appropriately.

Because manual archiving is always possible, Oracle holds an exclusive WL (writing log) enqueue lock on any online log file while archiving it, regardless of the log_archive_max_processes setting. Operations against these enqueues are protected by the archive control latch. Archival activity can be inferred from standard performance reports (such as those produced with utlbstat.sql and utlestat.sql) by examining the gets against this latch.
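Outside of the standard reports, the same information can be sampled directly. A sketch of such a query follows; taking the figures twice and differencing them gives the activity for the interval.

```sql
-- Gets against the archive control latch indicate archival activity.
select gets, misses, sleeps
  from v$latch
 where name = 'archive control';
```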

Are there any other issues?

Yes, there are two more dangers to be aware of that can impact ARCn performance.

In a parallel server environment, when an instance is not up, its thread is closed but remains enabled. If the current SCN of a closed, enabled thread falls behind the force SCN, then a log switch is forced in that thread, and the ARCn process of an active instance archives the log file on behalf of the inactive instance. Of course, this distracts that ARCn process from doing its own job. Although it will suspend this work to archive a log for its own instance if posted by its LGWR, the situation is best avoided entirely, either by keeping the idle instance up or by disabling its redo thread.
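If keeping the idle instance up is impractical, its redo thread can be disabled from a surviving instance along these lines (a sketch; thread 2 is an assumed thread number, and the thread must be closed before it can be disabled).

```sql
-- Disable the idle instance's redo thread so that active instances
-- need not archive logs on its behalf (thread number is illustrative).
alter database disable thread 2;
```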

Archival is also disabled entirely during the roll forward phase of instance and media recovery. So try to ensure clean shutdowns, and keep redo generation to a minimum during and shortly after online recovery.


Copyright Ixora Pty Ltd