|
| System CPU time | 3 October 1999 |
|
What tools do I use to determine what process causes increases in system time?
I am on Digital Unix 4.0d running on an Alpha 8400.
Will increasing the min_free_list (or similar) or decreasing the UBC cache help reduce the system time from the following vmstat snapshot? I know you don't have enough info here ... but please point me in the right direction.
procs       memory            pages                                 intr          cpu
   r w   u   act  free  wire  fault   cow  zero  react   pin  pout   in   sy  cs  us  sy  id
 71109  33  231K    98   24K   225M   32M   63M     2M   23M   69K  152   7K  2K  11   4  86
 71104  37  231K   105   24K   1966   113  1664     45    99     0  983  27K  7K  29  22  48
111105  35  231K    96   24K   3982   240  1182    140   130     0   1K  31K  9K  23  26  51
 91103  37  231K   119   24K    15K  3177  6147      0  2514     0  928  35K  7K  40  42  18
111105  35  231K   181   24K   4975   486  1745     10   372     0  806  40K  9K  41  28  31
101103  34  231K   156   24K   6618   519  2760    514   354     0  728  27K  6K  36  27  36
 51108  34  231K   176   24K   3896   109  1350      8   181     0  633  23K  6K  31  24  46
111103  32  231K   310   24K   2208   149  1314      4   129     0  590  29K  7K  32  28  40
 81104  34  231K   137   24K   2989   248  1351      3   169     0  679  26K  6K  46  19  34 | ||
|
A common misconception about system mode CPU usage is that it is due to the activity of system processes rather than user processes.
This is not correct.
The majority of system time is normally accounted for by user processes executing in system mode.
Whenever a user process performs a true system call,
the CPU switches into system mode in order to be able to access and update kernel data structures
that the user process would not otherwise be able to see or change.
For example, a simple read or write system call uses a fair amount of system mode CPU time.
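A minimal Python sketch of this effect, assuming a Unix-like system with /dev/zero: it issues a large number of one-byte reads and reports how the process's own CPU time splits between user mode and system mode. The helper names cpu_split and read_heavy and the loop count are illustrative only, and the actual split will differ from platform to platform.

```python
import os


def cpu_split(work):
    """Run work() and return (user_seconds, system_seconds) charged to this process."""
    before = os.times()
    work()
    after = os.times()
    return after.user - before.user, after.system - before.system


def read_heavy(calls=200_000):
    """Issue many tiny unbuffered reads; each os.read() is one read system call."""
    fd = os.open("/dev/zero", os.O_RDONLY)  # assumption: Unix-like system
    try:
        for _ in range(calls):
            os.read(fd, 1)
    finally:
        os.close(fd)


if __name__ == "__main__":
    user, system = cpu_split(read_heavy)
    print(f"user {user:.2f}s  system {system:.2f}s")
```

The system mode figure here is charged to the ordinary user process running the loop, not to any system process.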
The sy column under the intr heading in vmstat shows the number of system calls per second over the interval. If all system calls used exactly the same amount of CPU time, then there would be a perfect correlation between that column and the system mode CPU usage column. However, the CPU usage of a system call depends heavily on which call it is, on the kernel workload at the time, and on various features of the user environment.
If system mode CPU usage seems out of proportion with user mode CPU usage, you need to narrow down which factor is to blame for the difference. It can be an I/O intensive activity, such as backups, using a lot of system mode CPU time for read and write calls. It can be contention for a kernel data structure, such as the semaphore data structures, because of heavy concurrent load; this would result in each semop() call using more CPU time than normal. Or it can be that fork() calls are taking a long time because the parent process has a badly fragmented memory map, due to a history of poor memory management within the process.
I suppose what I am saying is that this is a very difficult thing to track down and fix. Firstly, you need to have a good idea of the "normal" user to system CPU usage ratio for your system under a particular load. If you then notice a variation, your first step should be to see whether it can be accounted for by just one or a few processes. If not, you can try to trace which type of system call is responsible. Under HP-UX, the kernel is instrumented to allow this. I'm not sure whether it is possible under Digital Unix, or how you would do it. If you can nail it down to a particular process, you may then be able to profile that process and see where it is burning its CPU time. |
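As a rough illustration of comparing the two sy columns, the Python sketch below takes the system calls per second and system mode CPU percentage from the snapshot in the question and computes the system CPU cost per thousand calls for each interval. The first row is left out because its counters are on a very different scale, which suggests cumulative rather than per-interval figures. The function names and the 1.25 threshold are assumptions made for the example, not an established rule.

```python
from statistics import median

# (system calls per second in thousands, system mode CPU %) for the eight
# snapshot rows after the first, copied from the intr sy and cpu sy columns.
SAMPLES = [(27, 22), (31, 26), (35, 42), (40, 28),
           (27, 27), (23, 24), (29, 28), (26, 19)]


def cost_per_kcall(samples):
    """System mode CPU % consumed per thousand system calls in each interval."""
    return [sys_pct / kcalls for kcalls, sys_pct in samples]


def outliers(samples, factor=1.25):
    """Indexes of intervals whose per-call cost exceeds factor times the median."""
    costs = cost_per_kcall(samples)
    limit = factor * median(costs)
    return [i for i, cost in enumerate(costs) if cost > limit]


if __name__ == "__main__":
    for (kcalls, sys_pct), cost in zip(SAMPLES, cost_per_kcall(SAMPLES)):
        print(f"{kcalls:3d}K calls/s  {sys_pct:3d}% sys  {cost:.2f}% per K calls")
    print("intervals worth a closer look:", outliers(SAMPLES))
```

With these figures only the interval with 35K calls per second and 42% system time stands out, which is the kind of variation that would justify looking for a single responsible process.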
|
| Process binding | 15 October 1999 |
| I'm going to be involved with the implementation of a batch processing system on a four or six way HP-UX 10.20 box using Oracle 7.3.4. The system consists of three processes that receive files and translate them to a different format. The translated files are then loaded into the database by three daemon-like processes, and the data is then validated and manipulated. In order to minimize the context switching overhead, we are going to bind the translate processes to one or more processors. I have read in Oracle8 & Unix Performance Tuning that you should bind the background Oracle server processes, with the exception of DBWR and LGWR. Do you concur with this? | ||
|
If I understand you correctly, the translate process is C code that you are developing,
and you are intending to use mpctl() to do the binding?
If so, you should check with HP, because last time I talked with them about this,
I was told that it was undocumented and unsupported - that is, not for customers to use.
I disagree strongly with the idea of binding an Oracle process that is not also in the real-time priority class. However, I have seen dramatic improvements in performance in some cases from making the key Oracle background processes real-time without binding. Foreground Oracle processes should not be bound or made real-time, unless the entire instance is real-time. |
|
| Process binding | 5 November 1999 |
|
Could you elaborate on your comments in your 15 October answer on process binding?
I am interested in the part where you said,
"I have seen dramatic improvements in performance in some cases from making the key Oracle background processes real-time".
I thought Oracle always recommends leaving all priorities at default. Which background processes have given rise to the observed improvement? Why does it work? Do you invoke it via HP's rtprio command? | ||
| I have done this with LGWR and DBWR in most cases. In fact, I did it just last week on a SAP site still running 7.2.3 on HP. Yes, using rtprio. The rationale is explained in my book. Basically, it reduces the IPC latencies significantly. |
|
| log file sync waits | 24 November 1999 |
| We are benchmarking an insert-intensive application, using Oracle 8.0.5 on HP-UX. We would like to get every ounce of performance possible. Presently our big issue is log file sync waits. What can be done about this? A full report.txt is attached. | ||
| Firstly, try to limit commits to an absolute minimum in the application. Commit only where it is essential for the application logic. Also, consider running LGWR as a real-time priority process. Use rtprio with a priority of 60. If possible, use hardware mirroring for the online log files, rather than Oracle log file multiplexing. Also, check to make sure that the online log files are raw and on dedicated disks. Use our hold_logs_open.sh script to improve log switch performance. There are a lot of other issues here, but that should keep you busy for a few days. Note that the rtprio change is very important. |
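The effect of the first suggestion on the number of commits can be sketched in a few lines of Python. This example uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; it is not Oracle and does not model LGWR or log file sync waits. The table, the function names and the batch size of 100 are all hypothetical.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with one table (stand-in only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn


def load_rows(conn, rows, commit_every):
    """Insert rows, committing once per commit_every rows; return the commit count."""
    commits = 0
    pending = 0
    cur = conn.cursor()
    for row in rows:
        cur.execute("INSERT INTO t (id, payload) VALUES (?, ?)", row)
        pending += 1
        if pending == commit_every:
            conn.commit()
            commits += 1
            pending = 0
    if pending:  # commit whatever is left over from the last partial batch
        conn.commit()
        commits += 1
    return commits


if __name__ == "__main__":
    rows = [(i, "x" * 20) for i in range(1000)]
    print("commit per row:   ", load_rows(make_db(), rows, 1), "commits")
    print("commit per 100:   ", load_rows(make_db(), rows, 100), "commits")
```

Each commit a session issues is an occasion on which it may have to wait for the log writer, so cutting the commit count in the application reduces the number of such waits. Whether the application logic permits larger transactions is a separate question.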
|
| Single-task export | 26 January 2000 |
| I remember reading somewhere that one can speed up exports by running them in single-task mode. Does it work, and if so, why? | ||
| In normal Oracle connections there are two processes: the client process which has no special permissions, and the shadow or server process which runs as the Oracle owner and has operating system permissions to attach to and modify the SGA, read and write the data files and so on. If the two processes are running on the same box, communication between these two processes is bequeathed by SQL*Net to the operating system inter-process communication (IPC) facilities. However, IPC involves scheduling latencies. While one process is working, the other is waiting for it, and vice versa. There is always a delay between one process passing the baton to the other, and that process being scheduled to run on a CPU. Single-task merges the functions into a single process, thereby eliminating the IPC latency. People have benchmarked between 5% and 15% savings on elapsed time when using single-task export. |
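The baton-passing delay described above can be observed with a small Python sketch, assuming a Unix-like system with fork() and pipes. It performs the same trivial request either as a direct function call within one process or as a request and reply exchanged with a second process over pipes. The names serve, single_task and two_task are illustrative, and the code says nothing about how Oracle itself implements either mode.

```python
import os
import time


def serve(x):
    """Stand-in for the work the server side does on each request."""
    return x + 1


def single_task(n):
    """Client and server logic in one process: a plain function call per request."""
    return sum(serve(i) for i in range(n))


def two_task(n):
    """Client and server in separate processes, exchanging each request over pipes."""
    req_r, req_w = os.pipe()
    rep_r, rep_w = os.pipe()
    pid = os.fork()  # assumption: Unix-like system
    if pid == 0:
        # Child plays the server: read a request, reply, repeat until EOF.
        try:
            os.close(req_w)
            os.close(rep_r)
            while True:
                data = os.read(req_r, 8)
                if not data:
                    break
                reply = serve(int.from_bytes(data, "little"))
                os.write(rep_w, reply.to_bytes(8, "little"))
        finally:
            os._exit(0)
    # Parent plays the client: it must wait for each reply before continuing.
    os.close(req_r)
    os.close(rep_w)
    total = 0
    for i in range(n):
        os.write(req_w, i.to_bytes(8, "little"))
        total += int.from_bytes(os.read(rep_r, 8), "little")
    os.close(req_w)
    os.close(rep_r)
    os.waitpid(pid, 0)
    return total


def timed(fn, n):
    """Return (result, elapsed_seconds) for fn(n)."""
    start = time.perf_counter()
    result = fn(n)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (single_task, two_task):
        result, elapsed = timed(fn, 20_000)
        print(f"{fn.__name__}: result={result} elapsed={elapsed:.3f}s")
```

Because the per-request work here is trivial, the gap between the two timings is dominated by the hand-off between processes and is far larger in relative terms than anything to expect from a real export, where most of the elapsed time is spent doing actual work.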
|
| 64-bit Oracle | 1 February 2000 |
| We are on HP-9000 V series boxes, running HP-UX 11.0. All production databases are on 7.3.4, 32-bit version. Would there be any major performance improvement in going for the 64-bit version of Oracle? | ||
| It would be worth going 64-bit if you want a VLM Oracle buffer cache to reduce disk I/O. Other than that, I would expect the impact to be minimal but positive. The reason for the difference would be that Oracle does all its expression evaluation using longs to maximize precision and then casts the result back to an integer. Long addition and multiplication are much faster in a 64-bit executable, although division is a little slower. On the other hand, the executable itself would be larger, and that might reduce TLB hits. |
|
| ora_kstat | 14 February 2000 |
One of our developers has just brought to our attention a line in the system startup file, inittab, that I have no clue about.
Does this look familiar?
orakstat:2:wait:/etc/loadext -l /etc/ora_kstat | ||
| Yes, this is associated with using the post-wait driver under AIX. |
|
| Tuning CPU usage | 16 February 2000 |
| I'm trying to tune a system that is CPU bound. In terms of the overhead of AIX having to create shadow processes, is there anything to be gained from moving from the conventional two task architecture to MTS? Also, do you know of any web sites that have decent Unix performance white papers, in particular AIX tuning material? | ||
| MTS will use more CPU, not less. In an Oracle environment, the relative cost of process creation is insignificant. Your three main strategies for reducing CPU usage should be to reduce physical I/O, buffer gets and parsing. For AIX tuning information, have a look at the chapter on Monitoring and Tuning CPU Use in the AIX Performance Tuning Guide. |
|
| Real-time LGWR | 18 February 2000 |
| You suggested that we make LGWR a real-time priority process to speed up our data loads. HP-UX has priority levels 0 (highest) to 127 (lowest). Should we use the lowest? Also, a Unix guy here is concerned about this. | ||
| Yes, 127 is fine, just as long as it is in the real-time class so that there is no priority degradation. I understand your Unix guy being concerned. If there were a bug in LGWR, it could chew up all of one CPU. On a single CPU machine, "best practice" is to have a higher priority real time shell running on the console - just in case. On your multi-CPU machine that is not really necessary. |
|
| NT performance glitch | 16 March 2000 |
|
I'm doing a bit of work on a high performance web system that can only be bounced about once every 6 weeks.
It's Oracle 8.0.5 on NT 4.0.
There is an odd performance glitch.
The average log file sync time on lots of the sessions is about 1 second (which is the timeout time), but:
a) The average log file parallel write time is about 1/100 second
b) The redo allocation latch is not stressed
c) The average log write size is about 2K
When I try to stress the system, I don't get any trouble. Another interesting feature is that we are getting tens of thousands of buffer busy waits per day, also of about 1 second average. I can't think of any reason why buffer busy waits might be related to log file syncs, but the numbers are similar. Any thoughts? I get one chance to bounce the database next week, and it would be quite nice to know that the problem will disappear. | ||
| This might be an NT priority inversion problem. The priorities of runnable NT threads do not appreciate with time. Therefore, if the CPU is 100% busy, lower priority threads hardly get a look-in. By default the threads of the Oracle process under NT are in the variable priority class, which means that thread priorities are adjusted by the dispatcher according to its rules. This can mean that processes waiting to be posted don't see the post until their timeout expires. It can also mean that low priority processes holding a resource are unable to free that resource (until they get a brief random priority boost). The solution is to set ORACLE_PRIORITY in the registry, as documented in appendix C of the Getting Started manual. If this is your problem, you may want to consider using a low real-time priority for Oracle. |
| Copyright © Ixora Pty Ltd |
|