When we recently upgraded the system a ProIV-based application runs on from an 8-processor, 900 MHz Sun V880 to a Sun 4800 (8 processors, 1200 MHz), we started experiencing erratic behaviour: sudden spikes of high kernel mode (80-90%) on busy days. We are using ProIV 4.6r213, Solaris 8 and Oracle 9i.
As a result, our application supplier suggested we enable spinlocks, which are mentioned briefly in the ProIV documentation. Since then the kernel mode storm has appeared less frequently, but whenever it appears we have to restart the application for things to settle, which makes neither me nor the users happy.
So far I have not managed to reproduce the behaviour on any of our slower test systems. I am, however, able to force it on the production system by running a certain nightly batch job: it should occupy a total of one processor, but causes up to 90% kernel mode across the entire system. The application is simply unusable then.
If I copy the entire application (app + Oracle) to a 900 MHz machine, the same job consumes almost no kernel resources.
I have been collecting data using isview and the usual Unix tools, but haven't got a real clue yet.
Looking at /etc/isamdef and isview, I have determined that there are actually four spinlock parameters: SL_SPIN (default 100), SL_NAP (default 10), SL_TIMEOUT (default 50), SL_SLEEP (default 2).
Can anyone give me more detailed information about these than "set SL_SPIN=100, SL_NAP=10 on a multiprocessor system", which is all the documentation says?
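For anyone else puzzling over these, my working guess (the documentation doesn't spell out the semantics, and all names below except the SL_* parameters are mine, not ProIV's) is a classic three-phase spin/nap/sleep lock, roughly like this:

```c
#include <stdatomic.h>
#include <time.h>

/* Hypothetical reconstruction of a spin/nap/sleep lock in the style the
 * SL_* parameters suggest; this is NOT ProIV's actual code. */
typedef struct {
    atomic_int held;    /* 0 = free, 1 = held */
    int sl_spin;        /* busy-wait attempts before the first nap        */
    int sl_nap;         /* nap length between retries, in milliseconds    */
    int sl_timeout;     /* number of naps before giving up and sleeping   */
    int sl_sleep;       /* long-sleep length, in seconds                  */
} spinlock_t;

static void msleep(int ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Returns 1 once the lock is acquired. */
static int sl_lock(spinlock_t *l)
{
    for (;;) {
        /* Phase 1: pure busy-wait, SL_SPIN attempts. */
        for (int i = 0; i < l->sl_spin; i++) {
            int expected = 0;
            if (atomic_compare_exchange_weak(&l->held, &expected, 1))
                return 1;
        }
        /* Phase 2: short naps between retries, up to SL_TIMEOUT of them. */
        for (int n = 0; n < l->sl_timeout; n++) {
            int expected = 0;
            if (atomic_compare_exchange_weak(&l->held, &expected, 1))
                return 1;
            msleep(l->sl_nap);
        }
        /* Phase 3: back off hard for SL_SLEEP seconds, then start over. */
        msleep(l->sl_sleep * 1000);
    }
}

static void sl_unlock(spinlock_t *l)
{
    atomic_store(&l->held, 0);
}

/* A lock configured with the documented defaults. */
static spinlock_t demo_lock = { 0, 100, 10, 50, 2 };
```

If the real implementation is anything like this, a faster CPU burns through the SL_SPIN phase quicker, so more time proportionally goes into contended compare-and-swap traffic, which could explain the kernel-mode jump after the upgrade. Pure speculation on my part, of course.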
If I remove write permissions from the bootstrap files, I can make the inode-based locks go away, but experimenting with this (so far on the test system only) has not produced any noticeable difference compared to running with write permissions.
If one upgrades to a faster machine with the same number of processors, one should not suddenly experience increased contention, but we obviously do. Has anyone experienced something similar?
regards,
faz

ProIV Spinlocks
Started by faz, Dec 21 2003 08:48 PM
5 replies to this topic
#2
Posted 22 December 2003 - 08:28 AM
Hi,
Just a few questions...
1) When you upgraded the box, I guess you also upgraded or installed the O/S??
2) What sort of user level do you have logged on when you get the problem?
3) Do you have many ProISAM files?
4) If you do an isview, how many lines are in the output?
5) Is there anything in the file /etc/isamlog?
If this file does not exist, create it with rw-rw-rw permissions. Then shared memory problems will be logged into this file the next time.
6) Be careful with isview... isview 'halts' the system while it takes its snapshot, so if you do an 'isview | more' the system will freeze until you get to the end of the 'more'. This might have only been a problem with a certain release of 4.6, I can't remember.
Rob D.
#3
Posted 22 December 2003 - 10:53 AM
I know that probably the exact same problem has occurred with a large installation of a ProIV app running on Oracle.
As it appears in your case, only the bootstrap was in ProISAM but when they upgraded the hardware the performance fell off a cliff. I wasn't actually involved so I don't have any data but I'm pretty sure ProIV support resolved the problem very quickly and I'm almost sure it WAS a case of tuning the ProISAM spinlocks. Not sure if that hardware was Sun or HP.
Speculating generally, a fairly common problem with d-i-y mutexes/spinlocks is "priority inversion", where sensitivity to the "performance balance" causes processor time not to be allocated to the very processes that are blocking progress.
HTH.
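To make that inversion concrete: a waiter that never leaves the CPU can starve the process that actually holds the lock. One standard mitigation (a generic sketch of the technique, not anything from ProIV) is to cede the time slice between attempts:

```c
#include <sched.h>
#include <stdatomic.h>

/* Generic sketch: a busy-wait that yields between attempts, so the lock
 * holder can be scheduled even when every CPU has a spinning waiter. */
static int polite_lock(atomic_int *held, int max_yields)
{
    for (int i = 0; i < max_yields; i++) {
        int expected = 0;
        if (atomic_compare_exchange_weak(held, &expected, 1))
            return 1;        /* got the lock */
        sched_yield();       /* let the holder make progress */
    }
    return 0;                /* caller should fall back to sleeping */
}

static atomic_int demo_held = 0;
```

A loop that spins without yielding does the opposite: on a fast box the spinner monopolizes its CPU in tight kernel-crossing retries, which matches the symptom of a faster machine showing *more* contention.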
Edited by Richard Bassett, 22 December 2003 - 10:53 AM.
Nothing's as simple as you think
#4
Posted 22 December 2003 - 08:00 PM
Thanks for your replies.
Let me go into further details, without mentioning the application.
1) regarding the operating system:
We use Solaris EIS Standard, 5.8 Generic_108528-23 sun4u sparc SUNW,Sun-Fire on all machines.
All systems come from a common jumpstart installation source and were installed within the last few weeks.
2) User Level
If the problem appears at night (the only time I can do any testing anyway), I have hardly any user mode. If I shut down the application and leave Oracle up, the system is 99% idle. If the ProIV app is running, I see, for example, 2% idle, 8% user, 90% kernel, 0% iowait.
At that moment the application should only be running a simple batch job on one CPU.
3,4) I have about 120 .pro files; the file table in isview is 32 entries long. The output of isview is about 1000 lines long, that is, if I allow write access to the .pro files, which is normally the case.
5) I did not have an isamlog, but I have created one now. I have seen the shared memory segment extend once; I have since increased it.
6) isview does not seem to block on my system.
Ever since we enabled spinlocks, the probability of a single runaway task in the application slowing down the entire system seems to have increased. The application consists of over a hundred tasks, most of which I cannot restart separately.
I have browsed through the ProIV docs on the website to see if any of the later releases contains a bugfix related to this, but I could not find anything.
faz