We are testing ProIV on an IBM System p machine running AIX. Brief specs are:
JS22 blade
16GB memory
4x4GHz Power6 CPU
We are testing using a sales report that is run with a transparent login and timed until completion. On the initial tests (no load on the current system or the AIX), the real time to run was double that of our current system (~40 minutes compared to ~20).
The results for a test I just ran (normal daily load on Linux, no load on AIX):
Linux (Intel):
real 37m9.482s
user 8m34.950s
sys 3m44.920s
AIX System P:
real 45m26.88s
user 33m14.94s
sys 10m33.55s
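To put the two sets of `time` figures side by side, here is a minimal Python sketch that converts them to seconds and works out the ratios. The helper names (`to_seconds`, `cpu_seconds`, `compare`) are made up for illustration; the figures are copied from the results above.

```python
import re

# Figures copied verbatim from the two `time` outputs above.
RUNS = {
    "linux": {"real": "37m9.482s", "user": "8m34.950s", "sys": "3m44.920s"},
    "aix": {"real": "45m26.88s", "user": "33m14.94s", "sys": "10m33.55s"},
}


def to_seconds(stamp):
    """Convert a `time`-style value such as '37m9.482s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cpu_seconds(run):
    """Total CPU time (user + sys) for one run, in seconds."""
    return to_seconds(run["user"]) + to_seconds(run["sys"])


def compare(runs):
    """Return AIX/Linux ratios and the share of wall-clock time spent on CPU."""
    linux, aix = runs["linux"], runs["aix"]
    return {
        "user_ratio": to_seconds(aix["user"]) / to_seconds(linux["user"]),
        "sys_ratio": to_seconds(aix["sys"]) / to_seconds(linux["sys"]),
        "cpu_ratio": cpu_seconds(aix) / cpu_seconds(linux),
        "linux_cpu_share": cpu_seconds(linux) / to_seconds(linux["real"]),
        "aix_cpu_share": cpu_seconds(aix) / to_seconds(aix["real"]),
    }


if __name__ == "__main__":
    for name, value in compare(RUNS).items():
        print(f"{name}: {value:.2f}")
```

On these figures the AIX run uses roughly 3.9x the user CPU time and about 3.6x the combined user+sys time of the Linux run. CPU time accounts for nearly all of the AIX wall-clock time but only about a third of the Linux wall-clock time. Bear in mind the Linux box was under its normal daily load, so the real times are not directly comparable.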
There was very little kernel tuning done on our current Linux system (just kernel.shmmax = 2147483648), none on the AIX, and the isamdef settings are the same on both systems.
Spinlocks are turned on for both systems.
Accessing Pro-isam files only.
Are there any configuration or installation settings that can be changed to correct this? It looks like the AIX system should be much faster, but it is doing 4x more work (user CPU time of ~33 minutes against ~8.5).
We would like to purchase a new server, with speed speed speed, but due to an IBM SAN need to stay with IBM.
Any comments would be appreciated.
Archie

Performance on a IBM system P AIX
Started by Archie77, Feb 12 2009 03:00 PM
9 replies to this topic
#5
Posted 13 February 2009 - 09:49 AM
Hi Archie,
My suspicion would be you're right that the problem is likely to be something to do with ProISAM.
In the past I did observe the following on a high-powered Linux server with quad-core Intel chips:
(1) A single process could regen the entire 3500-function application in 90 seconds. (This of course involved reading practically every record from the Pro-ISAM bootstrap files).
(2) When we tried to bring up the application (launch 100+ background processes simultaneously) the system ground to a halt, was using very little CPU and took many minutes to be 'ready for input'. (This involved 100+ processes each concurrently reading a few functdef/genfile/vardef records in order to run their first functions - far less total ProISAM activity than case 1)
This system used ProISAM *only* for the ProIV bootstrap - the application data was all in Oracle.
We tried a bunch of different ProISAM configurations but we never pursued the problem to the bitter end because it didn't show up on the customer's machine or other test machines. We strongly suspected it was something to do with an interaction between ProISAM's semaphore processing and the configuration of that machine though.
You might want to look at this old thread in case you can engineer some way of doing a comparative test with read-only files:
http://www.proivrc.c...?showtopic=1211
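If it helps with setting up a comparative test, here is a rough Python sketch of a harness that times one copy of a command against several copies started together, reporting real/user/sys in the same spirit as `time`. It is only an illustration: the function name `time_concurrent` is invented, and the command shown is a stand-in that does nothing, so you would substitute whatever actually launches your report. It assumes a Unix-like system, where `os.times()` reports CPU time for child processes that have been waited on.

```python
import os
import subprocess
import sys
import time


def time_concurrent(argv, copies=1):
    """Start `copies` instances of argv together and time them as a group.

    Returns wall-clock time plus the user and sys CPU time consumed by the
    child processes, as reported by os.times() once they have been waited on.
    """
    before = os.times()
    start = time.perf_counter()
    children = [subprocess.Popen(argv) for _ in range(copies)]
    for child in children:
        child.wait()
    real = time.perf_counter() - start
    after = os.times()
    return {
        "real": real,
        "user": after.children_user - before.children_user,
        "sys": after.children_system - before.children_system,
    }


if __name__ == "__main__":
    # Stand-in command (an assumption): replace with the real job to test.
    command = [sys.executable, "-c", "pass"]
    for copies in (1, 4):
        result = time_concurrent(command, copies)
        print(copies, {key: round(value, 3) for key, value in result.items()})
```

If real time climbs steeply with the number of copies while user+sys stays low, that would point towards processes waiting on each other (locks or semaphores) rather than doing work. If CPU time itself is high even for a single copy, contention is a less likely explanation.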
Nothing's as simple as you think
#9
Posted 14 February 2009 - 02:41 PM
Archie,
Sorry, may not have been paying sufficient attention..
I think I just assumed you were having a ProISAM *concurrency* problem (wouldn't be the first) whereas in fact you are talking about testing the performance of a single-threaded job with no other processes 'competing' for the ProISAM files/records, right?
If so, then in principle the number of processors and number of cores doesn't really matter (stop me if I'm still misunderstanding..)
Obviously we don't know how much distinct data your test chews through, but if it's way more than the L2 cache available to one core on the chips (I'm guessing it is?) then that alone might explain the difference in some circumstances. I'm unfamiliar with Power6 for sure, but for example the Intel chips might have 12MB L2 cache shared between all cores whereas the IBM chips might have smaller caches with more 'core affinity'.
Even things like the level of code optimization applied to the compilation of the ProIV kernel itself can make a big difference (for example if the highest levels of optimization couldn't be used with one or other chip for some reason, or even that the most-optimal compiler couldn't be used).
I *have* seen stunning single-threaded CPU performance with ProIV on the Intel quad-core chips, see:
http://www.proivrc.c...?showtopic=3860
So, it is possible the Intel machine actually is faster *for this job* - it'd be interesting to know what makes you say "It looks like the IBM system should be much faster" in case there's some further insight there.
Of course it can be very hard to compare the performance of RISC and CISC chips (although, again, I'm not sure how 'RISCy' the Power chips really are nowadays).
Nothing's as simple as you think