Tracking high-usage customer processes

system · November 9, 2004, 4:25pm

Hello,

I’m looking for a free tool for a LAMP server to track down the processes of high-usage customers.

I mean, I’d like to know who on the server is consuming a lot of CPU/memory.

Some customers have bad PHP or CGI programs, and I’d like to know which customer and which process are the heaviest consumers.

For example, in NodeWorx I sometimes see CPU near 100 with an average between 50 and 70. But the load average is usually under 1 and sometimes peaks at up to 10.

When it goes up to 10, I’d like to know who (process and account) is doing that.

Do you know a tool for this? Or do I have to program my own tool?

Thanks for your help and advice.

system · November 15, 2004, 11:31am

I know that it is possible with kernel process accounting tools like accton / sa, etc.

But the problem is that a PHP script is owned by the php user, so you can’t know which SiteWorx account “eats” all your CPU/memory or load average.

It would be great to have this in NodeWorx: to be able to know who is consuming what and how.

(Ouarf, not sure this is English, sorry)
Pascal

IWorx-Chris · November 24, 2004, 3:26am

If the PHP script is running through Apache on the non-iworx part of your system, you will simply see that it’s the apache user who is causing the problems. If it’s a PHP script run by iworx-cp, it could be either the iworx user or root, as iworx-cp does run some procs as root for obvious reasons. Setting up process accounting will get you the general info you need, Pascal. If it’s iworx, then it may be the cron jobs, so at least you can narrow it down to the time of day the problems occur and work from there.
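
A minimal sketch of a first pass at this, assuming a Linux box where the text comes from `ps -eo user,pcpu,pmem,comm`. It sums the CPU and memory percentages per system user. The function name `usage_by_user` is made up for illustration, and PHP run through Apache will still show up under the apache user rather than under a SiteWorx account.

```python
from collections import defaultdict


def usage_by_user(ps_text):
    """Sum %CPU and %MEM per user from `ps -eo user,pcpu,pmem,comm` output."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for line in ps_text.splitlines()[1:]:  # the first line is the ps header
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        user, cpu, mem, _command = fields
        totals[user][0] += float(cpu)
        totals[user][1] += float(mem)
    # Heaviest CPU consumers first
    return sorted(
        ((user, cpu, mem) for user, (cpu, mem) in totals.items()),
        key=lambda row: row[1],
        reverse=True,
    )
```

Feeding it the captured output of the `ps` command above gives one (user, cpu, mem) row per account, which is enough to see whether apache, mysql, or iworx is on top at the moment of a load spike.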

Chris

system · November 24, 2004, 8:59am

Thanks Chris.

When I combine this answer with this one http://interworx.info/forums/showthread.php?p=1740#post1740, may I conclude that you are telling me you’ll turn on process accounting plus the perchild module?

If I understand correctly, to use it I have to compile Apache with the perchild MPM (not worker or prefork).

The problem with perchild is that it starts every virtual host in a separate instance, which wastes memory*, right?

Also, the fact is that you need to run at least 1 process for each vhost. With a “normal” MPM like worker or prefork, 5 servers are spawned by default; using the same number with the perchild MPM and hosting 2000 vhosts, you would have 10000 processes running just from starting the server. Furthermore, you need 2 * 2000 lines for user ID assignment.

* = from (http://php.mirrors.powertrip.co.za/manual/fr/security.apache.php)
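
The arithmetic in that quote can be checked with a short sketch. The function name and parameter names here are hypothetical; the figures (5 servers per vhost, 2 config lines per vhost, 2000 vhosts) are the ones from the quoted note, not measured values.

```python
def perchild_footprint(vhosts, servers_per_vhost=5, config_lines_per_vhost=2):
    """Rough process and config-line counts when every vhost gets its own servers."""
    return {
        "processes": vhosts * servers_per_vhost,
        "config_lines": vhosts * config_lines_per_vhost,
    }


print(perchild_footprint(2000))
# {'processes': 10000, 'config_lines': 4000}
```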

So I don’t really know which is best.

For PHP there is maybe another solution: “mod_fastcgi” with PHP compiled in “FastCGI” mode (in PLD you have the apache-mod_fastcgi and php-fcgi packages for that).

But I don’t know much about this last solution.

OK, all comments from interworx-cp users and staff about how to tweak, secure, and audit the performance of Apache, PHP, and MySQL are welcome!

Has anybody already set up perchild or FastCGI (advantages/drawbacks)?

Thanks
pascal

system · November 24, 2004, 9:35am

hello,

I don’t really see how to use it. As I understand it, I should create something like this:

<IfModule mod_userdir.c>
UserDir /home/*/domaine.com/html/
</IfModule>

But the problem is the domaine.com part, which is different from one user to another.

So would the solution be to create this in /etc/httpd/conf.d/domaine.conf,
specifying the domain name, and write this:

<IfModule mod_userdir.c>
UserDir disabled
UserDir enabled siteworx_account1
UserDir /home/siteworx_account1/domaine.com/html/
</IfModule>

Right?

Pascal

system · November 24, 2004, 1:44pm

Could an iworx staff member delete my post about Apache tweaks? It has become a duplicate of this other one.

Thanks

kipper3d · November 25, 2004, 5:00pm

This is one of my biggest gripes… When the loads go overboard, it’s hard to pinpoint who is the cause. I wish Linux had better tools for this than scouring through the logs. Most of the time I just restart the service that’s causing the load, but I can rarely figure out who is responsible or where I can prevent the problem from happening.

system · November 26, 2004, 8:14am

Well, I 100% agree.

For me, the processes that contribute most to the load average are mysqld/iworx-db and/or PHP.

I have a 3.00 GHz box with 2 GB of RAM.
My CPU never goes above 35% usage, and my swap is no longer used, but sometimes I have a load average of 5-7. When I look at top, I see that it occurs only with mysqld and PHP.
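
One thing that may explain a high load average with low CPU use: on Linux, the load average counts not only runnable processes but also those in uninterruptible sleep (state D, usually waiting on disk I/O). Here is a rough sketch for listing them, assuming the text comes from `ps -eo state,pid,user,comm`. The function name `blocked_processes` is hypothetical.

```python
def blocked_processes(ps_text):
    """Return (pid, user, command) for processes in state D,
    parsed from `ps -eo state,pid,user,comm` output."""
    blocked = []
    for line in ps_text.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        state, pid, user, command = fields
        if state.startswith("D"):  # D = uninterruptible sleep
            blocked.append((int(pid), user, command.strip()))
    return blocked
```

Running this a few times during a load spike would show whether mysqld or PHP is stuck waiting on the disk rather than burning CPU.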

One solution would be to read all the customers’ PHP logs, which is not realistic as I have more than 50 customers on this box.

Tweaking and enabling traces or process accounting for PHP/MySQL and Apache seems to be the only realistic solution.

Maybe some of us have more information about this. I’m OK to perform some tests as soon as I get a new box.

Pascal

system · November 27, 2004, 7:28am

Hello,

I’m very surprised.

My CPU never goes over 40% and I have 1 GB of free RAM (out of 2 GB, with 0 swap used), but as soon as there are:
1-2 mysqld +
1-2 mysql-safe +/or
1-2 iworx-web or iworx-db

my load average jumps up to 5 and the performance of my box becomes very bad.

About your compression/encryption tool (lol, I don’t remember the name just now, not Zend but the other one), do you reload it on every page load or only once?

Do you have any idea why, with only 6 processes running (1 mysql + 2 mysql-safe + 1 iworx-db + 1 iworx-web + 1 httpd), my server jumps to a load average of 5-7 and becomes very, very slow?

Thanks for your help/advice.

pascal

IWorx-Chris · November 27, 2004, 1:32pm

It could be a few things, Pascal, but from the sound of it, it’s possibly an I/O and/or disk problem. We’ve had a few boxes that showed huge increases in responsiveness after tweaking with hdparm.

Try a:


hdparm -Tt /dev/hda

Where /dev/hda is the first IDE disk (change to match your config if you’re running RAID or SCSI disks).

Also, I can peek around on your box as well if you want, just drop the root info in the secure root drop area: https://secure.interworx.info/iworx-cp/support/rootdrop.php

Chris

system · November 28, 2004, 3:55am

So funny, Chris!

Do you remember that two days ago I had to change my HDD because it was failing?

In fact I ran hdparm -i and then hdparm -m16, and after that I didn’t have any more problems…

Continuing with this, I tweaked my 2 HDDs:

hdparm -tT before my tweaks = 3.0

after my tweaks = 6

Here are my tweaks:

hdparm -m16 -c3

I can’t go further, as the box froze when I tried (for example with hdparm -X64 -u1 -d1 -c3 -m16).

Thanks for having a look, as for the past two days it has been very slow and I can’t figure out why.

Pascal

system · November 28, 2004, 4:33am

It’s really strange.

I’ve done a lot of tests and everything seems to be OK:

SQL, network traceroute, …

And even when my load average is under 1, web pages sometimes take 5-6 seconds to show (they didn’t before), and sometimes I even get a timeout (load average 0.4, CPU 30%)!

Thanks, Chris, for your help, I really appreciate it.
Pascal

system · November 28, 2004, 6:58am

Hello,

I think my problem may come from the HDDs.

Here is, for example, a vmstat output when everything is going well:


procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy wa id
 1  0   1016 100940 157248 1327664    0    0    27   249  235    77 16  7  3 73
 0  0   1016 100980 157248 1327680    0    0     0   137  243   153  1  1  0 98
 0  0   1016 100808 157248 1327684    0    0     0   101  230   163  7  0  0 93
 0  0   1016 100448 157248 1327688    0    0     0   139  260   160  3  1  0 96
 1  0   1016 100364 157248 1327696    0    0     0    48  189   103  6  0  0 94
 0  0   1016 100364 157248 1327704    0    0     0    58  180    97  4  0  0 96
 0  0   1016 100348 157248 1327708    0    0     0   162  232   155  4  1  0 95
 0  0   1016 100348 157248 1327712    0    0     0   119  184    78  0  0  0 100
 2  0   1016  99528 157248 1327720    0    0     0   125  215   181 11  1  0 88
 0  0   1016  99336 157248 1327740    0    0     0   107  160   122  2  1  0 96
 0  0   1016  99336 157248 1327740    0    0     0    59  143    46  0  0  0 100
 1  0   1016  99204 157256 1327744    0    0     1    98  195   267 19  0  0 81
 0  0   1016  98948 157256 1327780    0    0     6   290  245   165 10  1  0 88
 0  0   1016  98964 157256 1327800    0    0     1   149  218   125  5  1  0 94
 0  0   1016  98964 157256 1327808    0    0     0    94  177   103  5  1  0 94
 0  0   1016  98964 157256 1327812    0    0     0    96  158    52  0  0  0 99
 0  0   1016  98964 157256 1327816    0    0     0    57  157    66  1  0  0 99
 0  0   1016  98948 157256 1327816    0    0     0    18  144    70  0  0  0 100
 0  0   1016  98964 157256 1327856    0    0     7   163  173    85  3  1  0 96
 0  0   1016  98964 157256 1327860    0    0     0    39  116    39  1  0  0 99

And the same vmstat when I have performance problems (load average up to 5-7):


procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy wa id
 2  0   1016  61860 157260 1328472    0    0    27   249  235    79 16  7  3 73
 0  1   1016  63044 157260 1328480    0    0     0  1370  348   643 29 14 58  0
 0  1   1016  62040 157260 1328480    0    0     0   743  328   128  5  1 94  0
 2  0   1016  61976 157260 1328484    0    0     0   765  301   312  5  2 93  0
 1  0   1016  56592 157264 1328520    0    0     0   275  179  2354 59 41  0  0
 1  0   1016  51832 157264 1328568    0    0     0   448  231  2359 54 46  0  0
 0  0   1016  95924 157264 1328624    0    0     6   653  311   773 36 19  1 44
 0  0   1016  95924 157264 1328632    0    0     0   130  218   155  3  0  0 96
 0  0   1016  95924 157264 1328632    0    0     0   101  175   123  6  1  0 93
 0  0   1016  95924 157264 1328632    0    0     0    35  152    90  2  1  0 97
 0  0   1016  95924 157264 1328632    0    0     0    50  139    59  1  0  0 99
 0  0   1016  95924 157264 1328644    0    0     0   266  210    95  2  0 14 83
 1  0   1016  95924 157264 1328644    0    0     0   264  203   203 18  2 11 69
 0  0   1016  95924 157264 1328652    0    0     0   373  259    98  3  0 18 78
 0  0   1016  95688 157264 1328660    0    0     0    92  216   188  4  1  0 95
 0  0   1016  95688 157264 1328676    0    0     0   115  217   184  4  2  0 94
 0  0   1016  95688 157264 1328680    0    0     0    90  168    90  4  0  0 96
 0  0   1016  95688 157264 1328680    0    0     0    71  169    94  2  1  0 97
 0  0   1016  95680 157264 1328696    0    0     0    66  176   114  2  0  0 98
 1  0   1016  95688 157264 1328700    0    0     0    87  150    62  0  0  0 100

We can see that a high percentage of CPU time goes to I/O wait (the wa column reaches 58-94%, with a process blocked in the b column).
No swap activity (that’s fine).
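
To spot these spikes without eyeballing the columns, here is a small sketch that parses vmstat output in the 16-column layout shown above and picks out the samples where the wa (I/O wait) value is high. The 50% threshold and the function names are arbitrary choices for illustration.

```python
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy wa id".split()


def parse_vmstat(text):
    """Turn vmstat output into a list of dicts, one per sample line.
    Header lines and anything that is not 16 integers are skipped."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(VMSTAT_COLUMNS):
            continue
        if not all(field.isdigit() for field in fields):
            continue
        samples.append(dict(zip(VMSTAT_COLUMNS, map(int, fields))))
    return samples


def iowait_spikes(samples, threshold=50):
    """Return the samples whose I/O wait percentage is at or above threshold."""
    return [s for s in samples if s["wa"] >= threshold]
```

On the second capture above, this would flag the three samples with wa at 58, 94, and 93, which also show much higher bo (blocks written out) than the quiet samples.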

Pascal

IWorx-Chris · November 30, 2004, 5:10pm

Pascal, was this the drive that recently died, or your new box?

Chris