Apache crashes then will not restart

billmalarky · June 22, 2011, 1:39am

Recently my apache has been crashing every night and will not restart.

The error it throws is:

httpd not running, trying to start
(98)Address already in use: make_sock: could not bind to address [::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs

Some stale processes are preventing apache from restarting. I ran ps aux | grep httpd after one crash, and these are the stale processes:

apache 2072 0.2 0.0 389160 7784 ? S 00:30 0:30 /usr/sbin/httpd -DSSL
root 2627 0.0 0.0 84000 2832 ? Ss Jun21 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
apache 7844 0.0 0.0 389384 7960 ? S Jun21 0:30 /usr/sbin/httpd -DSSL
251 12061 0.0 0.0 84224 2892 ? S Jun21 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
251 12082 0.0 0.0 84224 2892 ? S Jun21 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
root 20109 0.0 0.0 61216 772 pts/0 S+ 03:34 0:00 grep httpd
apache 27437 0.0 0.0 389144 7808 ? S Jun21 0:30 /usr/sbin/httpd -DSSL

Rebooting the entire server allows apache to restart, but this means my server is down for about an hour a day since I have to wake up in the middle of the night to restart it.

Any ideas what’s going on here?

billmalarky · June 22, 2011, 2:01am

Update

Apparently whenever I do an apachectl restart or an apachectl stop then apachectl start everything starts up fine.

It only doesn’t restart when it crashes every night.

IWorx-Robert · June 22, 2011, 5:40am

That’s interesting - Is the timing of the crash consistent? Does it, say, fall over dead at 0334hr every morning?

billmalarky · June 22, 2011, 2:23pm

No, the timing is not consistent. It always happens between 12:00am and 7:00am though. If it crashes again tonight I will run a netstat -lnp to see what program is running on port 80… Any ideas Robert?

IWorx-Dan · June 22, 2011, 2:48pm

iworx-web is the InterWorx control panel webserver and runs on ports 2080 and 2443. It should not have any effect on the standard 80/443 apache webserver.

The only other times I’ve seen something like this, either A) there are still apache processes running that won’t die or B) the server’s been compromised and apache has been killed and a fake httpd process has been spawned in its place to allow an attacker to interface with the server.

I’d recommend running netstat next time this happens to see what process is on that port.

EDIT: Also look for mischievous cron jobs if it’s occurring consistently around the same time.
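For anyone following along, here is a rough sketch of pulling the port 80 owner out of netstat -lnp output. The function name port_owner is made up for illustration, and the sketch assumes the usual net-tools column layout (local address in column 4, state in column 6, PID/program in column 7).

```shell
#!/bin/sh
# port_owner PORT (hypothetical helper): read `netstat -lnp` output on stdin
# and print the "PID/Program name" column for every TCP socket that is
# listening on PORT.
port_owner() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            # Local address looks like ":::80" or "0.0.0.0:80";
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) print $7
        }'
}

# Typical use on the server (run as root so the PID column is filled in):
#   netstat -lnp | port_owner 80
```

If this prints something other than an httpd process, that is the thing to investigate first.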

billmalarky · June 22, 2011, 4:40pm

A) there are still apache processes running that won’t die

How would I check against this?

B) the server’s been compromised and apache has been killed and a fake httpd process has been spawned in its place to allow an attacker to interface with the server.

How would I go about figuring out if this is the case?

Also, how do I check the cron jobs? I’ll post the netstat results if it occurs again tonight.

IWorx-Dan · June 22, 2011, 5:08pm

Well I’d probably

strace -f -p [PID] 2>&1

the process after issuing the “service httpd stop” command to see what they were doing - if they were actually stuck or if there was a lot of activity.

I might also check if that /usr/sbin/httpd has been modified at all by using

rpm -V httpd

If nothing conclusive is found, try force-stopping the process with killall /usr/sbin/httpd. If that doesn’t work use the -9 switch to force-kill.

If you are killing the processes but new ones get respawned, there’s a pretty good chance something fishy is going on. The one instance I saw, the httpd process would be respawned very quickly after it was killed.
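To make the "are there still apache processes running" check concrete, here is a minimal sketch that filters ps aux output down to the real /usr/sbin/httpd processes, ignoring iworx-web and the grep itself. The helper name httpd_pids is hypothetical, and it assumes the standard ps aux layout where the command is the 11th column.

```shell
#!/bin/sh
# httpd_pids (hypothetical helper): read `ps aux` output on stdin and print
# "PID USER" for each process whose command is exactly /usr/sbin/httpd.
# iworx-web and the "grep httpd" line do not match, so they are skipped.
httpd_pids() {
    awk '$11 == "/usr/sbin/httpd" { print $2, $1 }'
}

# Typical use after "service httpd stop" has supposedly stopped everything:
#   ps aux | httpd_pids
# Any PID still listed can then be sent a normal kill first,
# and kill -9 only if it refuses to exit.
```

If the list is empty and port 80 is still taken, something other than apache is holding the port.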

Cron jobs can be checked by looking in the /etc/cron* folders for files that look out of place (i.e. no application you know of would need them). You can also check each user’s crontab with

crontab -l -u [user]

So I’d check root, apache, and iworx to see if any fishy cronjobs are there.

Apparently whenever I do an apachectl restart or an apachectl stop then apachectl start everything starts up fine.

It only doesn’t restart when it crashes every night.

Does this mean that when you issue a restart at night you can’t get it to come back online? Have you tried the other methods of starting/stopping/restarting? (service httpd restart or /etc/init.d/httpd restart)

billmalarky · June 22, 2011, 8:12pm

Regarding this command, what does it do? Should I run it after apache crashes? What exactly does it do, and what should I be looking for? I’m guessing you want me to put the PID of the stale processes in where you have [PID] written. What is the purpose of the 2>&1?

strace -f -p [PID] 2>&1

Regarding “rpm -V httpd”, I ran it and it crashed apache (similar to what happens at night I suppose because the same processes were stale).

Here is the output:

[root@server linux_workspace]# rpm -V httpd
S.5…T c /etc/httpd/conf/httpd.conf

After apache had crashed I ran the following for you to take a look at:

[root@server linux_workspace]# ps aux | grep httpd
root 2628 0.0 0.0 84000 2836 ? Ss 03:38 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
251 3700 0.0 0.0 85012 4212 ? S 03:40 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
apache 9905 0.0 0.0 389024 7728 ? S 11:26 0:30 /usr/sbin/httpd -k start
apache 16473 0.1 0.0 389044 7832 ? S 18:14 0:30 /usr/sbin/httpd -k start
apache 24828 0.0 0.0 389064 7688 ? S 13:43 0:30 /usr/sbin/httpd -k start
251 28269 0.0 0.0 84224 2892 ? S 22:16 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
root 29147 0.0 0.0 61216 776 pts/0 S+ 22:25 0:00 grep httpd

[root@server linux_workspace]# netstat -lnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:2306 0.0.0.0:* LISTEN 2626/iworx-db
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 2851/mysqld
tcp 0 0 127.0.0.1:3310 0.0.0.0:* LISTEN 2763/clamd
tcp 0 0 127.0.0.1:783 0.0.0.0:* LISTEN 2878/spamd -d -q -x
tcp 0 0 172.16.53.66:53 0.0.0.0:* LISTEN 3138/tcpserver
tcp 0 0 173.231.136.106:53 0.0.0.0:* LISTEN 3141/tcpserver
tcp 0 0 :::2080 :::* LISTEN 2628/iworx-web
tcp 0 0 :::993 :::* LISTEN 3149/tcpserver
tcp 0 0 :::995 :::* LISTEN 3143/tcpserver
tcp 0 0 :::2443 :::* LISTEN 2628/iworx-web
tcp 0 0 :::110 :::* LISTEN 3144/tcpserver
tcp 0 0 :::143 :::* LISTEN 3151/tcpserver
tcp 0 0 :::80 :::* LISTEN 9905/httpd
tcp 0 0 ::ffff:127.0.0.1:53 :::* LISTEN 3133/dnscache
tcp 0 0 :::21 :::* LISTEN 2935/proftpd: (acce
tcp 0 0 :::22 :::* LISTEN 2740/sshd
tcp 0 0 :::25 :::* LISTEN 3153/tcpserver
udp 0 0 172.16.53.66:123 0.0.0.0:* 2754/ntpd
udp 0 0 173.231.136.106:123 0.0.0.0:* 2754/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 2754/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 2754/ntpd
udp 0 0 ::ffff:127.0.0.1:53 :::* 3133/dnscache
udp 0 0 ::ffff:172.16.53.66:53 :::* 3134/tinydns
udp 0 0 ::ffff:173.231.136.106:53 :::* 3137/tinydns
udp 0 0 fe80::225:90ff:fe35:123 :::* 2754/ntpd
udp 0 0 fe80::225:90ff:fe35:123 :::* 2754/ntpd
udp 0 0 ::1:123 :::* 2754/ntpd
udp 0 0 :::123 :::* 2754/ntpd
Active UNIX domain sockets (only servers)
Proto RefCnt Flags Type State I-Node PID/Program name Path
unix 2 [ ACC ] STREAM LISTENING 6123 2851/mysqld /var/lib/mysql/mysql.sock
unix 2 [ ACC ] STREAM LISTENING 5339 2580/dbus-daemon /var/run/dbus/system_bus_socket
unix 2 [ ACC ] STREAM LISTENING 4966 2336/iscsid @ISCSIADM_ABSTRACT_NAMESPACE
unix 2 [ ACC ] STREAM LISTENING 6504 2993/hald @/var/run/hald/dbus-cejszPoYHl
unix 2 [ ACC ] STREAM LISTENING 5774 2688/acpid /var/run/acpid.socket
unix 2 [ ACC ] STREAM LISTENING 9006 3216/gam_server @/tmp/fam-root-
unix 2 [ ACC ] STREAM LISTENING 5521 2626/iworx-db /home/interworx/var/run/mysql.sock
unix 2 [ ACC ] STREAM LISTENING 4948 2329/brcm_iscsiuio @ISCSID_UIP_ABSTRACT_NAMESPACE
unix 2 [ ACC ] STREAM LISTENING 6503 2993/hald @/var/run/hald/dbus-UWzM77wHQn
unix 2 [ ACC ] STREAM LISTENING 6269 2891/gpm /dev/gpmctl
unix 2 [ ACC ] STREAM LISTENING 6440 2963/xfs /tmp/.font-unix/fs7100

After all this I was able to restart apache using “service httpd restart”. Here is the output of that:

[root@server linux_workspace]# service httpd restart
Stopping httpd: [FAILED]
Flushing IPC Semaphores [ OK ]
Starting httpd: [ OK ]
[root@server linux_workspace]#

Regarding cronjobs: I checked all the users (that I know about… unless the box has been compromised and another user created) and nothing seemed out of the ordinary.

Here are the contents of cron.daily:
0anacron 0logwatch freshclam logrotate makewhatis.cron mlocate.cron prelink rpm

cron.hourly just had the mcelog.cron.

Lastly, when you said the following:

Does this mean that when you issue a restart at night you can’t get it to come back online?

Yes, when it crashes at night I have to do a reboot to get it to come back online (I could probably kill the stale processes but rebooting is simpler for someone less technically skilled like I am.)

The main two questions are:

What the heck is crashing apache?
Why are these processes not stopping when apache crashes?

Thank you Dan & Robert, you have no idea how much your help is appreciated (actually, you probably do).

billmalarky · June 22, 2011, 8:16pm

Also, quick question, was “rpm -V httpd” supposed to crash apache?

IWorx-Dan · June 23, 2011, 2:53pm

Well, strace normally dumps its output to stderr (the “error” stream of unix), so the 2>&1 tells the shell to redirect stderr to stdout, where normal output is sent. Honestly, stracing will probably provide little insight unless you’ve dealt with that utility before. Yes, I’d use the utility on a stale process ID after a crash. Typically I just like to see if it’s spewing lots of stuff out (i.e. the stale processes are actually doing things) or if it’s just hanging on a poll( or read( call. You can alternatively replace the 2>&1 with 2> [filename] to redirect the output to a file for later review.
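Here is a small self-contained sketch of what 2>&1 changes. The noisy function is a made-up stand-in for strace (it just writes one line to each stream), so this can be tried safely without attaching to anything.

```shell
#!/bin/sh
# noisy: hypothetical stand-in for a tool such as strace that writes its
# trace to stderr rather than stdout.
noisy() {
    echo "normal output"        # written to stdout (file descriptor 1)
    echo "trace output" >&2     # written to stderr (file descriptor 2)
}

# Without 2>&1 the pipe only carries stdout, so grep never sees the trace line.
# (stderr is sent to /dev/null here only to keep the demo quiet.)
without=$( { noisy | grep -c trace; } 2>/dev/null || true )

# With 2>&1, stderr is merged into stdout before the pipe, so grep sees it.
with=$( noisy 2>&1 | grep -c trace )

echo "matching lines without 2>&1: $without"
echo "matching lines with 2>&1:    $with"
```

The same idea applies to the real command: strace -f -p [PID] 2>&1 | less lets you page through the trace, while strace -f -p [PID] 2> /tmp/httpd.trace saves it to a file (that path is only an example).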

rpm -V only checks to see if the files from a given package have been modified or changed in any way. -V is the ‘verify’ switch. According to the output you gave it doesn’t look like this is the case. I’d say you’re probably OK.
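As a rough guide to reading that output: rpm -V lists only files that fail verification, each with a string of flag characters, and a "c" after the flags marks the file as a configuration file. Below is a minimal sketch that spells the flags out. The function name explain_rpm_flags is made up, and the example string S.5....T is an assumption about what the flag column looked like before the forum reformatted it.

```shell
#!/bin/sh
# explain_rpm_flags FLAGS (hypothetical helper): print one line for each
# attribute that `rpm -V` reported as changed. A "." means that test passed.
explain_rpm_flags() {
    flags=$1
    while [ -n "$flags" ]; do
        rest=${flags#?}           # everything after the first character
        char=${flags%"$rest"}     # the first character itself
        case $char in
            S) echo "S: file size differs" ;;
            M) echo "M: mode (permissions or file type) differs" ;;
            5) echo "5: digest (MD5 checksum) differs" ;;
            D) echo "D: device major/minor number differs" ;;
            L) echo "L: symlink target differs" ;;
            U) echo "U: owner differs" ;;
            G) echo "G: group differs" ;;
            T) echo "T: modification time differs" ;;
        esac
        flags=$rest
    done
}

# Example with an assumed flag string for an edited config file:
explain_rpm_flags "S.5....T"
```

A hand-edited httpd.conf would be expected to show changed size, checksum, and timestamp. The thing to worry about would be /usr/sbin/httpd itself appearing in the list.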

Check /var/log/httpd/access_log and /var/log/httpd/error_log to see if there’s any output there at the end of the file that could indicate what the issue is. The best time would be immediately after a crash before the web server is restarted.

Also /var/log/messages might have some output if there’s segfaulting going on.

Lastly, is the only symptom of the crash that the content hosted on the server is inaccessible? Is the control panel still accessible?

billmalarky · June 23, 2011, 2:59pm

Daniel:

The server did not crash last night. The next time it does I will check and make a copy of /var/log/httpd/access_log, /var/log/httpd/error_log, and /var/log/messages, and report back to you.

I do not know if the control panel is still accessible. It never dawned on me to check that, so I will verify this as well next time it crashes.

Thanks.

billmalarky · June 23, 2011, 11:13pm

Dan, apache just crashed around 12:32am. I ran the diagnostics again this time and here is what was reported.

[root@server ~]# ps aux |grep httpd
apache 1090 0.0 0.0 389160 7852 ? S Jun23 0:30 /usr/sbin/httpd -DSSL
root 2628 0.0 0.0 84000 2836 ? Ss Jun22 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
apache 3172 0.0 0.0 389160 7808 ? S Jun23 0:30 /usr/sbin/httpd -DSSL
apache 4846 0.0 0.0 389136 7848 ? S Jun23 0:30 /usr/sbin/httpd -DSSL
root 31222 0.0 0.0 61216 776 pts/0 S+ 01:41 0:00 grep httpd
251 32590 0.0 0.0 84224 2892 ? S Jun23 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
251 32621 0.0 0.0 84224 2892 ? S Jun23 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
[root@server ~]#

[root@server ~]# netstat -lnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:2306 0.0.0.0:* LISTEN 2626/iworx-db
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 2851/mysqld
tcp 0 0 127.0.0.1:3310 0.0.0.0:* LISTEN 2763/clamd
tcp 0 0 127.0.0.1:783 0.0.0.0:* LISTEN 2878/spamd -d -q -x
tcp 0 0 172.16.53.66:53 0.0.0.0:* LISTEN 3138/tcpserver
tcp 0 0 173.231.136.106:53 0.0.0.0:* LISTEN 3141/tcpserver
tcp 0 0 :::2080 :::* LISTEN 2628/iworx-web
tcp 0 0 :::993 :::* LISTEN 3149/tcpserver
tcp 0 0 :::995 :::* LISTEN 3143/tcpserver
tcp 0 0 :::2443 :::* LISTEN 2628/iworx-web
tcp 0 0 :::110 :::* LISTEN 3144/tcpserver
tcp 0 0 :::143 :::* LISTEN 3151/tcpserver
tcp 0 0 :::80 :::* LISTEN 1090/httpd
tcp 0 0 ::ffff:127.0.0.1:53 :::* LISTEN 3133/dnscache
tcp 0 0 :::21 :::* LISTEN 2935/proftpd: (acce
tcp 0 0 :::22 :::* LISTEN 2740/sshd
tcp 0 0 :::25 :::* LISTEN 3153/tcpserver
tcp 0 0 :::443 :::* LISTEN 1090/httpd
udp 0 0 172.16.53.66:123 0.0.0.0:* 2754/ntpd
udp 0 0 173.231.136.106:123 0.0.0.0:* 2754/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 2754/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 2754/ntpd
udp 0 0 ::ffff:127.0.0.1:53 :::* 3133/dnscache
udp 0 0 ::ffff:172.16.53.66:53 :::* 3134/tinydns
udp 0 0 ::ffff:173.231.136.106:53 :::* 3137/tinydns
udp 0 0 fe80::225:90ff:fe35:123 :::* 2754/ntpd
udp 0 0 fe80::225:90ff:fe35:123 :::* 2754/ntpd
udp 0 0 ::1:123 :::* 2754/ntpd
udp 0 0 :::123 :::* 2754/ntpd
Active UNIX domain sockets (only servers)
Proto RefCnt Flags Type State I-Node PID/Program name Path
unix 2 [ ACC ] STREAM LISTENING 6123 2851/mysqld /var/lib/mysql/mysql.sock
unix 2 [ ACC ] STREAM LISTENING 5339 2580/dbus-daemon /var/run/dbus/system_bus_socket
unix 2 [ ACC ] STREAM LISTENING 4966 2336/iscsid @ISCSIADM_ABSTRACT_NAMESPACE
unix 2 [ ACC ] STREAM LISTENING 6504 2993/hald @/var/run/hald/dbus-cejszPoYHl
unix 2 [ ACC ] STREAM LISTENING 5774 2688/acpid /var/run/acpid.socket
unix 2 [ ACC ] STREAM LISTENING 9006 3216/gam_server @/tmp/fam-root-
unix 2 [ ACC ] STREAM LISTENING 5521 2626/iworx-db /home/interworx/var/run/mysql.sock
unix 2 [ ACC ] STREAM LISTENING 4948 2329/brcm_iscsiuio @ISCSID_UIP_ABSTRACT_NAMESPACE
unix 2 [ ACC ] STREAM LISTENING 6503 2993/hald @/var/run/hald/dbus-UWzM77wHQn
unix 2 [ ACC ] STREAM LISTENING 6269 2891/gpm /dev/gpmctl
unix 2 [ ACC ] STREAM LISTENING 6440 2963/xfs /tmp/.font-unix/fs7100
[root@server ~]#

I was able to go into https://exampledomain.com:2443/siteworx/ after the crash. Does interworx run separately from apache? I guess that would explain why these (in bold) processes are still running after apache crashes if so:

[root@server ~]# ps aux |grep httpd
apache 1090 0.0 0.0 389160 7852 ? S Jun23 0:30 /usr/sbin/httpd -DSSL
root 2628 0.0 0.0 84000 2836 ? Ss Jun22 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
apache 3172 0.0 0.0 389160 7808 ? S Jun23 0:30 /usr/sbin/httpd -DSSL
apache 4846 0.0 0.0 389136 7848 ? S Jun23 0:30 /usr/sbin/httpd -DSSL
root 31222 0.0 0.0 61216 776 pts/0 S+ 01:41 0:00 grep httpd
251 32590 0.0 0.0 84224 2892 ? S Jun23 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
251 32621 0.0 0.0 84224 2892 ? S Jun23 0:00 /home/interworx/bin/iworx-web -f /home/interworx/etc/httpd/httpd.conf -DSSL
[root@server ~]#

So how do I look more into the apache “1090”, “3172”, and “4846” processes? Is there a command to give me more information about them? According to netstat, it seems the 1090 process is the one sitting on port 80 right?

On to the logs, I’ll copy the results of tail here. Please let me know if you see anything fishy:

[root@server linux_workspace]# tail access_log
127.0.0.1 - - [24/Jun/2011:00:46:03 -0400] "GET /watch-flush HTTP/1.0" 200 3
127.0.0.1 - - [24/Jun/2011:00:51:04 -0400] "GET /watch-flush HTTP/1.0" 200 3
127.0.0.1 - - [24/Jun/2011:00:56:03 -0400] "GET /watch-flush HTTP/1.0" 200 3
127.0.0.1 - - [24/Jun/2011:01:01:02 -0400] "GET /watch-flush HTTP/1.0" 200 3
127.0.0.1 - - [24/Jun/2011:01:06:03 -0400] "GET /watch-flush HTTP/1.0" 200 3
127.0.0.1 - - [24/Jun/2011:01:11:02 -0400] "GET /watch-flush HTTP/1.0" 200 3
127.0.0.1 - - [24/Jun/2011:01:16:03 -0400] "GET /watch-flush HTTP/1.0" 200 3
127.0.0.1 - - [24/Jun/2011:01:21:03 -0400] "GET /watch-flush HTTP/1.0" 200 3
127.0.0.1 - - [24/Jun/2011:01:26:03 -0400] "GET /watch-flush HTTP/1.0" 200 3
127.0.0.1 - - [24/Jun/2011:01:31:03 -0400] "GET /watch-flush HTTP/1.0" 200 3

[root@server linux_workspace]# tail error_log
[Wed Jun 22 22:29:57 2011] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Wed Jun 22 22:29:57 2011] [warn] RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Wed Jun 22 22:29:57 2011] [warn] RSA server certificate CommonName (CN) `server.mydomain.com' does NOT match server name!?
[Wed Jun 22 22:29:58 2011] [warn] RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Wed Jun 22 22:29:58 2011] [warn] RSA server certificate CommonName (CN) `server.mydomain.com' does NOT match server name!?
[Wed Jun 22 22:29:58 2011] [notice] Apache/2.2.17 (Unix) DAV/2 PHP/5.3.3 mod_ssl/2.2.17 OpenSSL/0.9.8e-fips-rhel5 mod_watch/4.3 configured -- resuming normal operations
[Thu Jun 23 22:52:43 2011] [error] [client 81.218.234.8] File does not exist: /var/www/htdocs/admin
[Fri Jun 24 01:34:57 2011] [warn] child process 1090 still did not exit, sending a SIGTERM
[Fri Jun 24 01:34:57 2011] [warn] child process 4846 still did not exit, sending a SIGTERM
[Fri Jun 24 01:34:57 2011] [warn] child process 3172 still did not exit, sending a SIGTERM

[root@server linux_workspace]# tail messages
Jun 24 00:47:01 server clamd[2763]: SelfCheck: Database status OK.
Jun 24 00:57:01 server clamd[2763]: SelfCheck: Database status OK.
Jun 24 01:02:46 server proftpd[30733]: 127.0.1.1 (::ffff:88.74.204.155[::ffff:88.74.204.155]) - FTP session opened.
Jun 24 01:02:46 server proftpd[30733]: 127.0.1.1 (::ffff:88.74.204.155[::ffff:88.74.204.155]) - FTP session closed.
Jun 24 01:07:01 server clamd[2763]: SelfCheck: Database status OK.
Jun 24 01:17:01 server clamd[2763]: SelfCheck: Database status OK.
Jun 24 01:27:01 server clamd[2763]: SelfCheck: Database status OK.
Jun 24 01:37:01 server clamd[2763]: SelfCheck: Database status OK.
Jun 24 01:41:39 server clamd[2763]: Reading databases from /var/lib/clamav
Jun 24 01:41:42 server clamd[2763]: Database correctly reloaded (975772 signatures)

This seems interesting, my clamd is recently out of date. Do you think the crash is being caused by clamd trying to update itself or something now that it is out of date???

Thanks for your help Dan (& Robert too if you’re still there!).
Thanks,
William

billmalarky · June 23, 2011, 11:23pm

Also, I was able to restart the apache service again by using #service httpd restart.

Strangely, apachectl restart does not work, and throws the following error:

[root@server ~]# apachectl restart
httpd not running, trying to start
(98)Address already in use: make_sock: could not bind to address [::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs

billmalarky · June 24, 2011, 8:58am

Okay, so I was looking through the nodeworx settings and I saw that apache auto-restart was set to “no”. I updated this and set it to “yes”. Will this run a #service httpd restart or an #apachectl restart? I ask because for some reason the former works and the latter does not for me.

IWorx-Dan · June 24, 2011, 1:33pm

service httpd [start|stop|restart|status] is the method I use and, I believe, the one InterWorx uses to restart the web server. I believe all it does is call the /etc/init.d/httpd script. apachectl is probably not the preferred way to restart the server in the CentOS environment. It may be trying to spawn a new instance of apache before ensuring the existing processes are indeed dead. That’s why you are getting the error that :80 is already bound. If you open apachectl you can see it just calls /usr/sbin/httpd -k restart, which would probably not check for existing alive children.

I do not believe the crash is related to clamav.

I am rereading this thread to see whether you mentioned the symptom of the crash. What is the symptom? You can’t access any siteworx websites? How are you alerted that the crash has occurred?

2 things that might be helpful… output of

crontab -l -u iworx

and

/var/log/cron for Jun 24 between around 1:34 (give or take 15 minutes)

also if you just enabled auto-restart there is a good chance that InterWorx will now catch when the server goes down and bring it back up.
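For the /var/log/cron request, here is a minimal sketch for pulling out just the entries in a time window. The function name cron_window is made up, and it assumes the standard syslog-style lines (month, day, HH:MM:SS at the start of each line).

```shell
#!/bin/sh
# cron_window MONTH DAY FROM TO (hypothetical helper): read /var/log/cron
# lines on stdin and print those stamped MONTH DAY with a time between
# FROM and TO (both HH:MM:SS, inclusive).
cron_window() {
    awk -v mon="$1" -v day="$2" -v from="$3" -v to="$4" '
        # Zero-padded HH:MM:SS strings sort correctly as plain text,
        # so the time check is forced to be a string comparison.
        $1 == mon && $2 + 0 == day + 0 &&
        ($3 "") >= (from "") && ($3 "") <= (to "")'
}

# Typical use for the window around the 01:34 crash on Jun 24:
#   cron_window Jun 24 01:19:00 01:49:00 < /var/log/cron
```

Anything in that window that is not one of the regular iworx.pex or system jobs would be worth a closer look.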

billmalarky · June 27, 2011, 7:35am

Dan:

Well some good news, and some… neutral news? Bad news?

The good news is that the change I made in the interface seems to be bringing the site back up immediately after any crash, instead of apache just locking up and my website being down until I manually go in and restart it. The bad news is, I didn’t really fix the problem. But honestly I’m sleeping better at night now (literally, I was sleeping next to my laptop before and waking up several times per night to check the website…). Honestly it was very similar to this experience: Steve Huffman on lessons learned at reddit

How are you alerted that the crash has occurred?

When apache crashed, my website went down and would stay down until I SSH’d into the server and rebooted it (until you told me how to restart apache with service httpd restart). So that’s how I knew it was crashing.

Your requested feedback:

[root@server ~]# crontab -l -u iworx
SHELL=/bin/bash
PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/interworx/bin
MAILTO="root"
16,21,26,31,36,41,46,51,56,1,6,11 * * * * cd /home/interworx/cron ; ./iworx.pex --fively
38,53,8,23 * * * * cd /home/interworx/cron ; ./iworx.pex --fifteenly
47 * * * * cd /home/interworx/cron ; ./iworx.pex --hourly
40 10,16,22,4 * * * cd /home/interworx/cron ; ./iworx.pex --quad_daily
56 16 * * * cd /home/interworx/cron ; ./iworx.pex --daily
28 19 * * 0 cd /home/interworx/cron ; ./iworx.pex --weekly
39 7 5 * * cd /home/interworx/cron ; ./iworx.pex --monthly
0 7 * * * /home/interworx/bin/backup.pex --domain mydomain.com --backup-options all --email mydomain@gmail.com --quiet

Looking at /var/log/cron, it seems like the file is pretty incomplete. Perhaps some setting is screwy?

[root@server log]# less cron
Jun 26 04:05:01 server crond[20120]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:06:01 server crond[20559]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:08:01 server crond[21698]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fifteenly)
Jun 26 04:10:01 server crond[22394]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:11:01 server crond[22893]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:15:01 server crond[24810]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:16:01 server crond[25400]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:20:01 server crond[27262]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:21:01 server crond[27783]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:22:01 server crond[28512]: (root) CMD (run-parts /etc/cron.weekly)
Jun 26 04:22:01 server anacron[28516]: Updated timestamp for job `cron.weekly' to 2011-06-26
Jun 26 04:23:01 server crond[9421]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fifteenly)
Jun 26 04:25:01 server crond[10304]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:26:01 server crond[10904]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:30:01 server crond[13122]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:31:01 server crond[13713]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:35:01 server crond[16187]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:36:01 server crond[16821]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:38:01 server crond[18177]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fifteenly)
Jun 26 04:40:01 server crond[19199]: (root) CMD (/home/vpopmail/bin/clearopensmtp > /dev/null 2>&1)
Jun 26 04:40:01 server crond[19200]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:40:01 server crond[19206]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --quad_daily)
Jun 26 04:41:01 server crond[19862]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:45:01 server crond[21975]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:46:01 server crond[22539]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:47:01 server crond[23242]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --hourly)
Jun 26 04:50:01 server crond[24322]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:51:01 server crond[24791]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 04:53:01 server crond[25897]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fifteenly)
Jun 26 04:55:01 server crond[26742]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 04:56:01 server crond[27324]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 05:00:01 server crond[29446]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 05:01:01 server crond[30002]: (root) CMD (run-parts /etc/cron.hourly)
Jun 26 05:01:01 server crond[30003]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 05:05:01 server crond[31751]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 05:06:01 server crond[32192]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 05:08:01 server crond[760]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fifteenly)
Jun 26 05:10:01 server crond[1528]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 05:11:01 server crond[2048]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 05:15:01 server crond[4047]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 05:16:01 server crond[4576]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)
Jun 26 05:20:01 server crond[6380]: (root) CMD (/usr/local/sim/sim -q > /dev/null 2>&1)
Jun 26 05:21:01 server crond[6865]: (iworx) CMD (cd /home/interworx/cron ; ./iworx.pex --fively)

Poooh · October 7, 2011, 9:54am

Hi,

We seem to be having the same intermittent problems.

/tmp and /var/tmp are empty and nothing seems to be out of the normal.

How did this other case end?

Regards,

billmalarky · October 9, 2011, 11:35am

Poooh, Unfortunately I can’t really help you much. My problem seemed to have solved itself. As far as apache crashing and not restarting, I didn’t have the auto-restart function turned on in interworx.