Lack of inodes / Ghost files?

system · May 9, 2005, 7:49am

Hi there,

I have a little problem with my server: I have used 93% of my inodes, and 2 days ago it was 91%. Does that leave about 7 days before I start having problems on my server?

I found this out by running these commands:


[root@h10-10 root]# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1             77751492  47361156  26440760  65% /
none                    241252         0    241252   0% /dev/shm

[root@h10-10 root]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/hda1            9879552 9129378  750174   93% /
none                   60313       1   60312    1% /dev/shm

As I host only a few websites of my own, which don't get many visitors, I don't think it comes from the websites!
And by the way, all my websites together should not take more than 1GB, or at most 2GB, of space!
I think it comes from mail or system messages, which take up a lot of space!

So how can I fix it? I think both the disk space used and the inodes used are unusually high!

You should probably know that a few days ago I had 0% free inodes:
I had a mailbox containing 7GB of mail (1.5 million messages) because I hadn't checked it for just 1 month!!
This was the email address I use everywhere in InterWorx (server management, SiteWorx accounts (~50), etc…)
I fixed the problem by simply deleting the mailbox, deleting and recreating the SiteWorx account for that domain, and moving the MX records to another server.

Please help me with this one, as I can't launch and advertise websites until it's fixed.

Thank you very much in advance

system · May 9, 2005, 11:47am

First things first: configure and enable SpamAssassin. I too get thousands of spam messages a week.

Second, you need to find out where all of the disk space is being used other than email. Check the /tmp directory first and report back what you find.
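Since the symptom here is inode usage rather than disk space, counting entries per directory can narrow the search faster than looking at sizes. The following is a minimal sketch, not something posted in this thread; the function name `inodes_by_dir` is made up for illustration, and hard-linked files are counted once per name, so the totals are approximate.

```shell
# Count filesystem entries (files, directories, links) under each
# immediate subdirectory of the given path, largest count first.
# Every entry needs an inode, so the top lines show where they went.
inodes_by_dir() {
    top="${1:-/}"
    top="${top%/}"                 # avoid a double slash when top is "/"
    for d in "$top"/*/; do
        [ -d "$d" ] || continue    # skip the literal pattern if nothing matched
        # -xdev keeps find on the same filesystem as the directory
        n=$(find "$d" -xdev 2>/dev/null | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$n" "$d"
    done | sort -rn
}

# Example: inodes_by_dir /var | head
```

Running it on `/` and then again on the largest directory it reports walks down to the culprit step by step. On a directory holding millions of files the count itself can take a long time.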

system · May 9, 2005, 12:03pm

I already enabled SpamAssassin when I had my mail problems.
And here is /tmp:

[root@h10-10 /]# ls -g -a tmp
total 20
drwxrwxrwt 3 root 4096 May 9 21:05 .
drwxr-xr-x 23 root 4096 May 3 14:59 ..
drwxrwxrwt 2 root 4096 May 3 14:59 .ICE-unix
-rw------- 1 iworx 95 May 3 21:13 sess_6daa8b4dca38cc6ab596210acf5a4c5e
-rw------- 1 iworx 95 May 5 12:07 sess_a29ff8fb9723d4acbd75382a275a731d
[root@h10-10 /]# ls -g -a tmp/.ICE-unix
total 8
drwxrwxrwt 2 root 4096 May 3 14:59 .
drwxrwxrwt 3 root 4096 May 9 21:06 ..

system · May 9, 2005, 6:33pm

That looks normal.

Do you have logwatch enabled? If not you should. My next thought would be something causing your logs to grow huge. Is there anything unusual in the logs?

How is your server load?

system · May 9, 2005, 7:07pm

Have you checked your backups?

system · May 10, 2005, 4:58am

I don't know if there is anything unusual in the logs, as I have never looked at them… I'm not a professional, you know.
So I ran logwatch and here is the output file:

[root@h10-10 root]# logwatch --detail high --save /root/logwatchperso.log
[root@h10-10 root]# cat /root/logwatchperso.log

################### LogWatch 5.1 (02/03/04) ####################
Processing Initiated: Tue May 10 14:01:21 2005
Date Range Processed: yesterday
Detail Level of Output: 10
Logfiles for Host: h10-10
################################################################

--------------------- Cron Begin ------------------------

Commands Run:
User iworx:
cd /home/interworx/cron ; ./iworx.pex --daily: 1 Time(s)
cd /home/interworx/cron ; ./iworx.pex --fifteenly: 96 Time(s)
cd /home/interworx/cron ; ./iworx.pex --fively: 288 Time(s)
cd /home/interworx/cron ; ./iworx.pex --hourly: 24 Time(s)
cd /home/interworx/cron ; ./iworx.pex --quad_daily: 4 Time(s)
personal crontab listed: 2 Time(s)
User root:
/home/vpopmail/bin/clearopensmtp > /dev/null 2>&1: 24 Time(s)
/usr/bin/mrtg /etc/mrtg/mrtg.cfg: 288 Time(s)
run-parts /etc/cron.daily: 1 Time(s)
run-parts /etc/cron.hourly: 24 Time(s)

---------------------- Cron End -------------------------

--------------------- httpd Begin ------------------------

0.00 MB transfered in 289 responses (1xx 289, 2xx 0, 3xx 0, 4xx 0, 5xx 0)
0 Images (0 bytes),
0 Documents (0 bytes),
0 Archives (0 bytes),
0 Sound files (0 bytes),
0 Movies files (0 bytes),
0 Windows executable files (0 bytes),
0 Content pages (0 bytes),
0 Redirects (0 bytes),
0 Proxy Configuration Files (0 bytes),
0 Program source files (0 bytes),
0 CD Images (0 bytes),
289 Other (0 bytes)

A total of 1 unidentified ‘other’ records logged
with response code(s)

---------------------- httpd End -------------------------

--------------------- pam_unix Begin ------------------------

sshd:
Invalid Users:
Unknown Account: 9 Time(s)
Sessions Opened:
root: 5 Time(s)
Unknown Entries:
authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=secondavenue.plus.com : 6 Time(s)
authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=211.234.100.218 : 2 Time(s)
authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=host-58-in-132.etcbaltimore.net : 1 Time(s)

---------------------- pam_unix End -------------------------

--------------------- proftpd-messages Begin ------------------------

Unmatched Entries
h10-10 (217.44.95.110[217.44.95.110]) - FTP session idle timeout, disconnected.
h10-10 (217.44.95.110[217.44.95.110]) - FTP session idle timeout, disconnected.
h10-10 (217.44.95.110[217.44.95.110]) - FTP session idle timeout, disconnected.
h10-10 (217.44.95.110[217.44.95.110]) - FTP session idle timeout, disconnected.

---------------------- proftpd-messages End -------------------------

--------------------- Connections (secure-log) Begin ------------------------

New Users:
useradd (publidev)
useradd (miniblog)
useradd (dnsbz)
useradd (miniblog)

Deleted Users:
publidev
miniblog

New Groups:
useradd (publidev)
useradd (miniblog)
useradd (dnsbz)
useradd (miniblog)

Deleted Groups:
publidev
miniblog

---------------------- Connections (secure-log) End -------------------------

--------------------- sendmail Begin ------------------------

ERROR: Could not open /etc/mail/local-host-names

ERROR: Could not open /etc/mail/access

Message Size Distribution:
Range # Msgs KBytes
0 - 10k 0 0
10k - 20k 0 0
20k - 50k 0 0
50k - 100k 0 0
100k - 500k 0 0
500k - 1Mb 0 0
1Mb - 2Mb 0 0
2Mb - 5Mb 0 0
5Mb - 10Mb 0 0
10Mb+ 0 0

TOTAL 0 0

---------------------- sendmail End -------------------------

--------------------- SSHD Begin ------------------------

Didn’t receive an ident from these IPs:
211.234.100.218: 2 Time(s)
host-58-in-132.etcbaltimore.net (12.167.132.58): 2 Time(s)
secondavenue.plus.com (81.174.235.30): 2 Time(s)

Failed logins from these:
anonymous/password from ::ffff:81.174.235.30: 2 Time(s)
chuck/password from ::ffff:81.174.235.30: 1 Time(s)
darkman/password from ::ffff:81.174.235.30: 1 Time(s)
hostmaster/password from ::ffff:81.174.235.30: 1 Time(s)
passwd/password from ::ffff:81.174.235.30: 1 Time(s)
temp/password from ::ffff:12.167.132.58: 1 Time(s)
thomas/password from ::ffff:211.234.100.218: 2 Time(s)

Illegal users from these:
anonymous/none from ::ffff:81.174.235.30: 2 Time(s)
anonymous/password from ::ffff:81.174.235.30: 2 Time(s)
chuck/none from ::ffff:81.174.235.30: 1 Time(s)
chuck/password from ::ffff:81.174.235.30: 1 Time(s)
darkman/none from ::ffff:81.174.235.30: 1 Time(s)
darkman/password from ::ffff:81.174.235.30: 1 Time(s)
hostmaster/none from ::ffff:81.174.235.30: 1 Time(s)
hostmaster/password from ::ffff:81.174.235.30: 1 Time(s)
passwd/none from ::ffff:81.174.235.30: 1 Time(s)
passwd/password from ::ffff:81.174.235.30: 1 Time(s)
temp/none from ::ffff:12.167.132.58: 1 Time(s)
temp/password from ::ffff:12.167.132.58: 1 Time(s)
thomas/none from ::ffff:211.234.100.218: 2 Time(s)
thomas/password from ::ffff:211.234.100.218: 2 Time(s)

Users logging in through sshd:
root:
host217-44-95-110.range217-44.btcentralplus.com (217.44.95.110): 5 times

SFTP subsystem requests: 4 Time(s)

---------------------- SSHD End -------------------------

--------------------- vpopmail Begin ------------------------

No Such User Found:
bbastide - 1 Time(s)

---------------------- vpopmail End -------------------------

------------------ Disk Space --------------------

Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              75G   46G   25G  65% /
none                  236M     0  236M   0% /dev/shm

###################### LogWatch End #########################

And about my server load, here are the 3 graphs I got from NodeWorx:
CPU Utilization
Localhost - Load Average
Localhost Processes

Concerning the backups, I have never made any backup!!
Everything on my server is the same as on my computer!!

I took a look at all these stats and logs, and nothing seems wrong, does it?

Thanks very much for your help !!!

system · May 10, 2005, 5:11am

And by the way, 94% of inodes are now used, 6 days remaining

[root@h10-10 root]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/hda1            9879552 9190327  689225   94% /
none                   60313       1   60312    1% /dev/shm
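The remaining time can also be estimated from the raw IFree counts instead of the rounded percentages. This is only a sketch of a linear extrapolation: the helper name `inode_days_left` is hypothetical, and the 21 hours between the two readings is an assumption taken from the post times, not from the server clock.

```shell
# Estimate the days until the inodes run out from two `df -i` readings.
# Arguments: IFree at the first reading, IFree at the second reading,
#            hours elapsed between the two readings.
inode_days_left() {
    awk -v a="$1" -v b="$2" -v h="$3" 'BEGIN {
        used = a - b                      # inodes consumed between readings
        if (used <= 0) { print "n/a"; exit }
        printf "%.1f\n", b / (used / h) / 24
    }'
}

# IFree values from the two outputs in this thread; the 21 hours is assumed.
# inode_days_left 750174 689225 21
```

The estimate only holds if inodes keep being consumed at the same rate, which is unlikely when the cause is a mail queue that grows in bursts.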

system · May 10, 2005, 10:44am

eurfff :\

Your load average is really too high, and the dramatic increase in the number of processes is strange.

What does “top” give you?
Try looking at your log files around the date/time of the process/CPU/load-average peak, in /var/log/messages, dmesg, and others…

What is running around 22h30?

I don't know what this is

ERROR: Could not open /etc/mail/local-host-names

ERROR: Could not open /etc/mail/access

but I think you should check it

Also look at your /var/tmp dir

Do you have any databases? Are you sure one of your databases hasn't grown, maybe because of a loop problem that inserts a ton of records?

Pascal

Justec · May 10, 2005, 11:07am

Yeah, I have to agree with Pascal. The load average is pretty high. Not 100% on this but I have read the load average shouldn’t be higher than the number of processors you have in the system. So if you have 1 Processor anything over 1 for an extended period of time is telling you there is a problem or you need to upgrade to a more powerful server.
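That rule of thumb can be checked with a couple of lines of shell. A minimal sketch, assuming a Linux system with /proc mounted; the function name `load_status` is invented for this example.

```shell
# Compare a load average with the number of CPUs.
# Prints "high" when the load exceeds the CPU count, "ok" otherwise.
load_status() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (load + 0 > cpus + 0) print "high"; else print "ok"
    }'
}

# On Linux the 15-minute average is the third field of /proc/loadavg,
# and each CPU has a "processor" line in /proc/cpuinfo:
# load_status "$(cut -d' ' -f3 /proc/loadavg)" "$(grep -c '^processor' /proc/cpuinfo)"
```

A single "high" reading means little; what matters is the load staying above the CPU count for an extended period.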

[edit]
Re-reading your first post, you said you don't host that many websites, so something is going wrong here unless your few sites are heavy on database usage or get a ton of hits. I have a single CPU and my 15 min load average (averaged over a few months) is about 0.5. It also looks like your CPU is running at a constant 25%, which also raises a red flag on a lightly used system.

Have you tried running rkhunter (http://www.rootkit.nl)?

Here is a link to a sago forum about protecting your server:
http://sagonet.com/forums/showthread.php?t=2276&highlight=rkhunter

system · May 10, 2005, 4:29pm

Justin,

I ran rkhunter and during the tests everything was OK except for this:

Check: SSH
Searching for sshd_config…
Found /etc/ssh/sshd_config
Checking for allowed root login… Watch out Root login possible. Possible risk!
info:
Hint: See logfile for more information about this issue
Checking for allowed protocols… [ Warning (SSH v1 allowed) ]

And thank you for the link to the sago forums. I'm going to give all these tools a try; preventing is better than fixing.

system · May 10, 2005, 4:34pm

Pascal,

I looked in /var/log/messages and noticed these entries, plus some others:

May 8 21:10:02 h10-10 sshd(pam_unix)[7733]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=62.218.119.62 user=root
[…] user=mysql
[…] user=ftp
[…] user=apache
[…] user=nobody
[…] user=news
[…] user=games
[…] user=mail
[…] user=adm
[…] user=rpm
[…] user=operator
[…] user=mysql
May 8 21:15:51 h10-10 […] user=sshd

A few like this:

May 9 16:12:47 h10-10 httpd: httpd shutdown succeeded
May 9 16:12:49 h10-10 httpd: httpd: Could not determine the server’s fully qualified domain name, using 127.0.0.1 for ServerName
May 9 16:12:52 h10-10 httpd: httpd startup succeeded

Several like that:

May 9 12:02:55 h10-10 clamd[1915]: SelfCheck: Database modification detected. Forcing reload.
May 9 12:03:00 h10-10 clamd[1915]: Reading databases from /var/lib/clamav
May 9 12:03:02 h10-10 clamd[1915]: Database correctly reloaded (34306 viruses)
[…]
May 9 20:03:05 h10-10 clamd[1915]: Database correctly reloaded (34310 viruses)
[…]
May 10 00:03:04 h10-10 clamd[1915]: Database correctly reloaded (34341 viruses)
[…]
May 10 14:03:10 h10-10 clamd[1915]: Database correctly reloaded (34344 viruses)

I checked /var/lib/clamav and there are 2 files:
daily.cvd (140 KB)
main.cvd (2 MB)

Concerning the dmesg log, I don't understand it,
so here it is: http://www.publidev.com/iworx/dmesg

I also checked /var/log/httpd/access_log and saw this (only one attempt in more than 2 days):

218.93.145.239 - - [08/May/2005:16:21:36 +0200] “POST /_vti_bin/_vti_aut/fp30reg.dll HTTP/1.1” 404 306

I also checked /var/log/maillog and it looks interesting: its size is 8 MB (in ~2 full days), and there are errors at this frequency:

May 11 01:14:20 h10-10 maildrop[2893]: Unable to deliver to mailbox.
May 11 01:14:20 h10-10 maildrop[2932]: Unable to deliver to mailbox.
May 11 01:14:21 h10-10 maildrop[2941]: Unable to deliver to mailbox.
May 11 01:14:23 h10-10 maildrop[3007]: Unable to deliver to mailbox.
May 11 01:14:27 h10-10 maildrop[3142]: Unable to deliver to mailbox.
May 11 01:14:29 h10-10 maildrop[3230]: Unable to deliver to mailbox.
May 11 01:14:30 h10-10 maildrop[3246]: Unable to deliver to mailbox.
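To put a number on that frequency, the failures can be counted per minute. This is a rough sketch that assumes the syslog line format shown in the excerpt above (month, day, HH:MM:SS, host, `maildrop[pid]:`); the function name `maildrop_errors_per_minute` is made up for illustration.

```shell
# Count "Unable to deliver to mailbox" lines per minute in a syslog-style
# mail log. Fields 1-3 of each line are month, day and HH:MM:SS.
maildrop_errors_per_minute() {
    grep 'maildrop\[[0-9]*\]: Unable to deliver to mailbox' "$1" |
        awk '{ print $1, $2, substr($3, 1, 5) }' |   # keep only HH:MM
        sort | uniq -c
}

# Example: maildrop_errors_per_minute /var/log/maillog | tail
```

Each output line is a count followed by the month, day and minute, so a steady stream of failures shows up as similar counts minute after minute.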

I also found this log, /var/log/secure: http://www.publidev.com/iworx/secure

And I checked that all my databases were OK, and they are (anyway, I don't have many)

My humble conclusion (remember that I'm not a pro): the problem is coming from the mail, but how do I fix it???

Here are all the files you might want to see:
http://www.publidev.com/iworx/messages
http://www.publidev.com/iworx/dmesg
http://www.publidev.com/iworx/secure
http://www.publidev.com/iworx/rkhunter

I hope I did everything I had to do to help you help me!
Thanks a lot to all of those who helped and will help me with this #&!@ problem

Paul · May 10, 2005, 4:59pm

Hi Roman,

Does that e-mail box you mentioned in your first post contain a ton of messages again? Are the messages all spam, or something else?

Paul

system · May 10, 2005, 6:07pm

Hi,
No, because I did not recreate any mailbox for that domain on my server!!
(there are 0 mailboxes for that domain; I even deleted the postmaster@xxx which is set up by default)

And in my previous problem the messages were all coming from mailer-daemon@h10-10 (which is my server).
Here is an example:
http://www.publidev.com/iworx/samplemail.txt
(I didn't send spam, and I don't know any of the email addresses listed in this message)

Justec · May 11, 2005, 7:05am

Looking at your link, it seems like your mailbox is filling up with bounced bounce emails. Someone tries to send spam or something to a bad address on your server. Qmail then tries to send a message back to the sender saying that the message failed. Then that message comes back to you because the sender's address is bad as well.

First thing to try is to turn off bounce messages in SiteWorx. This way Qmail will just drop the bad email silently and will not try to send out a bunch of failure notices.

Another thing you can do to see if this is the problem is to look at the Qmail queue. I installed a program called qmHandle, which is a command line program for looking at the Qmail queue. You can do a qmHandle -l to see what's in the queue and also delete things in the queue.

system · May 11, 2005, 8:53am

Hi Justin,
I downloaded your program, but when I try to read the queue it lists nothing (I mean it looks frozen)

So I went into my directories and I think the folder which is causing the problem is this one:
/var/qmail/queue/todo

It's 95 MB, but each file contains only 1 line!

I can't say more as I can't list this directory… But I'm currently deleting its contents. By the way, how do I delete all the files in a directory from the command line? (because I'm doing it by SFTP)
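On the command-line question: with this many files a plain `rm *` usually fails with "Argument list too long", so `find` is the usual tool. Below is a minimal sketch, with `empty_dir` as a made-up helper name. It is a generic example rather than a qmail-specific procedure: removing files by hand from only part of a qmail queue can leave the queue inconsistent, so it is safer to stop qmail first or to use a queue tool such as qmHandle.

```shell
# Remove every regular file below a directory but keep the directories.
# find passes the names to rm in batches, so the number of files does
# not matter. On an old find that lacks "-exec ... {} +", end the
# command with "\;" instead of "+" (slower, one rm per file).
empty_dir() {
    dir="$1"
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    find "$dir" -xdev -type f -exec rm -f {} +
}

# Example: empty_dir /path/to/directory
```

The function deletes without asking, so it is worth running `find "$dir" -type f | head` first to confirm that the right directory is targeted.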

system · May 11, 2005, 11:47am

Hmm, it wasn't that directory, but I'm still investigating. I'll keep you informed!
Any idea how to turn off “bounce” for everything with qmail?
I did it for my main domains in InterWorx, but new files are still arriving every second in the “todo” directory.
And I have around 100 domains using this server at least for DNS, and around 50 SiteWorx accounts… I don't have the courage to go into each SiteWorx account to deactivate bounce and activate SpamAssassin

system · May 11, 2005, 1:35pm

I think I found it. I have been deleting the /var/qmail/queue/remote directory for a few hours, and so far I am down to 80% used inodes (gained 13%), and the deletion process is still not finished :eek:

But I still need to know how to disable the bounce feature so that no SiteWorx account can use it on my server…
And how do I turn on SpamAssassin on all my SiteWorx accounts at once?

Thanks to all for your help :rolleyes:

system · May 11, 2005, 2:38pm

5 minutes ago I did what Paul explained here:
http://interworx.info/forums/archive/index.php/t-388.html

Just for kicks, edit the /var/qmail/control/timeoutremote file, and change the number to 100. Then restart the smtp service with

service smtp restart

And so far I have 1 mail in /var/qmail/queue/bounce

By the way, I installed a few applications from the sago thread that Justec gave me: APF, BFD and LES, so now I should be more secure

Justec · May 11, 2005, 3:08pm

It may just be messages that were already in the queue. If you are unsure whether you have manually deleted the queue, you will have to wait up to 7 days to know if the bounces are truly off

system · May 11, 2005, 3:09pm

Ok,
So I'll wait and see if I have fewer bounce messages, a lower load average, etc…

Thank you so much for all your help !!! :rolleyes: :rolleyes:
