OK, here is the story with the server over the past few days. It's kind of technical.
Last week, because of all the VOOM stuff, we were seeing record server loads here at SatelliteGuys. When I am working I always keep an SSH window open with top running on the machine so I can see how it's doing.
When we were mega busy Apache was messing up big time. After a few hours it was eating mega memory, and the load would go up so high that it could crash the server. I would restart Apache before the levels got too high, though, and no one noticed that I was constantly restarting it, since a restart only causes a slight delay of a second or two.
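The by-hand routine described above (watch the load, restart Apache before it climbs too high) could be sketched as a small watchdog. This is only an illustration in Python, not what was actually run on the server: the threshold value, the function names, and the `apachectl restart` command are all assumptions.

```python
import os
import subprocess

# Hypothetical threshold; the right value depends on the machine.
LOAD_THRESHOLD = 8.0


def needs_restart(load_1min, threshold=LOAD_THRESHOLD):
    """Return True when the 1-minute load average is at or above the threshold."""
    return load_1min >= threshold


def check_once(dry_run=True):
    """Read the current load and restart Apache if it is too high.

    Returns True if a restart was (or would have been) triggered.
    """
    load_1min, _, _ = os.getloadavg()  # Unix only
    if not needs_restart(load_1min):
        return False
    if not dry_run:
        # Assumed restart command; the exact command varies by system.
        subprocess.run(["apachectl", "restart"], check=False)
    return True
```

Run from cron every minute or so, something like `check_once(dry_run=False)` would do the restarting so nobody has to sit in front of top. It only hides the symptom, of course; it does not fix whatever is eating the memory.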
I knew that I couldn't sit in front of the server restarting Apache all day and night, so I contacted vBulletin with an optimization request. I gave them all our server information, plus our MySQL and Apache config information, and let them work their magic.
They came back with some new MySQL and Apache settings to use, which I applied. However, this just made things worse. (eeek)
I contacted the vBulletin guys again, who then contacted me privately. They wanted to monitor the machine, which I let them do. They could see stuff like server load, cache hits, free memory, stuff like that.
After they monitored us for a while they came back with some more MySQL and Apache tweaks. I applied those and things got worse. I let them know and asked them again to look at the server. This was Friday night. They logged in to the server at 10pm and finished about 2 hours later.
When they were done everything was OK. The server was not any better, but it was OK. They monitored it through the night, and in the morning I got a note saying that it looked like it was time for a new server, as our IDE hard drives were causing the problems since we were serving over 125 database queries a second!
Without thinking twice I called LER (Larry), our server guru (aka Server Weenie), and asked him to help me pick out a new server. After about 2 hours on the phone, searching forums and looking at specs, we decided on our new server. I placed my order for the server there and then.
The board ran OK (just OK, not great and not bad) on Saturday. Then Saturday night / Sunday morning something went wrong, and the board went down at about 1:30am Eastern time. At midnight our server updates all its server and cPanel files, and after that's done the backup kicks in. I didn't realize anything was wrong with the server until I woke up and looked at all the messages on my pager (the server is set to page me when there are problems). By the time I woke up everything was back up and running on its own.
It ran OK all day Sunday, then Sunday night / Monday morning the server screwed up again. This time I had the pager next to my bed, so I woke up and was able to get us back online within a few moments. I stayed up until 4am working on the system trying to figure out what was wrong. I was happy that everything seemed normal and went to bed. Got up at 6, got ready for work, then went to work.
While at work I got 3 emails alerting me to database errors. I logged into the board and saw that everything was OK. I thought the errors were just some odd occurrence and went back to work.
10 minutes later my email box filled up with 200 database errors. Something was wrong. I took a look at the errors and saw that they all had the same MySQL error code, so I went to vBulletin and looked up the error code. That's when I saw what the problem was... Somehow (either by vBulletin while they were updating things on Friday night, or by cPanel in its attempt to keep all the software current) our MySQL had been updated to MySQL 4.1.8. vBulletin does not work well with MySQL 4.1.8; they tell you NOT to run vB with that version of MySQL.
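Spotting that a pile of alert emails all carry the same error code is easy to do by eye with 3 messages, less so with 200. A rough sketch of how that tally could be automated in Python follows. The line format matched by the regular expression is an assumption about what the alert emails look like, and the codes in the sample are made-up placeholders, not the code we actually saw.

```python
import re
from collections import Counter

# Assumed line format inside each alert email; real alerts may differ.
ERROR_LINE = re.compile(r"mysql error number:\s*(\d+)", re.IGNORECASE)


def count_error_codes(messages):
    """Count how often each MySQL error number appears across alert bodies."""
    counts = Counter()
    for body in messages:
        counts.update(ERROR_LINE.findall(body))
    return counts


# Made-up sample data with placeholder error numbers.
sample = [
    "Database error\nmysql error number: 1111\n",
    "Database error\nmysql error number: 1111\n",
    "Database error\nMySQL Error Number: 2222\n",
]

# Show the most frequent code and how many times it appeared.
print(count_error_codes(sample).most_common(1))
```

If one code accounts for nearly all of the messages, that is the one to look up first.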
I made a dump of our database just in case something happened in the rollback.
So I needed to shut down the board and revert us back to MySQL 4.0.23. This took about 15 minutes. After I was done I checked the board and all appeared normal, so I put everything back online.
After being online for about 20 minutes, I started getting some alerts of PHP errors, so I reinstalled PHP. After PHP was installed I also reinstalled our PHP accelerator. We were back online again and running well.
Almost 2 hours went by, and exactly at the top of the hour the MySQL errors started pouring in again. In checking vBulletin I learned I should have run the MySQL repair utility. A cron job went off at the top of the hour and somehow it corrupted a table in the database.
So I took down the board again to fix the corrupted database. I ran the database repair utility and it was going SLOW... And then it just locked up... I wasn't sure what was going on. I was in the cPanel WHM control panel screen and for some reason I clicked on our hard drive status and saw my problem: our /tmp directory was full. For some reason, when the server was set up, /tmp was only made to be 250 MB in size.
So I went into the /tmp directory and started deleting files. As I started deleting files the database utility started working again. Soon I had the /tmp directory almost empty, and the database repair utility was running and quickly finished. It found one corrupt table and fixed it. We were back online again.
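A full /tmp is the kind of thing that is easy to miss until something locks up, so it is worth checking before kicking off a big job like a database repair. A minimal sketch in Python, using the standard library's `shutil.disk_usage`, is below. The function names and the 90 percent limit are assumptions picked for illustration.

```python
import shutil


def usage_percent(path="/tmp"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path="/tmp", limit=90.0):
    """Return True when usage is at or above `limit` percent (assumed limit)."""
    return usage_percent(path) >= limit
```

Calling `is_nearly_full("/tmp")` before a repair run, or from a cron job that pages you, would flag the problem before the utility grinds to a halt.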
The system was still slow, but that was because vBulletin was trying to send me over 20,000 MySQL error messages (ugg). After those messages started getting cleared out the speed got better and better.
At 7pm I checked the server and all was fine. I was happy, but I knew I still had a problem: when we move over to the new server, the cPanel software automatically moves everything across, and with a 250 MB /tmp folder we would never be able to make the move.
So I contacted Larry (LER), who turned our 250 MB /tmp folder into a 10 gigabyte folder! Now we should have no problem automatically moving things to our new server!
In making the /tmp folder bigger he had to take the site down for about 45 seconds. However, now we have a 10 gig /tmp drive.
Our new server is almost ready. We got to log into it tonight. It still needs to be performance tuned and still needs 2 more gig of memory. Once it is finished we will start the moving process.
We need to have everything moved by Tuesday, so it's going to be a long, busy week, but when all is said and done we will be on a new, fast server and these server load issues caused by the IDE hard drives will be behind us.
Thanks for being SatelliteGuys!