site outage?

Status
Not open for further replies.
Sorry, I was moving the server to Cheyenne and this huge storm hit...

Just kidding..

I don't know what caused it; I just called Larry. It could be the Apache module we added the other night causing the issue.

The stupid paging service didn't page me about the problem until after I got the site back online. (grrr)
 
And I was at a PostgreSQL user group meeting. It seems like it's back. I really need to go over the
web server with a fine-tooth comb, and also upgrade it to FreeBSD 7.
 
ugh, site seems to be getting worse rather than better
yet another lengthy outage today
what needs to be done to keep the outages to a minimum rather than a maximum? (as it seems now)

i had to read threads at d**talk for a while and i know i'm not wanted over there
 
We had an issue with the database server where its drive array barfed and caused the database to be inaccessible. Because it crashed, we needed to run:
1) fsck on ALL filesystems
2) myisamchk -r on ALL database tables
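
For anyone curious what step 2 looks like across a lot of tables, here is a rough sketch, not the script we actually ran. It only builds the list of `myisamchk -r` commands, one per MyISAM index file (`.MYI`), and runs nothing. The function name and the data directory path are made up for illustration, and it assumes the MySQL server is stopped (or nothing else is using the tables) before any of the commands are executed. The fsck step is left out because it depends on the OS and filesystem layout.

```python
from pathlib import Path


def build_repair_commands(data_dir):
    """Return one `myisamchk -r` command (as an argument list) per .MYI file.

    data_dir is a hypothetical MySQL data directory; nothing is executed here.
    """
    commands = []
    # Each MyISAM table keeps its index in a .MYI file, and myisamchk
    # accepts the path to that file as the table to repair.
    for index_file in sorted(Path(data_dir).rglob("*.MYI")):
        commands.append(["myisamchk", "-r", str(index_file)])
    return commands
```

Calling `build_repair_commands("/path/to/mysql/data")` (placeholder path) gives a list you can print and review first; each entry could then be handed to `subprocess.run` one at a time, so a failure on one table is easy to spot.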

This takes a while with all the disk we have.

Plus, the RAID controller is rebuilding the array.

Basically, stuff happens.
 
This was a combo outage... (Hopefully I explain this one correctly)

Problem #1 was one network card stopped responding on the database server. The ISP needed to reboot the server.

After the reboot we ran a repair / check on our database to make sure everything was ok since the server wasn't gently rebooted.

After that was done we opened the site. However, there was an issue: the RAID drives were starting to rebuild themselves. With the site online and the server getting hammered all at once, the file system check and remirroring were running very, VERY slowly.

So we took the web server offline and also took down the database services so that the drives could catch up and rebuild the mirrors. Once we did that, the server rebuilt its mirrors within 15 minutes. :)

The outage on Saturday was planned downtime. (Although it did take longer than we thought due to a problem with the cPanel subsystem.)

All should be good now (knock on wood)
 
Yeah, I was going to point that out if you didn't. This is the first "oops" outage that I know of in a long time.
 
Kicking the server would have restarted the rebuilding of the mirror again. This is one time kicking it would not have helped. :)
 
A) The server is in Dallas and Scott's in Hartford, CT.
B) We did reboot it, but then needed to do due diligence on the database.
C) This takes time with the amount of disk we have.
 