Dec 4 2008, 03:57 PM
Post
#1
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
We have had many issues with Galahad this morning. A hard drive in the RAID5 array is failing, which is causing services to fail and putting high load on the box. The main service affected is mail.
We are looking at ways to resolve this ASAP, but we may have to take the server offline for 2 hours in order to get the mail service back up. I will keep this thread updated with any progress we make.
|
|
|
Dec 4 2008, 09:28 PM
Post
#2
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
We are rebooting the server and running an fsck. After the fsck is complete we will replace the bad drive and let the array rebuild. The rebuild will slow down services, but everything should be back up and functional once it is complete.
|
|
|
|
Dec 4 2008, 10:04 PM
Post
#3
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
The server is back up and responding. Everything should be back to normal. Expect high load for the next 3 to 6 hours while the RAID array rebuilds.
|
|
|
|
Dec 5 2008, 03:16 AM
Post
#4
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
We are taking a precautionary backup of all of the data on the server before we replace the drive that caused all of these issues. After this is complete we will replace the drive in the server.
|
|
|
|
Dec 5 2008, 06:45 AM
Post
#5
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
We will be taking the server down shortly to replace the drive. This downtime shouldn't take more than 15 minutes.
|
|
|
|
Dec 5 2008, 07:09 PM
Post
#6
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
The hard drive has been replaced and everything is back to normal. If you are still experiencing issues, please open a ticket or post in the forums!
|
|
|
|
Dec 6 2008, 05:06 AM
Post
#7
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
We have seen some strange behavior while rebuilding the array with the new drive. The /home directory is back in read-only mode and we are seeing stripe errors on the array.
We are taking another backup of the data, and we will be replacing the RAID card and all three drives. The downside is that we will need to reload the server. After the backup is done, we will need to take the server offline for 2.5 hours to replace the hardware and reload the OS. When the server comes back up we will load the backups and everything should be good to go.
I cannot apologize enough for the issues you have been experiencing over the last 48 hours. We are working around the clock to bring the server back to normal and give you the quality service you are used to receiving. If you have any questions, you can put in a ticket, post a question on the forum, or message me using MSN (info in my profile).
|
|
|
Dec 6 2008, 11:22 AM
Post
#8
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
The previous backup attempt failed halfway through. We have restarted the backup and are waiting for it to complete.
|
|
|
|
Dec 6 2008, 05:57 PM
Post
#9
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
The backup is almost complete.
To minimize downtime we have ordered a completely new server. Instead of taking the server down for maintenance, we will configure the new box and move everyone over to it. I will keep this thread updated with the progress.
|
|
|
Dec 8 2008, 03:26 PM
Post
#10
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
The backup has completed and we have started the transfer process to the new server. The transfer process is about 90% done. If you see any strange problems with your account, please submit a ticket or post in the forums.
|
|
|
|
Dec 9 2008, 03:52 PM
Post
#11
|
|
|
Administrator Group: Administrators Posts: 485 Joined: 25-January 06 From: Colorado Member No.: 1 |
Everyone has been moved to the new server (server1). It is identical in specs except that the RAID card has been upgraded to a 5405 (the main difference is 512 MB of onboard memory instead of 256 MB).
I can't apologize enough for the issues this may have caused you. If you have any questions or problems, don't hesitate to contact me.
|
|
|