So last night (Sunday) I went by our datacenter to install two BBUs (battery backup units) in two of our mail storage servers. BBUs are nice because when the power goes out, you don’t lose the data that’s sitting in your write cache. Anyway, we have 3ware 9500 RAID cards.
So, I installed the BBUs, which was kind of a tight fit with the LED monitoring cables (that’s a whole different story), and came home. Around midnight I started the BBU test, which disables the write cache on the card while it drains the battery. Once that’s done, it recharges the battery and then re-enables write caching.
Around 6am I woke up, as usual, and checked the status on both machines. Both were working fine and still ‘testing’, so I went back to bed. Around 9am I got a text message from a co-worker saying mail was down. Sure enough, ‘ubrique’, one of the servers, was dead in the water. The console showed some SCSI (SATA RAID) errors. I rebooted it remotely, but the machine did not come back up.. nada..
Driving over was a pain. But I ended up swapping cards, finding out along the way that FreeBSD 5.4 didn’t have the 3ware 9550 drivers working right out of the box yet, and brought down ‘guadix’, another of our servers, to get the storage server back up. Anyway, I’m overnighting a new 9500 card from ASA. Hopefully it won’t happen like this again. I’m looking at getting a spare 9500 card to keep in cold storage, as well as a diskless backup server so I could swap drives in a pinch.
Here’s to a fun morning.




