Quote:
Originally Posted by Terra
We are bootstrapping an entire data center from a full cold outage; however, this time around the new cold bootstrap orchestration has helped quite a bit in smoothing out the proper startup ordering.
We currently have almost all services restored except for the following:
MX Load Balancers (causing email to be offline)
MYSQL07
MYSQL08
MYSQL10
These servers had their power supplies blown out by the surge, and we are working on a solution. The main issue is that the tech supply chain has been slow as molasses, with parts out of stock or on backorder, and the spare power supplies are not in yet.
The MX Load Balancers (MXLB) are what is causing email to be offline, as they front and balance all backend email traffic to the POP Toasters. We are currently working on a virtualized workaround so that I can easily shift this functionality in case of a primary failure. Because it is such a critical piece of backend infrastructure, we've been hesitant to virtualize it while the required underlying CNI networking component was still in beta. I have taken another look, and it appears the required component is now ready for prime time. We are currently making the necessary changes to the core switches and running preliminary tests to see if it will work reliably. Once it is deemed reliable, the new images will be built and wired into the network.
Our sincerest apologies for the outage; we are working as fast (hair on fire) as humanly possible to resolve the remaining issues. Thankfully, full cold outages are incredibly rare, but when one does rear its ugly head, it usually releases a handful of unforeseen/unpredictable gremlins that don't want to be caught, and we have to capture them on a case-by-case basis.
|
Fascinating.
WHEN CAN WE EXPECT EMAIL BACK?
(This is what people want to know.)
A reasonable estimate? A wild guess? One of each? ANYTHING?!