We replaced some legacy hardware with software implementations a month ago. This week customers started complaining about corrupted data. We found out our new software has a bug and was sending configuration updates to the wrong destinations. Worked late last night and started early this morning. The incident rate was less than 1 percent of traffic, which made it hard to isolate.
I couldn't let the guys go into the weekend dealing with an issue of this magnitude, so I had the data center guys piece the legacy hardware back together as a backup plan. By the end of the day today we still hadn't found the root cause, so we switched back to the legacy equipment. It's overloaded, but working accurately. We can sleep tonight and solve the problem next week.
We corrupted several systems by sending data to the wrong locations. I've been on the phone all day long doing damage control. The problem is solved for now; we'll have to piece it all back together next week.
Kudos to the data center guys who keep old equipment lying around. It saved us today.