Lessons learned aren’t always lessons remembered
There’s a useful methodology called root cause analysis (or “shoot the innocents”, from the perspective of most IT staffs) that is supposed to determine three things about an event of interest (typically an outage):
1. What actually happened?
2. How did it happen?
3. Why did it happen?
Lots of things get in the way of root cause analysis, including a lack of full disclosure by the IT staff involved in the event and a failure by executive management to demand excellence from its IT management and staff, with consequences for what I term “self-inflicted gunshot wounds”.
The truly tragic outcome is that even after the “Lessons Learned” documents and e-mails are generated, they rarely result in “Lessons Remembered”.
I was reminded of this after participating in a root cause analysis several months ago of a major systems failure at a customer, one where it took days to bring the systems back online. You would think this would leave an indelible mark, or perhaps even an ugly scar, on the organization as a reminder. (Hint: You’d be wrong.)
Fixing it the second time took only a few minutes. It was still a preventable outage, though, and it will happen again, because organizations lack the management discipline to convert a lesson learned into a lesson remembered.
October 3rd, 2006 at 1:47 am
[...] Predictably, this lesson has already been learned but not remembered. As of this writing the laws of thermodynamics have not been superseded. This means that if you want to reduce the consumption of electricity in the data center, you need to more efficiently dissipate the heat. This means abandoning air-side cooling in favor of something that used to cool those large mainframes quite efficiently – chilled water. [...]