While you do provide some noteworthy examples, I fail to see how they would not have been avoided by simply keeping all process docs in some kind of revision control system and making sure they were updated with each new discovery. They seem like fairly straightforward cases of failure to update the documentation. RCSes make this even easier nowadays.
When I manage teams, we create a "definition of done" and make sure that no deliverables are considered complete unless each item has been checked off. A common item on the checklist is "Documentation has been updated for any relevant changes in software behavior or expectations." Other common items would be "Any publicly-consumed APIs have been documented, with common usage examples," and "All important non-obvious architectural decisions have been well-documented with comments."
Something as basic as a missing ingredient in a recipe would be a catastrophic documentation failure.
Hey Carl,
When FOGBANK was being manufactured by the USG, they didn't know that the "impurity" was actually the key ingredient at some critical threshold. They couldn't document it by default.
It's also why tens of millions of dollars were spent reverse engineering the original recipe: the attempts to recreate it led to the realization that they didn't understand it.
There's also a tradeoff between the cost of documentation and its fidelity. It costs resources to document a project, and there's often some cutoff at which you stop. For example, on the Boeing 737-800, it is critical that you include a washer with the nut that connects the slat track just so, because otherwise it comes loose. So you should double-check it when you perform maintenance.
This little tidbit was known by people who'd worked closely with the airplane, but it wasn't well known outside of their local context. On China Airlines Flight 120, the washer fell off during routine maintenance. That caused the bolt to come loose, which knocked out another assembly, which punctured the main track of the slat, which caused fuel to leak, which culminated in an explosion.
Thankfully, no one was harmed, but it makes me wonder: if you or I had been working on documenting every single aspect of this aircraft, would we have included the note that the washer is extremely important, when it's industry standard that a nut and bolt assembly should never be fastened without a washer, and standard practice to always check?
I know that I wouldn't have. In this one case, for this particular sub-assembly, it turns out that the bolt was in a place that was hard to check or reach:
> With regard to the detachment of the washer, it is considered probable that the following factors contributed to this: Despite the fact that the nut was in a location difficult to access or inspect during maintenance, neither the manufacturer nor the airline paid sufficient attention to this when preparing the service letter and engineering order job card, respectively. Also, neither the maintenance operator nor the job supervisor reported the difficulty of the job to superiors.
https://en.wikipedia.org/wiki/China_Airlines_Flight_120
There were smart, industrious people involved at every step of the long chain of events that led to the incident, and they all missed that it was important to pass on this piece of information, "On slat sub-assembly $A, the nut is hard to reach, but it is extremely important to make sure that there's a washer on it."
I think it's a worthy problem to consider how we can, in the future, preserve our knowledge with its necessary context so that we can pass it around easily amongst ourselves. It could be a critical step change for our species.
Are you saying that in FOGBANK, they only successfully produced the recipe ONCE? I don't see how they could have mass produced this without resolving that issue.
Your point about the washer is more interesting. Perhaps this practice is different when it comes to hardware manufacturing, but in software, I would say anything that could have effects that disastrous would be considered worthy of documentation. If using washers with nuts is standard practice, it should at least be mentioned once in some section on basic assumptions / prerequisites.
I can see, though, how catastrophic failures like this sometimes need to happen before people see the importance of documenting assumptions. One can hope that these experiences become part of the general awareness that gets transmitted to engineers during their education. The Therac-25 disaster, for example, is a common lesson taught in most human-computer interaction courses.
Really recommend the book "The Shock of the Old". It has some great examples of forgotten tech, but alongside this line of forgetting there is also the persistence of many old technologies, far beyond their narrative expiry.
https://en.wikipedia.org/wiki/The_Shock_of_the_Old
The chapter on flush pop riveting in "What engineers know and how they know it: A History of the aircraft industry 1918 to 2002" by ... Damn I forgot..
If you go to YouTube and look up some videos about restoring historical aircraft, sometimes you can see this process of lost details in action. For example, they have a machine to restore and they don't know what a certain part of it is, or they don't know exactly how the pieces of the plane were attached to one another. And the restorations of these planes end up costing millions of dollars. Several million seem to have been spent on getting De Havilland Mosquitos into the air.
The condition of Axis aeroplanes is even worse. There are some remaining Heinkel 111s but none flying any more, and there are just wrecks of the Focke-Wulf Condor, which they barely knew how to put back together. The knowledge of these aeroplanes has simply disappeared, leaving people trying to puzzle them out from wrecks found at the bottom of fjords.
Those spherical cows were of uniform density, by the way. I don't know why this popped into my mind, but the metaphor appears incomplete without this.
Thank you for spotting it! I shall correct this slight. :)
Interestingly, "Du hast den Farbfilm vergessen" was shunned by Nina Hagen herself when Angela Merkel had it played, as the lyricist was a convicted sexual predator and the song is about domestic abuse as well.
"Everything hurts so much" ... "Do that again, Micha, and I leave" ... "Everything blue and white and green and not true anymore later" ...