Lessons noted but not heeded when precursor events and anomalies occurred have often been the culprit behind historic mishaps.
Editor’s Note: NASA’s Office of the Chief Engineer and the Human Exploration and Operations Mission Directorate sponsored the Human Spaceflight Knowledge Sharing Forum in November 2016. Select individuals responsible for shaping NASA’s future over the next 10 to 20 years focused on technical best practices and lessons learned from successful and unsuccessful human spaceflight missions. This is part of a series of articles recapping lessons learned and knowledge shared by these individuals at the pilot knowledge sharing event.
Lessons not learned and not effectively acted upon are to blame in deadly mishaps across the globe. In his “Common Threads Among Catastrophic Mishaps” presentation at the Human Spaceflight Knowledge Sharing Forum, Brian Hughitt, Technical Fellow, Quality Engineering within NASA’s Office of Safety and Mission Assurance, examined causes of a wide range of disasters.
The most common thread among the catastrophic mishaps studied, according to Hughitt, is lessons not learned, and consequently not translated into corrective actions that would have prevented future missteps. The second most common thread is material control inadequacies, where a simple material failure or inadequacy resulted in a fatal mishap.
Related Video: Human Spaceflight Knowledge Sharing Forum session on Common Threads Among Catastrophic Mishaps
The Big Dig Tunnel Collapse
Recognized as the largest, most complex and technologically challenging highway project in U.S. history, the Central Artery/Tunnel Project, unofficially known as the Big Dig, was designed to significantly reduce Boston traffic congestion.
“If you look at the metrics, they’re just hard to fathom. Five miles of tunnels. 200 bridges in a densely populated urban area. But the metric I’d like everybody to take particular note of is that in addition to all these amazing things that needed to be done, the project was billions of dollars over cost and years behind schedule,” said Hughitt. “Decisions were made that were very mindful of those schedule and cost pressures.”
In September 1999, a construction worker noticed some of the fasteners in the tunnel ceiling had started to pull out. Engineers performed proof testing on some of the fasteners that had pulled out, observed unexpected behavior, and called in the fastener supplier, who said the problem was with the concrete or cleaning procedures and took no corrective action. Under the direction of the project manager, heavy load testing and finite element analysis were performed to verify design assumptions and everything checked out. A design manager, structural engineer, and quality inspector all pointed out, however, that a key piece of information – the cause of the anchor failure and how the repair would fix it – was missing.
Brian Hughitt. Photo Credit: NASA
“It is mind-boggling to me that when the inspector noticed that previously tested fasteners had slipped out, they didn’t perform 100 percent inspection. That would have been quick, easy and cheap to do,” said Hughitt. “They didn’t treat the failures systemically, but as a bunch of isolated cases.”
In July 2006, the fastener issue caused a section of the I-90 connector tunnel ceiling to detach from the tunnel roof, sending 26 tons of concrete and suspension hardware onto a vehicle and fatally crushing a passenger. The National Transportation Safety Board (NTSB) listed the probable cause of the tunnel ceiling collapse as the contractors’ use of an inappropriate epoxy formulation due to their failure to identify potential creep in the anchor adhesive, which resulted from a general lack of understanding and knowledge in the construction community about creep in adhesive anchoring systems.
Hughitt said creep was not an unknown phenomenon at that time, but that the specific individuals involved in the megaproject didn’t know anything about it. He listed several contributing causal factors of the design vulnerability:
- The ceiling was held in place by rods and fasteners, but it didn’t need to be, as most tunnel ceilings constructed at that time were continuous. Initially, lightweight laminate ceiling panels were planned, but concrete was later approved to save money.
- Engineers recommended mechanical fastening systems, but epoxy was used instead, and the glued-in studs failed.
- Corporately, the supplier manufacturer knew all about creep, but the individuals called to the scene to investigate the issue prior to the mishap had no knowledge of it.
“The knowledge had not been captured and transferred to the right people,” said Hughitt. “And no matter what they did — all the best installation practices, all the proof testing in the world — it wouldn’t have fixed this problem, because it was the wrong solution.”
Hughitt said that, with the benefit of hindsight, the ceiling panels were inevitably going to detach, and he identified cognitive dissonance as one explanation for why people were blind to this preventable outcome.
The Big Dig during construction. Photo credit: adm
“We have our own experiences and knowledge. And everything we see, we filter through our knowledge and experience base. And if you see something that just doesn’t align with that, it doesn’t compute,” said Hughitt. “It’s beyond your reality. You filter it out and come up with explanations that do fit your knowledge and reality base. And that’s what occurred.”
McDonnell Douglas DC-10
When Turkish Airlines Flight 981 crashed into the Ermenonville Forest outside Paris in 1974, killing all 346 people on board, it was at the time the deadliest plane crash in aviation history. An improperly secured cargo door separated from the plane, causing an explosive decompression that severed cables necessary to control the aircraft. While the proximate cause was determined to be a faulty latch, Hughitt noted that multiple, interrelated causal factors were to blame. A baggage handler of normal strength had pushed the handle fully down, thinking he had secured the door, when he had actually only bent the internal bars and rods out of shape.
Hughitt said the door configuration, which used an outward-opening hatch door design so that more paying passengers could fit inside the McDonnell Douglas DC-10, had design vulnerabilities. He noted that if proper safety design principles had been in place, the plane could not have taken off with the door unsecured. Instead, once the hatch door handle was pushed down, the cockpit indicator light went off, whether or not the latches had actually engaged.
Hughitt pointed out that unethical behavior also occurred. McDonnell Douglas subcontractor Convair performed a failure mode effects analysis that definitively showed the deadly consequence of a cargo door latch failure, but the analysis never made it to the FAA. McDonnell Douglas attributed the incident almost entirely to human failure, but an earlier NTSB accident investigation clearly determined that design characteristics of the DC-10 latch mechanisms permitted the door to appear to be closed when, in fact, the latches were not fully engaged and the lock pins were not in place.
“It gets almost into the realm of disbelief,” said Hughitt. Significant issues with the latch mechanisms had been documented, and the NTSB had provided a report on the known design vulnerabilities. The FAA proceeded to write an Airworthiness Directive based on NTSB recommendations, which would have effectively grounded the DC-10 fleet. However, the President of McDonnell Douglas persuaded the FAA Administrator to soften the requirements. The company failed to act upon even the less stringent requirements at the time of the mishap – measures that Hughitt noted would have prevented the accident.
USS Thresher and Apollo 1
Most Common Threads Among Catastrophic Mishaps
- Lessons not learned
- Failure to control critical material items
- Vulnerable design
- Workmanship shortcomings
- Process control failures
- Fraud
Hughitt shared observations of other mishaps, including the USS Thresher, lost at sea due to improperly fabricated silver-brazed pipe joints in the United States’ greatest single submarine disaster, and Apollo 1, in which astronauts Virgil Grissom, Edward White and Roger Chaffee lost their lives when a fire broke out in the command module during a preflight test.
Hughitt recalled being astounded as a young quality engineer 35 years ago when reading through the report of the April 1963 USS Thresher submarine sinking, saying he hadn’t known at the time that such a causal chain could exist. He initially thought the compounding and overlapping of the series of events that doomed the Thresher seemed remarkably unfair and unlikely, but learned later in his career after studying other disasters that it’s not that uncommon in highly complex systems for multiple causal factors to all align.
“Both the Thresher and Apollo 1 were brand new platforms, first in their class. Technical innovations and advancements are vitally important, but they come with risk,” said Hughitt. “When something is new, your safety senses ought to be ‘tingling.’ It’s unproven. And it’s virtually impossible to fully prove highly complex systems this new. You can do as many analyses and tests as possible, but empirical evidence, demonstrated reliability, there’s no substitute for that. Always fly like you test, and test like you fly.”
Related Resources:
The Big Dig: Learning from a Mega Project (ASK Magazine)