Title: Judgment Calls
Author: Thomas H. Davenport
Publisher: Ingram
Genre: Economics
isbn: 9781422183960
She also points to problems in the deliberative process during the January 27 teleconferences. NASA managers at Kennedy and Marshall Space Flight Center could not see the engineers at Morton Thiokol who had concerns about the O-rings. They missed the “body language” that could have helped express the engineers' unease; they were unaware of the local conversations between calls that might have given them a better grasp of the technical issue. As a result, the level of the engineers' concern was not clearly communicated. In addition, NASA's technical culture tended to discount the engineers' partly intuitive argument about the dangers.
So the multiple causes of bad judgment in this disastrous case include reluctance to credit “bad news” that would thwart schedule and productivity goals, complacency resulting from a history of success, and ineffective communication.
Seventeen years later came the only other fatal shuttle accident, the loss of Columbia. It is unclear whether the crew could have been saved if NASA had understood the damage to Columbia while it was in orbit, but the board that investigated the disaster attributed the agency's failure to try to assess possible damage to many of the same factors behind the Challenger decision, especially the complacency born of many successes and communication failures.
This second accident and the criticisms in the Columbia Accident Investigation Board report strengthened NASA's resolve to address the cultural as well as the procedural flaws responsible for those fatal errors, and multiple changes to both process and culture were instituted over time.4 A look at the flight readiness review for STS-119, the Discovery mission originally scheduled for launch on February 19, 2009, and finally launched nearly a month later, on March 15, illustrates what the space agency has done to ensure the soundness of its judgments about flight viability and safety.
The Flight Readiness Review
Today, the flight readiness reviews held at Kennedy Space Center before a scheduled launch date bring technical teams and managers together in one room, including representatives from three domains: program, engineering, and safety—about 150 people in the case of the STS-119 review. The importance of gathering them in one place, face-to-face, is clear when you contrast such a meeting with the Challenger teleconference, or any teleconference for that matter, where inattention, misunderstanding, and incomplete communication are common.
The FRR is preceded by a series of smaller team meetings and technical reviews to discuss and analyze issues that will come up in the formal FRR. There are likely to be fifty teams working on specific technologies, projects, and subsystems. These meetings are part of an ecology of decision-making redundancies, integrated tightly into an overall and well-orchestrated process of problem solving. Overlapping authorities and tasks increase the odds of exposing potential issues and uncertainties—they can't fall through the cracks if there are no cracks. This way of working, and the culture by which the entire process is facilitated, also gives early and ample opportunity for people to speak out when they see a problem. Mike Ryschkewitsch, NASA's chief engineer, says, “You know one of the things that NASA strongly emphasizes now is that any individual who works here, if they see something that doesn't look right, they have a responsibility to raise it, and they can raise it … for example, you have whole communities of experts throughout NASA whose whole life is about maximizing safety to the crew.”
In large part thanks to all that preliminary work, many FRRs are fairly routine. Problems have been identified, analyzed, and solved beforehand. Representatives of the teams that carried out that technical work present their results to the group as a whole and have the knowledge they need to answer the questions their colleagues may raise. The STS-119 review, which actually became a series of reviews, was unusual. The engine valve problem was barely understood at the time of the first FRR and was not resolved to the satisfaction of many participants at a second, marathon session. It took three meetings to arrive at “go” for launch. That “decision about the decision making” showed that the FRR is not a rubber stamp on a foregone conclusion; it demonstrated Ryschkewitsch's claim that people at NASA felt free to delay flights over technical concerns, putting flight safety ahead of schedule and productivity. Though the process of problem solving and decision making is well structured, the culture of dissent and open exchange balances that structure and gives critical flexibility to what might otherwise be a dangerously rote activity.
The Problem of the Faulty Valve
The problem that faced the engineers and scientists who took part in the FRR for STS-119 came to light during the previous shuttle mission, STS-126. Shortly after that spacecraft, Endeavour, lifted off from Kennedy Space Center on November 14, 2008, flight controllers noticed an unexpected hydrogen flow increase related to one of the shuttle's three main engines. Because three control valves work together to maintain proper pressure in the hydrogen tank, the other valves compensated for the malfunction and the flight proceeded safely. But before another mission could fly, the shuttle team would need to understand why and how the problem occurred, whether it was likely to happen again, and just how dangerous a recurrence might be.
Bad weather in Florida forced Endeavour to land in California on November 30 and the shuttle was not returned to Kennedy until December 12, delaying examination of the faulty valve by almost two weeks. X-rays showed that a fragment of the valve's poppet (a tapered plug that moves up and down to regulate flow) had broken off. So the risks engineers had to consider included not only the kind of hydrogen flow anomaly they had seen on STS-126, but also the possibility that a poppet fragment racing through propellant lines might rupture one of them. The level of risk depends on two factors: the likelihood of a problem happening and the seriousness of the consequences if it does. The consequences of a ruptured line would be disastrous, so the likelihood had to be extremely low to make the risk acceptable. The necessary technical analysis would have two major components: studying the valve to determine why the poppet broke, as a way of understanding the probability of a similar failure; and figuring out whether a poppet fragment was at all likely to breach the propellant system.
Because the valve was part of a system that included the shuttle, the main engines, and the external fuel tank, responsibility for understanding its failure lay with teams at the Johnson Space Center in Houston, the Marshall Space Flight Center in Huntsville, Alabama, and several NASA contractors, including a division of Boeing. They began work on these issues. The process proved challenging.
The first flight readiness review for STS-119 took place on February 3. It quickly became apparent that the technical teams did not yet understand the problems well enough to certify that the next shuttle spacecraft for this mission, Discovery, was ready to fly. Steve Altemus, director of engineering at Johnson Space Center, said, “We showed up at the first FRR and we're saying, ‘We don't have a clear understanding of the flow environment, so therefore we can't tell you what the likelihood of having this poppet piece come off will be. We have to get a better handle on the consequences of a particle release.’ ” The launch was rescheduled for February 22—overoptimistically, as it turned out—and the technical teams kept working.
They faced tricky problems. X-ray analysis had determined that the poppet failed because of high-cycle fatigue—that is, damage caused by repeated use. Unfortunately, these components were no longer manufactured and were in short supply, so the option of acquiring new, unstressed poppets did not exist. Given that fact, a reasonable approach could be to examine poppets for cracks that might indicate potential weakness; a poppet with no cracks seemed extremely unlikely to fail. But even electron microscopes could not reliably locate tiny cracks unless the poppets were polished first, and polishing subtly changed the hardware, invalidating its flight certification.
Trying to determine whether a poppet fragment might puncture a fuel line was made even more difficult