By Ann Over
The Space Communications and Networking (SCaN) Testbed (STB) is a flight and ground system delivered on a fast-track schedule by Glenn Research Center.
As its name suggests, the SCaN Testbed is a technical and programmatic advancement of communications technology that will pave the way for future space-communication architectures. It consists of reconfigurable software-defined radios that give mission planners the ability to change radio functionality while on orbit.
The STB team overcame many “schedule-killing” technical issues to meet the launch schedule. They succeeded thanks to careful schedule management, heroic technical efforts, and a bit of luck.
At the time of the preliminary design review (PDR), the team needed to design, fabricate, and assemble almost one thousand pounds of flight hardware and half a million lines of software code for shipment in less than two years. To add to the complications, two of three software-defined radios were obtained via cooperative agreements, which were not traditional contracts with incentivized deliveries. SCaN Testbed was the first International Space Station payload installed on the Express Logistics Carrier on orbit, and the Japanese H-II Transfer Vehicle (HTV) interface was a new design.
Schedule management did not come easy. Because STB was “just a technology project,” not one of the more visible science missions, the scope of the effort was readily underestimated. Just prior to September 2009, at the time of the PDR, the schedule had four months of “negative slack.” In other words, if all went right, STB would miss its scheduled ship date by four months.
At that time, the team adopted a new philosophy that infused the project for the rest of the development cycle. The schedule was optimized to fit the available hardware and software by challenging traditional NASA testing paradigms. For example, we were told you cannot do Tracking and Data Relay Satellite System (TDRSS) compatibility testing until the entire system is assembled. We conducted most of our compatibility testing before antennas were installed, accelerating that test by more than six months. That new approach turned the four months of negative slack into two weeks of positive margin.
Two other management techniques were deployed to maintain the schedule. First, schedule risk consequences were analyzed and ranked in terms of weeks rather than months, the usual unit for NASA schedule risks. Second, an aggressive tracking, reporting, and corrective-action system was put in place. Weekly schedule meetings focused on the critical tasks, milestones, and interface points that were drivers for the next two weeks of work. If a date slipped, a corrective-action plan was presented and approved on the spot. A key tenet was delegation of authority and resources to the subproject managers (work breakdown structure leads) and their test managers for critical tests. This rigorous schedule management demanded discipline, hard work, and emotional perseverance. Many times it was one step forward, two back, and the team often worked overtime to get that “step” back. Finally, through it all, there was humor: you just had to laugh because maintaining the schedule seemed so impossible at times.
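The weekly review described above can be pictured as a simple filter over the schedule. The following is a minimal sketch, not the tool the STB team used: the `Task` fields, the `weekly_review` function, and all task names and dates are hypothetical and exist only to illustrate a two-week look-ahead that flags slipped critical tasks for a corrective-action plan.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Task:
    # Hypothetical schedule record; field names are assumptions for illustration.
    name: str
    baseline_finish: date
    forecast_finish: date
    critical: bool = False

    @property
    def slip_days(self) -> int:
        # Positive when the forecast is later than the baseline.
        return (self.forecast_finish - self.baseline_finish).days


def weekly_review(tasks, today, window_days=14):
    """Return (drivers, needs_action) for the look-ahead window.

    drivers: critical tasks whose baseline or forecast finish falls
             within the next `window_days`.
    needs_action: the drivers whose forecast has slipped past baseline.
    """
    horizon = today + timedelta(days=window_days)

    def in_window(d):
        return today <= d <= horizon

    drivers = [
        t for t in tasks
        if t.critical and (in_window(t.baseline_finish) or in_window(t.forecast_finish))
    ]
    needs_action = [t for t in drivers if t.slip_days > 0]
    return drivers, needs_action


# Hypothetical example data (not actual STB milestones).
today = date(2010, 3, 1)
tasks = [
    Task("Compatibility test", date(2010, 3, 10), date(2010, 3, 10), critical=True),
    Task("Radio delivery", date(2010, 3, 12), date(2010, 3, 26), critical=True),
    Task("Documentation update", date(2010, 3, 5), date(2010, 3, 5)),
    Task("System vibration test", date(2010, 4, 20), date(2010, 4, 20), critical=True),
]
drivers, needs_action = weekly_review(tasks, today)
for t in needs_action:
    print(f"{t.name}: slipped {t.slip_days // 7} week(s); corrective-action plan required")
```

Measuring slip in weeks rather than months, as in the printout, mirrors the finer-grained risk ranking the team adopted.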
Most complex spaceflight developments have technical challenges and issues. STB had more than most, each one capable of killing the schedule. The schedule was certainly wounded more than once. The team was ultimately successful thanks to four factors: exemplary guidance and execution by the chief engineer; passionate commitment by the team; proactive reaching out to the wider community of experts to solve issues; and effective decision making by project management.
Resolving Killer Problems
There were dozens of major issues. These three are representative of the problems and our approach to solutions.
Problem 1: Requirements
The HTV carrier was in co-development with STB, so requirements at the interfaces were not well defined. To save schedule, the radio structural-load requirements were estimated to allow work on them to proceed, but they were set too low. Post-PDR, the first coupled-loads analysis showed the payload system, including the radios, had numerous negative structural margins. Also after PDR, the International Space Station carrier discovered a structural analytical issue that was corrected by multiplying all the launch loads by a factor of 1.6, making the problem that much harder.
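To see why a 1.6 multiplier on launch loads hurts so much, consider the standard margin-of-safety relation, MS = allowable / (factor of safety × applied load) − 1. The sketch below is illustrative only: the starting margin of 0.20 is a hypothetical number rather than STB analysis data, and the function names are invented for this example.

```python
def margin_of_safety(allowable, applied, factor_of_safety=1.0):
    """Standard margin of safety: negative means the part fails the check."""
    return allowable / (factor_of_safety * applied) - 1.0


def scaled_margin(margin, load_factor):
    """Margin after every applied load is multiplied by `load_factor`.

    Since MS + 1 = allowable / (FS * applied), scaling the applied load
    divides (MS + 1) by the same factor.
    """
    return (1.0 + margin) / load_factor - 1.0


def required_margin(load_factor):
    """Smallest original margin that stays non-negative after scaling."""
    return load_factor - 1.0


# Hypothetical part with a comfortable +0.20 margin before the correction.
before = 0.20
after = scaled_margin(before, 1.6)
print(f"margin before: {before:+.2f}, after 1.6x loads: {after:+.2f}")
print(f"margin needed to survive a 1.6x increase: {required_margin(1.6):+.2f}")
```

Under these assumptions, any part with less than a 0.60 margin before the correction goes negative afterward, which suggests why load requirements that were already set too low left so little room to absorb the change.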
The project recovered by conducting a structural test to validate the dynamic model, redesigning the thickness of radiator plates, adding a significant number of fasteners, working with radio vendors to increase loads tolerance via analysis and test, updating the model using a better carrier-interface model, and implementing force-limiting for testing and analysis to achieve flight certification—all hard, time-consuming, painful activities. Force-limiting is a way to concurrently simulate the acceleration and launch forces to avoid over-testing on a rigid mount; force-limiting for analysis is a state-of-the-art practice used to qualify structures when margins are tight.
To save costs, projects often use donated or heritage designs. STB was no different. Given the amount of heritage technology, we attempted to build system requirements from the subsystems, via a “bottom-up” approach. This led to issues later.
We used heritage “flight-qualified” designs from Lunar Reconnaissance Orbiter (LRO), including the traveling wave tube amplifier and the antenna pointing system (APS). The traveling wave tube amplifier, fortunately, was not an issue (a good-news story for heritage hardware). The special challenge was the APS, since it was not required to be safety critical for human-rating on LRO, nor was it designed for our structural or thermal environment requirements.
Recovery involved a significant redesign, including thirteen purchase-order modifications and cost growth from $3.4 million to $6.6 million. The entire system schedule was adjusted to accommodate a twelve-month delivery slip, including production of a high-fidelity vibration simulator for system vibration testing and other simulators for system performance testing. But the critical-path schedule never had downtime waiting for the APS; the vibration simulator was a significant cost, but it bought seven months of schedule to allow system vibration testing without the APS.
Here’s the takeaway: Beware of heritage flight-qualified hardware, especially for use in human-rated systems that generally have stricter requirements. Do not use a commercial purchase order if the heritage hardware needs to be modified. This procurement type is not designed for changes and, typically, the contractor takes on more risk and the government pays higher overhead.
Problem 2: SpaceWire
Given very high data-rate requirements and the other NASA successes with SpaceWire, STB chose it for the internal communications architecture. Several issues were uncovered during development. SpaceWire hardware is not robust and interface standards are not mature. The cables failed after simple transportation events, and at one point the high data rates worked but low rates didn’t.
Given that the success of the project depended on SpaceWire functioning properly, we took several parallel actions to recover. STB conducted nondestructive and then destructive testing of the hardware at Glenn and the Naval Research Laboratory, and eventually rebuilt several cables (including an in-situ replacement while on a vibration table), rerouted cables to improve bend radii, and added padding for tie-downs. Resolving the data-rate issue, which involved both firmware and software, was more elusive.
Two generations of tiger teams external to the project team were deployed to investigate performance issues. We consulted experts within NASA and in industry. Using systematic testing and analysis, STB eventually found the major cause of interface incompatibility between the firmware elements. These efforts lasted almost a year. During that time, system testing proceeded using the capability available with very little retesting required. (Retesting would have delayed shipment a year.) For example, a late field-programmable gate array (FPGA) upgrade was necessary after system thermal and electromagnetic-interference testing; recertification was accomplished with analysis and subsystem testing only.
Four major lessons emerged. First, SpaceWire performance is great, but buyer beware: the hardware is not robust, and the firmware and software interface standards and algorithms are not mature. Second, for any FPGA use, target FPGAs that are reliably reprogrammable without removal from the system. Third, when you run into a technical problem that threatens mission success, mobilize all available resources to solve it; include expertise outside the team. Finally, past success of components on other missions doesn’t mean you won’t have problems with them.
Problem 3: Safety
Meeting human-spaceflight safety requirements is one of the hardest engineering jobs at NASA. For STB, the Phase 0/1 flight-safety review was conducted after PDR when the avionics subsystem design was firm, limiting the design options to adequately address safety hazards identified during the review process. A significant new issue was identified for safety during Ka-band operations, requiring flight software to provide two controls to verify power was off when crew or vehicles were within line of sight. If the beam of the Ka-band antenna were to directly line up with an astronaut or vehicle, it could potentially cause personal injury or equipment failure.
In response, significant project resources were expended to modify the SpaceWire architecture to implement two verifiable and independent inhibits within a single central processing unit (CPU) and to develop the associated safety-critical software. Ultimately, STB became the first space station payload to demonstrate adequate control-path separation for two independent inhibits to be controlled by a single CPU. We implemented this capability incrementally to be able to meet the system testing schedule.
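The idea behind two independent inhibits can be shown with a toy model. This is a conceptual sketch under stated assumptions, not the STB flight software: the class names, the fail-safe default, and the readback method are all hypothetical. It shows only the logical property that matters, namely that the transmitter can radiate only when both inhibits have been released, so a single inadvertent release leaves it off.

```python
from dataclasses import dataclass


@dataclass
class Inhibit:
    # Hypothetical inhibit with its own command path and status readback.
    name: str
    engaged: bool = True  # assumed fail-safe default: inhibit in place

    def engage(self):
        self.engaged = True

    def release(self):
        self.engaged = False

    def verified_engaged(self) -> bool:
        # Stand-in for an independent status readback of this inhibit.
        return self.engaged


class KaTransmitter:
    """Toy Ka-band transmitter gated by two inhibits (illustrative only)."""

    def __init__(self):
        self.inhibits = (Inhibit("inhibit_A"), Inhibit("inhibit_B"))

    def radiating(self) -> bool:
        # Power can flow only if no inhibit remains engaged.
        return not any(i.engaged for i in self.inhibits)

    def safe_for_line_of_sight(self) -> bool:
        # Both controls must be verified in place before crew or
        # vehicles enter the antenna's line of sight.
        return all(i.verified_engaged() for i in self.inhibits)


tx = KaTransmitter()
print(tx.radiating(), tx.safe_for_line_of_sight())  # both inhibits in place

tx.inhibits[0].release()  # a single inadvertent release
print(tx.radiating(), tx.safe_for_line_of_sight())  # still off, but no longer two-fault safe

tx.inhibits[1].release()  # deliberate second release
print(tx.radiating())
```

The hard part in practice, and the part this sketch does not attempt, is demonstrating that the two control paths are truly separate when a single CPU commands both.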
The lesson: Safety requirements should be part of the design process, including input and review external to the project by the applicable safety-certification group. For example, complete the Phase 0/1 flight-safety review before the designs are solidified. Reliance on safety-critical software for primary controls in a human-rated space environment is expensive and time consuming; hardware options are generally easier to design and verify.
Luck Matters, Too
The final factor in the success of the SCaN Testbed was luck. We were lucky that the launch date moved much later to give us time to reduce risk with more testing. We were lucky to have a supportive management within NASA and at the vendors, especially the NASA Headquarters SCaN Program Office and Glenn senior management, who supported us every step of the way, including finding the resources to fix the issues. Finally, we were extremely lucky to have such a dedicated, passionate team, who worked very hard for the mission and—more importantly—for each other.
The NASA and industry partnership proved up to the challenge of meeting an extremely tight schedule. The team overcame many issues to meet the HTV-3 launch schedule. Key to meeting schedule was to make progress with the available functionality and to assign a talented and dedicated team. The testbed was successfully launched from Japan on July 20, 2012, and installed on the International Space Station. Initial checkout operations were also successful and science operations are expected to begin in October.
About the Author
Ann Over has worked at NASA for twenty-nine years on a variety of spaceflight projects. Most recently she was the project manager for the SCaN Testbed and is now a supervisor of other project managers. She is certified at the senior/expert level for the Office of Management and Budget Federal Acquisition Certification Program/Project Management.