Spaceflight Now STS-100

Serious computer problem strikes space station

Updated: April 25, 2001 at 6:20 p.m. and 8:20 p.m. EDT

FLASH: Mission Control was able to shut down computer No. 3 at 10:10 p.m. EDT, which brought No. 2 online. Some telemetry is now being received. See our status center for breaking news.

The Spacelab pallet will spend another night dangling on the end of Canadarm2 after computer troubles delayed Wednesday's testing of the new space crane. Photo: Spaceflight Now/NASA TV

The international space station's three main control computers were crippled today by a subtle and so-far-baffling software glitch that disrupted normal operations, forced the crew to delay critical robot arm tests and triggered a massive troubleshooting effort in Houston.

The station's life support systems continue to operate normally, its huge U.S. solar arrays are tracking the sun and generating power as required and the lab's crew is not in any immediate danger.

But because flight controllers are not receiving telemetry from the command and control computers, they have no insight into what might have knocked them out of action or what might be required to correct the problem.

Station flight director Mark Ferring told the crew late today that engineers are working around the clock to resolve the problem, if possible, by crew wakeup Thursday morning.

"It doesn't sound like you're going to get much sleep tonight," station astronaut Susan Helms replied.

"No, and I think you can rest assured that anyone who knows anything about a computer is now at JSC here and we're all working hard on it," Ferring said.

Overnight, engineers will attempt to uplink commands into the station's command and control - C&C - computer system to diagnose what went wrong in the first place.

But in the system's current condition, the station's Ku-band antenna system is not able to track NASA's communications satellites, making it difficult to uplink the necessary commands.

"What we're going to be doing is looking and seeing if we can re-establish command capability," Ferring told Helms. "We have a good forward link with the UHF but we haven't used that before so we've kind of resisted attempting to command through that."

An initial attempt to use the UHF system to turn off a light in the Destiny laboratory module was unsuccessful. The crew later aimed a TV camera at the light so flight controllers could tell if commands were getting through while the astronauts slept.

On a positive note, the Russian Vozdukh carbon dioxide removal system in the Zvezda command module is back in operation after unexpectedly shutting down earlier today. The Vozdukh has been acting erratically in recent days and Russian engineers have been debating a plan to have the crew replace its central control computer.

Station commander Yuri Usachev coaxed the air scrubber back into action this afternoon by turning it off and back on again. Engineers may still opt to replace the controller but for now, the Vozdukh is working.

The computer problem is more complex. And potentially serious.

During normal operations, one of the three C&C computers, each known as a multiplexer/demultiplexer, or MDM, operates as the "prime" machine, allowing station astronauts and ground controllers to send commands to various systems and providing critical telemetry.

A second C&C computer operates in backup mode, ready to take over if the prime computer suffers a problem, and the third machine operates in standby in a domino-like software architecture.

Just before 9 p.m. Tuesday, about 40 minutes after the station crew went to bed, command and control computer No. 1 suddenly shut down. As expected, C&C-2 then assumed the role of prime computer while C&C-3 switched from standby into backup mode.
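The hand-off described above can be pictured with a short sketch. It is only an illustration of the prime/backup/standby chain as reported here, not the station's actual flight software; the `Role` names and the `fail_prime` helper are hypothetical and chosen for this example.

```python
from enum import Enum


class Role(Enum):
    # Hypothetical labels for the roles described in the article.
    PRIME = "prime"
    BACKUP = "backup"
    STANDBY = "standby"
    FAILED = "failed"


def fail_prime(roles):
    """Return new role assignments after the current prime machine shuts down.

    `roles` maps a machine name to its Role. The backup is promoted to prime
    and the standby is promoted to backup, mirroring the domino-like chain.
    """
    promoted = {
        Role.PRIME: Role.FAILED,
        Role.BACKUP: Role.PRIME,
        Role.STANDBY: Role.BACKUP,
        Role.FAILED: Role.FAILED,
    }
    return {name: promoted[role] for name, role in roles.items()}


# Starting configuration as described: C&C-1 prime, C&C-2 backup, C&C-3 standby.
roles = {"C&C-1": Role.PRIME, "C&C-2": Role.BACKUP, "C&C-3": Role.STANDBY}
after = fail_prime(roles)
for name, role in after.items():
    print(name, role.value)
```

Run against the starting configuration, the sketch leaves C&C-1 marked as failed, C&C-2 as prime and C&C-3 as backup, matching the sequence of events Tuesday night.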

C&C-2 initially appeared to be working normally. But this morning, the computer began having problems loading data from an internal hard drive. After troubleshooting the problem, flight controllers decided to switch C&C-2 into backup mode, which caused C&C-3 to take over as the prime computer in the chain.

To the surprise of engineers on the ground, C&C-3 ran into problems accessing its hard drive and shortly thereafter the machine stopped providing telemetry.

"It's almost like being in a simulation," said mission operations representative Milt Heflin. "We're into something here we don't understand. ... They think it's probably software related. But that's going to be part of the troubleshooting process."

Station flight director Andrew Algate told reporters later the problem almost certainly is due to a software conflict of some sort because all three computers were affected by a similar problem.

The station and shuttle astronauts spent Wednesday unloading the Italian-made Raffaello cargo module seen here attached to the station's Unity node. Photo: Spaceflight Now/NASA TV

Algate said he was confident engineers will figure out a solution and refused to speculate on what might happen if they are not successful.

Endeavour's crew is scheduled to undock Saturday. That same day, the Russians plan to launch a fresh Soyuz spacecraft to the station. The three-man crew, including millionaire space tourist Dennis Tito, is scheduled to dock with the outpost Monday.

Algate would not address what possible impact the computer problem might have on the Soyuz launch, saying only that it posed no immediate threat to the station's current crew.

Systems in the Russian modules of the space station continue to operate normally and in a worst-case scenario, he said, all three crew members could simply retreat to the Russian segment while engineers continue troubleshooting. But he said he does not believe it will come to that.

Mike Rodriggs, a software engineer at the Johnson Space Center, said engineers hope to uplink commands later in the evening that will cause the computer that originally failed, C&C-1, to switch back into primary mode. Switching in this fashion may recover use of the machine.

If all else fails, engineers may elect to reboot one of the machines. But that is a last resort because rebooting would erase any data that might explain what caused the original problem.

In the meantime, NASA managers are holding open the option of extending Endeavour's mission a day, depending on how the computer troubleshooting goes, to give the crew time to complete tests and checkout of the station's newly installed robot arm.

If the computer glitch is corrected overnight, the arm will be put through its paces Thursday, handing a 3,000-pound cargo pallet back to the shuttle robot arm for reberthing in Endeavour's cargo bay. The Canadarm2 crane also will be put through a dry run of the maneuvers it will need to make in June during installation of the station's main airlock.

If the computer problem persists, the airlock installation dry run will be deferred and a minimal set of tests will be carried out Friday.

"We do have the capability for an extension day," said Heflin. "So I think we're going to have to allow the rest of the day to complete (troubleshooting) to see where we get to and see where we are in the morning."

In a late afternoon chat with Helms, Ferring said engineers believe C&C No. 3 is active but "there seems to be some problem with it putting telemetry on the downlink.

"So we're looking at ways of potentially power cycling back to one of the other C&C MDMs and including them so they can load a new set of brains into the computer," he said shortly before 6 p.m. "But right now, we of course have to establish a command link to do those kind of things. So that's going to be our first order of business."

"Well, we're hopeful that you'll have a good, successful evening," Helms said. "Thanks, Mark."

"Thank you very much. And we hope we can get this done so you'll have a successful and fun day with the (robot) arm tomorrow."

Helms then asked Ferring if the engineers had any idea what caused the problem.

"We do not yet understand what the problem is," Ferring said. "We're still scratching our heads."
