Failures and Errors
Therac-25 Case Study
Increasing Reliability and Safety
Dependence, Risk, and Progress
100

What is the “state of stress caused by having no access to or being unable to use one’s mobile phone”?

Nomophobia

100

What was the primary function of the Therac-25, and what fatal flaw in its design led to catastrophic consequences?

The Therac-25 was a software-controlled radiation therapy machine used to treat cancer patients. The fatal flaw in its design was the elimination of hardware safety mechanisms present in its predecessors, leading to inadequate safeguards against radiation overdoses.

100

This iterative methodology, popular in software development, focuses on user stories and adapting quickly to changing requirements.

What is Agile?

100

Which of the following four organizations would be least concerned about a system failure that left it down for a couple of hours: United Airlines, Instagram, Citibank, or VUMC?

Instagram

200

What car company is credited with inventing the “crumple zone”?

Mercedes-Benz

200

Investigators identified specific software errors in the Therac-25, leading to overdoses. Can you name one of these critical software bugs and describe its impact on patient safety?

One critical software bug was an overflow in a flag variable used in the Set-Up Test procedure. This bug allowed the electron beam to be activated without proper safety measures, contributing to radiation overdoses in patients.

200

Either of these two space shuttle disasters illustrates the importance of prioritizing safety and addressing known risks in technology.

What is Challenger/Columbia?

200

If someone receives a water bill for $350,000 that was automatically sent by a computer system, what type of failure would that be?

A problem or failure for an individual

300

What's one crucial step that computer professionals should take to proactively mitigate the risk of computer failures?

Many possible answers: assess risks carefully, include safety precautions, make appropriate backup plans for when a system fails, etc.

300

How did the manufacturer, Atomic Energy of Canada, Ltd. (AECL), respond to the incidents involving the Therac-25? Did their actions contribute to resolving the issues or exacerbate the problems?

AECL responded inadequately to the incidents. They provided minimal documentation, failed to conduct sufficient testing, and demonstrated overconfidence in the software's safety. Their actions, including the removal of hardware safety mechanisms, exacerbated the problems rather than resolving them.

300

This system, used in aviation to prevent mid-air collisions, highlights the balance between human judgment and computer control in crisis situations.

What is a Traffic Collision Avoidance System (TCAS)?

300

A bank has a software error, and customers cannot access their accounts for 2 hours. What type of failure would that be?

System-wide failure

400

Following the integration of technology into the workplace, what was the on-the-job fatality rate per 100,000 workers?

3.3

400

Overconfidence played a role in the Therac-25 incidents. What safety measures did AECL eliminate due to overconfidence, and how did this decision impact the machine's operation?

Due to overconfidence, AECL eliminated hardware safety mechanisms that were present in the earlier Therac models. This decision allowed for the operation of the machine without essential safety interlocks, contributing to the occurrence of radiation overdoses.

400

An example of one of the discussed techniques in software reliability.

Risk Management, Communication, Redundancy, Testing, or User Interfaces

400

A software program that regulates traffic fails due to damage to a camera. What risk factor is this associated with?

Interacting with the real world

500

Give one example of how the use of a new technology caused us to forget the old method of performing a task.

Spell check

500

The Therac-25 had a specific feature, part of the software, which contributed to overdoses. What was this feature, and how did its malfunction lead to patient harm?

One crucial feature was the Set-Up Test procedure, a software routine that performed checks to ensure the machine was ready for treatment. An intricate detail was the use of a flag variable to indicate the readiness of a specific device. An overflow bug in this flag variable, stored in a small unit of memory, led to its unintended reset to zero. This malfunction, combined with the absence of proper safety checks, allowed the electron beam to be activated without the necessary protective measures, resulting in radiation overdoses for some patients.
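The overflow described in this answer can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the machine's actual code: it assumes the flag occupies one byte and is incremented on each pass of the routine rather than set to a fixed value, and every function name here is hypothetical.

```python
# Hypothetical sketch of a one-byte flag that is incremented instead of set.
# All names are illustrative and are not taken from the Therac-25 software.

def setup_test_pass(flag: int) -> int:
    """Simulate one pass of the routine: increment the one-byte flag."""
    return (flag + 1) & 0xFF  # keep only the low 8 bits, as a single byte would


def device_check_needed(flag: int) -> bool:
    """Assumption: a nonzero flag means the device must still be checked."""
    return flag != 0


def passes_that_skip_check(total_passes: int) -> list[int]:
    """Return the pass numbers on which the flag wraps to zero, skipping the check."""
    flag = 0
    skipped = []
    for n in range(1, total_passes + 1):
        flag = setup_test_pass(flag)
        if not device_check_needed(flag):
            skipped.append(n)
    return skipped


def safer_setup_test_pass(flag: int) -> int:
    """One possible remedy: assign a fixed nonzero value instead of incrementing."""
    return 1


if __name__ == "__main__":
    # Lists the passes on which the simulated check would be silently skipped.
    print(passes_that_skip_check(600))
```

In this sketch the check is bypassed only on the rare pass where the byte wraps around to zero, which suggests why a fault of this kind could escape ordinary testing. Assigning a fixed nonzero value removes the wraparound entirely.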

500

This area of law faces challenges in setting liability standards for complex systems like software, often leading to controversial outcomes.

What is liability law for software?

500

A company spends $2 million on a new inventory system before realizing that the change is unfeasible and impractical. This inventory system is an example of what?

An abandoned system