Engineering Human Performance Optimization in the Workplace Part 1


CHAPTER 1: INTRODUCTION TO HUMAN PERFORMANCE

OVERVIEW

In its simplest form, human performance is a series of behaviors carried out to accomplish specific task objectives (results). Behavior is what people do and say—it is a means to an end. Behaviors are observable acts that can be seen and heard. In most organizations the behaviors of operators, technicians, maintenance crafts, scientists and engineers, waste handlers, and a myriad of other professionals are aggregated into cumulative acts designed to achieve mission objectives. The primary objective of the operating facilities is the continuous safe, reliable, and efficient production of mission-specific products. At the national laboratories, the primary objectives are the ongoing discovery and testing of new materials, the invention of new products, and technological advancement. Other significant mission objectives include the storage, handling, reconfiguration, and final disposition of legacy nuclear waste materials, as well as the decontamination, decommissioning, and dismantling of old facilities and support operations used to produce America’s nuclear defense capabilities during the Cold War. Improving human performance is key to improving the performance of production facilities, the national laboratories, and cleanup and restoration efforts.

It is not easy to anticipate exactly how trivial conditions can influence individual performance. Error-provoking aspects of facility design, procedures, processes, and human nature exist everywhere. No matter how efficiently equipment functions; how good the training, supervision, and procedures; and how well the best worker, engineer, or manager performs his or her duties, people cannot perform better than the organization supporting them. Human error is caused not only by normal human fallibility, but also by incompatible management and leadership practices and organizational weaknesses in work processes and values. Therefore, defense-in-depth with respect to the human element is needed to improve the resilience of programmatic systems and to drive down human error and events.

The aviation industry, medical industry, commercial nuclear power industry, U.S. Navy, DOE and its contractors, and other high-risk, technologically complex organizations have adopted human performance principles, concepts, and practices to consciously reduce human error and bolster controls in order to reduce accidents and events. However, performance improvement is not limited to safety. Organizations that have adopted human performance improvement (HPI) methods and practices also report improved product quality, efficiency, and productivity. HPI, as described in this course and practiced in the field, is not so much a program as it is a distinct way of thinking. This course seeks to improve understanding about human performance and to set forth recommendations on how to manage it and improve it to prevent events triggered by human error.

This course promotes a practical way of thinking about hazards and risks to human performance. It explores both the individual and leader behaviors needed to reduce error, as well as improvements needed in organizational processes and values and job-site conditions to better support worker performance. Fundamental knowledge of human and organizational behavior is emphasized so that managers, supervisors, and workers alike can better identify and eliminate error-provoking conditions that can trigger human errors leading to events in processing facilities, laboratories, D & D structures, or anywhere else. Ultimately, the attitudes and practices needed to control these situations include:

  • the will to communicate problems and opportunities to improve;
  • an uneasiness toward one’s own ability to err;
  • an intolerance for error traps that place people and the facility at risk;
  • vigilant situational awareness;
  • rigorous use of error-prevention techniques; and
  • understanding the value of relationships.

INTEGRATED SAFETY MANAGEMENT AND HPI

DOE developed and began implementation of Integrated Safety Management (ISM) in 1996. Since that time, the Department has gained significant experience with its implementation. This experience has shown that the basic framework and substance of the Department’s ISM program remain valid. It has also shown that substantial variance exists across the complex in familiarity with ISM, commitment to implementation, and implementation effectiveness, and that more clarity about DOE’s role in effective ISM implementation is needed. Contractors and DOE alike have reported that clearer expectations and additional guidance on annual ISM maintenance and continuous improvement processes are needed.

Since 1996, external organizations that are also performing high-hazard work, such as commercial nuclear organizations, Navy nuclear organizations, the National Aeronautics and Space Administration, and others, have also gained significant experience and insight relevant to safety management. The ISM core function of “feedback and improvement” calls for DOE to learn from available feedback and make changes to improve. This concept applies to the ISM program itself. Lessons learned from both internal and external operating experience are reflected in the ISM Manual to update the ISM program. The ISM Manual should be viewed as a natural evolution of the ISM program, using feedback for improvement of the ISM program itself. Two significant sources of external lessons learned have contributed to that Manual: (1) the research and conclusions related to high-reliability organizations (HRO) and (2) the research and conclusions related to the human performance improvement (HPI) initiatives in the commercial nuclear industry, the U.S. Navy, and other organizations. HRO and HPI tenets are highly complementary to ISM and serve to extend and clarify the program’s principles and methods.

As part of the ISM revitalization effort, the Department wants to address known opportunities for improvement based on DOE experience and integrate the lessons learned from HRO organizations and HPI implementation into the Department’s existing ISM infrastructure. The Department wants to integrate the ISM core functions, ISM principles, HRO principles, HPI principles and methods, lessons learned, and internal and external best safety practices into a proactive safety culture where:

  • facility operations are recognized for their excellence and high-reliability;
  • everyone accepts responsibility for their own safety and the safety of others;
  • organization systems and processes provide mechanisms to identify systematic weaknesses and assure adequate controls; and
  • continuous learning and improvement are expected and consistently achieved.

The revitalized ISM system is expected to define and drive desired safety behaviors in order to help organizations create world-class safety performance.

In using the tools, processes, and approaches described in this HPI course, it is important to implement them within an ISM framework, not as stand-alone programs outside of it. These tools must not compete with ISM; they must support it. To the extent that these tools help to clarify and improve implementation of the ISM system, their use is strongly encouraged. The relationship between these tools and the ISM principles and functions needs to be clearly understood and articulated in ISM system descriptions wherever these tools affect ISM implementation. It is also critical that the vocabulary and terminology used to apply these tools be aligned with that of ISM. Learning organizations borrow best practices whenever possible, but those practices must be translated into terms that are consistent and in alignment with existing frameworks.

ISM Guiding Principles

The objective of ISM is to systematically integrate safety into management and work practices at all levels so that work is accomplished while protecting the public, the workers, and the environment. This objective is achieved through effective integration of safety management into all facets of work planning and execution. In other words, the overall management of safety functions and activities becomes an integral part of mission accomplishment.

The seven guiding principles of ISMS are intended to guide organizations’ actions, from the development of safety directives to the performance of work. As reflected in the ISM Manual, these principles are:

  • Line Management Responsibility For Safety. Line management is directly responsible for the protection of the public, the workers, and the environment.
  • Clear Roles and Responsibilities. Clear and unambiguous lines of authority and responsibility for ensuring safety shall be established and maintained at all organizational levels within the Department and its contractors.
  • Competence Commensurate with Responsibilities. Personnel shall possess the experience, knowledge, skills, and abilities that are necessary to discharge their responsibilities.
  • Balanced Priorities. Resources shall be effectively allocated to address safety, programmatic, and operational considerations. Protecting the public, the workers, and the environment shall be a priority whenever activities are planned and performed.
  • Identification of Safety Standards and Requirements. Before work is performed, the associated hazards shall be evaluated and an agreed-upon set of safety standards and requirements shall be established which, if properly implemented, will provide adequate assurance that the public, the workers, and the environment are protected from adverse consequences.
  • Hazard Controls Tailored to Work Being Performed. Administrative and engineering controls to prevent and mitigate hazards shall be tailored to the work being performed and associated hazards.
  • Operations Authorization. The conditions and requirements to be satisfied for operations to be initiated and conducted shall be clearly established and agreed upon.

ISM Core Functions

Five ISM core functions provide the necessary safety management structure to support any work activity that could potentially affect the public, workers, and the environment. These functions are applied as a continuous cycle with the degree of rigor appropriate to address the type of work activity and the hazards involved.

  • Define the Scope of Work. Missions are translated into work; expectations are set; tasks are identified and prioritized; and resources are allocated.
  • Analyze the Hazards. Hazards associated with the work are identified, analyzed, and categorized.
  • Develop and Implement Hazard Controls. Applicable standards and requirements are identified and agreed-upon; controls to prevent or mitigate hazards are identified; the safety envelope is established; and controls are implemented.
  • Perform Work within Controls. Readiness to do the work is confirmed and work is carried out safely.
  • Provide Feedback and Continuous Improvement. Feedback information on the adequacy of controls is gathered; opportunities for improving how work is defined and planned are identified and implemented; line and independent oversight is conducted; and, if necessary, regulatory enforcement actions occur.

Human error can have a negative effect at each stage of the ISM work cycle; for example:

  1. Define Work Scope: Errors in defining work can lead to mistakes in analyzing hazards.
  2. Analyze Hazards: Without the correct hazards identified, errors will be made in identifying adequate controls.
  3. Develop Controls: Without an effective set of controls, minor work errors can lead to significant events.
  4. Provide Feedback: If the response to an event focuses only on the minor work error, the other contributing errors will not be addressed.

Integration of ISM and HPI

Work planning and control processes derived from ISM are key opportunities for enhancement through application of HPI concepts and tools. In fact, an almost natural integration can occur when the HPI objectives—reducing error and strengthening controls—are treated as integral to implementing the ISM core functions. Likewise, the analytical work that goes into reducing human error and strengthening controls supports the ISM core functions.

For purposes of this course, a few examples of this integration are illustrated in the following table. The ISM core functions are listed down the left column; the HPI objectives appear as the headers of the second and third columns.

Integration of ISM and HPI

As illustrated in the table above, the integration of HPI methods and techniques to reduce error and manage controls supports the ISMS core functions.

The following leadership behaviors support ISMS Guiding Principle 1—line management responsibility for safety.

  • Facilitate open communication.
  • Promote teamwork.
  • Reinforce desired behaviors.
  • Eliminate latent organizational weaknesses.
  • Value prevention of errors.

Perspective on Human Performance and Events

Much is known about the role of human performance in causing events. About 80 percent of all events are attributed to human error; in some industries, this number is closer to 90 percent. Roughly 20 percent of events involve equipment failures. When the 80 percent attributed to human error is broken down further, the majority of those errors (roughly 70 percent) stem from latent organizational weaknesses (perpetrated by humans in the past and lying dormant in the system), whereas about 30 percent are caused by the individual worker touching the equipment and systems in the facility. Clearly, focusing efforts on reducing human error will reduce the likelihood of events.
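
Expressed as rough arithmetic (a sketch only; the roughly 70/30 split within the human-error share is implied by the figures above rather than stated exactly), the relative contributions work out as follows:

\[
\begin{aligned}
P(\text{event involves human error}) &\approx 0.80 \\
P(\text{event traces to latent weaknesses}) &\approx 0.80 \times 0.70 = 0.56 \\
P(\text{event traces to active worker error}) &\approx 0.80 \times 0.30 = 0.24 \\
P(\text{event involves equipment failure}) &\approx 0.20
\end{aligned}
\]

By this sketch, roughly half of all events trace back to latent organizational weaknesses, which is why the strategic approach described later targets the organization as well as the individual.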

An analysis of significant events in the commercial nuclear power industry between 1995 and 1999, as reported by INPO, indicated that three of every four events were attributed to human error. Additionally, a Nuclear Regulatory Commission review of events in which fuel was damaged while in the reactor showed that human error was a common factor in 21 of 26 (81 percent) events. The report disclosed that “the risk is in the people—the way they are trained, their level of professionalism and performance, and the way they are managed.” Human error leading to adverse consequences can be very costly: it jeopardizes an organization’s ability to protect its workforce, its physical facility, the public, and the environment from calamity. Human error also affects the economic bottom line. Very few organizations can sustain the costs associated with a major accident (such as product, material, and facility damage, tool and equipment damage, legal costs, emergency supplies, clearing the site, production delays, overtime work, investigation time, supervisors’ time diverted, and the cost of panels of inquiry). It should be noted that costs to operations are also incurred from errors by those performing security, work control, cost and schedule, procurement, quality assurance, and other essential but non-safety-related tasks. Human performance remains a significant factor for management attention, not only from a safety perspective, but also from a financial one.

A traditional belief is that human performance is a worker-focused phenomenon. This belief promotes the notion that failures are introduced to the system only through the inherent unreliability of people: “Once we can rid ourselves of a few bad performers, everything will be fine. There is nothing wrong with the system.” However, experience indicates that weaknesses in organizational processes and cultural values are involved in the majority of facility events. Accidents result from a combination of factors, many of which are beyond the control of the worker. Therefore, the organizational context of human performance is an important consideration. Event-free performance requires an integrated view of human performance from those who attempt to achieve it; that is, how well management, staff, supervision, and workers function as a team and the degree of alignment of processes and values in achieving the facility’s economic and safety missions.

Human Performance for Engineers and Knowledge Workers

Engineers and other knowledge-based workers contribute differently than first-line workers to facility events. A recent study completed for the Nuclear Regulatory Commission (NRC) by the Idaho National Engineering and Environmental Laboratory (INEEL) indicates that human error continues to be a causal factor in 79 percent of industry licensee events. Within those events, there were four latent failures (undetected conditions that fail to achieve the desired ends) for every active failure. More significantly, design and design-change problems were a factor in 81 percent of the events involving human error. Recognizing that engineers and other knowledge-based workers make different errors, INPO developed a set of tools specific to their needs.

With engineers specifically, the errors made can become significant if not caught early. As noted in research conducted at one DOE site, because engineers as a group are highly educated, narrowly focused, and tend to be introverted and task-oriented, they tend to be critical of others but not self-critical. If they are not self-critical, their errors may go undetected for long periods of time, sometimes years. This means the engineer who made the mistake may never know that one was made, and the opportunity for learning is diminished. Thus, human performance techniques aimed at this group of workers need to focus more on the errors they make while in the knowledge-based performance mode, as described in Chapter 2.

The Work Place

The work place or job site is any location where either the physical plant or the “paper” plant (the aggregate of all the documentation that helps control the configuration of the physical plant) can be changed. The systems, structures, and components used in the production processes make up the physical plant. Error can come from either the physical plant or the paper plant. All human activity involves the risk of error. Flaws in the paper plant can lie dormant and can lead to undesirable outcomes in the physical plant or even personal injury. Front-line workers “touch” the physical plant as they perform their assigned tasks. Supervisors observe, direct, and coach workers. Engineers and other technical staff perform activities that alter the paper plant or modify processes and procedures that direct the activities of workers in the physical plant. Managers influence worker and staff behavior by their oral or written directives and personal example. The activities of all these individuals need to be controlled.

Individuals, Leaders, and Organizations

This course describes how individuals, leaders, and the organization as a whole influence human performance. The role of the individual in human performance is discussed in Chapter 2, “Reducing Error.” The role of the organization is discussed in Chapter 3, “Managing Controls.” The role of the leader, as well as the leader’s responsibilities for excellence in human performance, is discussed in Chapter 4, “Culture and Leadership.” The following provides a general description of each of these entities:

  • Individual — An employee in any position in the organization from yesterday’s new hire in the storeroom to the senior vice president in the corner office.
  • Leader — Any individual who takes personal responsibility for his or her performance and the facility’s performance and attempts to positively influence the processes and values of the organization. Managers and supervisors are in positions of responsibility and as such are organizational leaders. Some individuals in these positions, however, may not exhibit leadership behaviors that support this definition of a leader. Workers, although not in managerial positions of responsibility, can be and are very influential leaders. The designation as a leader is earned from subordinates, peers, and superiors.
  • Organization — A group of people with a shared mission, resources, and plans to direct people’s behavior toward safe and reliable operation. An organization directs people’s behavior in a predictable way, usually through its processes and its value and belief systems. Workers, supervisors, support staff, managers, and executives all make up the organization.

HUMAN PERFORMANCE

What is human performance? Because most people cannot effectively manage what they do not understand, this question is a good place to start. Understanding the answer helps explain why improvement efforts focus not only on results, but also on behavior. Good results can be achieved with questionable behavior. In contrast, bad results can be produced despite compliant behavior, as in the case of following procedures written incorrectly. Very simply, human performance is behavior plus results (P = B + R).

Behavior

Behavior is what people do and say—a means to an end. Behavior is an observable act that can be seen and heard, and it can be measured. Consistent behavior is necessary for consistent results. For example, a youth baseball coach cannot just shout at a 10-year-old pitcher from the dugout to “throw strikes.” The child may not know how and will become frustrated. To be effective, the coach must teach specific techniques—behaviors—that will help the child throw strikes more consistently. This is followed up with effective coaching and positive reinforcement. Sometimes people will make errors despite their best efforts. Therefore, behavior and its causes are extremely valuable as signals for improvement efforts to anticipate, prevent, catch, or recover from errors. For long-term, sustained good results, close attention must be paid to what influences behavior: what motivates it, what provokes it, what shapes it, what inhibits it, and what directs it, especially when people handle facility equipment.

Results

Performance implies measurable results. Results, good or bad, are the outcomes of behavior, encompassing the mental processes and physical efforts to perform a task. Within an organization, the “end” is that set of outcomes manifested by people’s health and well-being; the environment; the safe, reliable, and efficient production of defense products; the discovery of new scientific knowledge; the invention and testing of new products; and the disposition of legacy wastes and facilities. Events usually involve such things as challenges to reactor safety (where applicable), industrial/radiological safety, environmental safety, quality, reliability, and productivity. Event-free performance is the desired result, but it depends on reducing error, both where people touch the facility and where they touch the paper (procedures, instructions, drawings, specifications, and the like). Event-free performance also depends on ensuring the integrity of controls, barriers, and safeguards against the residual errors that still occur.

ANATOMY OF AN EVENT

Typically, events are triggered by human action. In most cases, the human action causing the event was in error. However, the action could have been directed by a procedure; or it could have resulted from a violation—a shortcut to get the job done. In any case, an act initiates the undesired consequences. The graphic below provides an illustration of the elements that exist before a typical event occurs. Breaking the linkages may prevent events.

Anatomy of an Event

Event

An event, as defined earlier, is an unwanted, undesirable change in the state of facility structures, systems, or components or human/organizational conditions (health, behavior, administrative controls, environment, etc.) that exceeds established significance criteria. Events involve serious degradation or termination of the equipment’s ability to perform its required function. Other definitions include: an outcome that must be undone; any facility condition that does not achieve its goals; any undesirable consequence; and a difference between what is and what ought to be.

Initiating Action

The initiating action is an action by an individual, either correct, in error, or in violation, that results in a facility event. An error is an action that unintentionally departs from an expected behavior. A violation is a deliberate, intentional act to evade a known policy or procedure requirement, deviating from sanctioned organizational practices. Active errors are those errors that have immediate, observable, undesirable outcomes and can be either acts of commission or omission. The majority of initiating actions are active errors. Therefore, a strategic approach to preventing events should include the anticipation and prevention of active errors.

Flawed Controls

Flawed controls are defects that, under the right circumstances, may inhibit the ability of defensive measures to protect facility equipment or people against hazards or fail to prevent the occurrence of active errors. Controls or barriers are methods that:

  • protect against various hazards (such as radiation, chemical, heat);
  • mitigate the consequences of the hazard (for example, reduced operating safety margin, personal injury, equipment damage, environmental contamination, cost); and
  • promote consistent behavior.

When an event occurs, there is either a flaw with existing controls or appropriate controls are not in place.

Error Precursors

Error precursors are unfavorable prior conditions at the job site that increase the probability for error during a specific action; that is, error-likely situations. An error-likely situation—an error about to happen—typically exists when the demands of the task exceed the capabilities of the individual or when work conditions aggravate the limitations of human nature. Error-likely situations are also known as error traps.

Latent Organizational Weaknesses

Latent organizational weaknesses are hidden deficiencies in management control processes (for example, strategy, policies, work control, training, and resource allocation) or values (shared beliefs, attitudes, norms, and assumptions) that create work place conditions that can provoke errors (precursors) and degrade the integrity of controls (flawed controls). Latent organizational weaknesses include system-level weaknesses that may exist in procedure development and review, engineering design and approval, procurement and product receipt inspection, training and qualification system(s), and so on. The decisions and activities of managers and supervisors determine what is done, how well it is done, and when it is done, either contributing to the health of the system(s) or further weakening its resistance to error and events. System-level weaknesses are aggregately referred to as latent organizational weaknesses. Consequently, managers and supervisors should perform their duties with the same uneasy respect for error-prone work environments as workers. A second strategic thrust to preventing events should be the identification and elimination of latent organizational weaknesses.

STRATEGIC APPROACH FOR HUMAN PERFORMANCE

The strategic approach to improving human performance within an organization’s community embraces two primary challenges:

  1. Anticipate, prevent, catch, and recover from active errors at the job site.
  2. Identify and eliminate latent organizational weaknesses that provoke human error and degrade controls against error and the consequences of error.

If opportunities to err are not methodically identified, preventable errors will not be eliminated. Even if opportunities to err are systematically identified and prevented, people may still err in unanticipated and creative ways. Consequently, additional means are necessary to protect against errors that are not prevented or anticipated. Reducing the error rate minimizes the frequency, but not the severity, of events. Only controls can be effective at reducing the severity of the outcome of error. Defense-in-depth—controls or safeguards arranged in a layered fashion—provides assurance that if one layer fails, the remaining controls will function as needed to reduce the impact on the physical facility.
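
A minimal worked sketch of the defense-in-depth idea (the independence assumption and the example numbers are illustrative, not from the source): if an error must defeat n independent layered controls, each failing with probability \(p_i\), then the chance that the error becomes an event is

\[
P(\text{event} \mid \text{error}) = \prod_{i=1}^{n} p_i, \qquad \text{e.g.}\ 0.1 \times 0.1 \times 0.1 = 0.001.
\]

Three layers that each catch 90 percent of errors thus let only about one error in a thousand become an event. A latent organizational weakness that degrades several layers at once breaks the independence assumption, which is why flawed controls are treated as a distinct element in the anatomy of an event.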

To improve human performance and facility performance, efforts should be made to (1) reduce the occurrence of errors at all levels of the organization and (2) enhance the integrity of controls, or safeguards discovered to be weak or missing. Reducing errors (Re) and managing controls (Mc) will lead to zero significant events (ØE). The formula for achieving this goal is Re + Mc → ØE. Eliminating significant facility events will result in performance improvement within the organization.

Reducing Error

An effective error-reduction strategy focuses on work execution because these occasions present workers with opportunities to harm key assets, reduce productivity, and adversely affect quality through human error. Work execution involves workers having direct contact with the facility, when they touch equipment and when knowledge workers touch the paper that influences the facility (procedures, instructions, drawings, specifications, etc.). During work execution, the human performance objective is to anticipate, prevent, or catch active errors, especially at critical steps, where error-free performance is absolutely necessary. While various work planning taxonomies may be used, opportunities for reducing error are particularly prevalent in what is herein expressed as preparation, performance, and feedback.

  • Preparation — planning – identifying the scope of work, associated hazards, and what is to be avoided, including critical steps; job-site reviews and walkdowns – identifying potential job-site challenges to error-free performance; task assignment – putting the right people on the job in light of the job’s task demands; and task previews and pre-job briefings – identifying the scope of work, including critical steps, associated hazards, and what has to be avoided, by anticipating possible active errors and their consequences.
  • Performance — performing work with a sense of uneasiness; maintaining situational awareness; rigorous use of human performance tools for important human actions, avoiding unsafe or at-risk work practices; supported with quality supervision and teamwork.
  • Feedback — reporting – conveying information on the quality of work preparation, related resources, and work place conditions to supervision and management; behavior observations – workers receiving coaching and reinforcement on their performance in the field through observations by managers and supervisors.

Chapter 2 focuses more on anticipating, preventing, and catching human error at the job-site.

Managing Controls

Events involve breaches in controls or safeguards. As mentioned earlier, errors still occur even when opportunities to err are systematically identified and eliminated. It is essential, therefore, that management take an aggressive approach to ensure controls function as intended. The top priority in ensuring safety from human error is to identify, assess, and eliminate hazards in the workplace. These hazards are most often closely related to vulnerabilities in controls, and they have to be found and corrected. The most important aspect of this strategy is assertive, ongoing verification and validation of the health of controls. Ongoing self-assessments are employed to scrutinize controls, and the vulnerabilities found are mended using the corrective action program. A number of taxonomies of safety controls have been developed. For purposes of this discussion of the HPI strategic approach, five general types of controls are reviewed in brief.

  • Control of Hazards by elimination or substitution – Organizations evaluate operations, procedures and facilities to identify workplace hazards. Management implements a hazard prevention and elimination process. When hazards are identified in the workplace they are prioritized and actions are taken based on risks to the workers. Management puts in place protective measures until such time as the hazard(s) can be eliminated. An assessment of the hazard control(s) is carried out to verify that the actions taken to eliminate the hazard are effective and enduring.
  • Engineered features — These provide the facility with the physical ability to protect against errors. To optimize this set of controls, equipment is kept reliable and in a configuration that is resistant to simple human error and allows systems and components to perform their intended functions when required. Facilities with high equipment reliability, effective configuration control, and minimal human-machine vulnerabilities tend to experience fewer and less severe facility events than those that struggle with these issues. How carefully facility equipment is designed, operated, and maintained (using human-centered approaches) affects the level of integrity of this line of protection.
  • Administrative provisions — Policies, procedures, training, work practices, processes, administrative controls, and expectations direct people’s activities so that they are predictable and safe, and they limit people’s exposure to hazards, especially for work performed in and on the facility. Altogether, such controls help people anticipate and prepare for problems. Written instructions specify what, when, where, and how work is to be done and what personal protective equipment workers are to use. The rigor with which people perform work activities according to correctly written procedures, expectations, and standards directly affects the integrity of this line of protection.
  • Cultural norms — These are the assumptions, values, beliefs, and attitudes, and the related leadership practices, that encourage either high or low standards of performance and open or closed communication. Personnel in highly reliable organizations practice error-prevention rigorously, regardless of a task’s perceived risk or simplicity, how routine it is, or how competent the performer is. The integrity of this line of defense depends on people’s appreciation of the human’s role in safety, the respect they have for each other, and their pride in the organization and the facility.
  • Oversight — Accountability for personnel and facility safety, for security, and for ethical behavior in all facets of facility operations, maintenance, and support activities is achieved by a kind of “social contract” entered into willingly by workers and management where a “just culture” prevails. In a just culture, people who make honest errors and mistakes are not blamed while those who willfully violate standards and expectations are censured. Workers willingly accept responsibility for the consequences of their actions, including the rewards or sanctions (see “accountability” in the glossary). They feel empowered to report errors and near misses. This accountability helps verify margins, the integrity of controls and processes, as well as the quality of performance. Performance improvement activities facilitate the accountability of line managers through structured and ongoing assessments of human performance, trending, field observations, and use of the corrective action program, among others. The integrity of this line of defense depends on management’s commitment to high levels of human performance and consistent follow-through to correct problems and vulnerabilities.

Chapter 3 focuses on controls and their management. Chapter 4 emphasizes the role managers and informal leaders play in shaping safety culture.

PRINCIPLES OF HUMAN PERFORMANCE

Five simple statements, listed below, are referred to as the principles or underlying truths of human performance. Excellence in human performance can only be realized when individuals at all levels of the organization accept these principles and embrace concepts and practices that support them. These principles are the foundation blocks for the behaviors described and promoted in this course. Integrating these principles into management and leadership practices, worker practices, and the organization’s processes and values will be instrumental in developing a working philosophy and implementing strategies for improving human performance within your organization.

  1. People are fallible, and even the best people make mistakes.
    Error is universal. No one is immune regardless of age, experience, or educational level. The saying “to err is human” is indeed a truism. It is human nature to be imprecise—to err. Consequently, error will happen. No amount of counseling, training, or motivation can alter a person’s fallibility. Dr. James Reason, author of Human Error (1990), wrote: “It is crucial that personnel and particularly their managers become more aware of the human potential for errors, the task, workplace, and organizational factors that shape their likelihood and their consequences. Understanding how and why unsafe acts occur is the essential first step in effective error management.”
  2. Error-likely situations are predictable, manageable, and preventable.
    Despite the inevitability of human error in general, specific errors are preventable. Just as we can predict that a person writing a personal check at the beginning of a new year stands a good chance of writing the previous year on the check, a similar prediction can be made within the context of work at the job site. Recognizing error traps and actively communicating these hazards to others proactively manages situations and prevents the occurrence of error. By changing the work situation to prevent, remove, or minimize the presence of conditions that provoke error, task and individual factors at the job site can be managed to prevent, or at least minimize, the chance for error.
  3. Individual behavior is influenced by organizational processes and values.
    Organizations are goal-directed and, as such, their processes and values are developed to direct the behavior of the individuals in the organization. The organization mirrors the sum of the ways work is divided into distinct jobs and then coordinated to conduct work and generate deliverables safely and reliably. Management is in the business of directing workers’ behaviors. Historically, management of human performance has focused on the “individual error-prone or apathetic worker.” Work is achieved, however, within the context of the organization’s processes, culture, and management planning and control systems, and it is exactly these phenomena that contribute most of the causes of human performance problems and resulting facility events.
  4. People achieve high levels of performance because of the encouragement and reinforcement received from leaders, peers, and subordinates.
    The organization is perfectly tuned to get the performance it receives from the workforce. All human behavior, good and bad, is reinforced, whether by immediate consequences or by past experience. A behavior is reinforced by the consequences that an individual experiences when the behavior occurs. The level of safety and reliability of a facility is directly dependent on the behavior of people. Further, human performance is a function of behavior. Because behavior is influenced by the consequences workers experience, what happens to workers when they exhibit certain behaviors is an important factor in improving human performance. Positive and immediate reinforcement for expected behaviors is ideal.
  5. Events can be avoided through an understanding of the reasons mistakes occur and application of the lessons learned from past events (or errors).
    Traditionally, improvement in human performance has resulted from corrective actions derived from an analysis of facility events and problem reports—a method that reacts to what happened in the past. Learning from our mistakes and the mistakes of others is reactive—after the fact—but important for continuous improvement. Human performance improvement today requires a combination of both proactive and reactive approaches. Anticipating how an event or error can be prevented is proactive and is a more cost-effective means of preventing events and problems from developing.

CHAPTER 2: REDUCING ERROR

INTRODUCTION

As capable and ingenious as humans are, we do err and make mistakes. It is precisely this part of human nature that we want to explore in the first part of this chapter. This inquiry includes discussions of human characteristics, unsafe attitudes, and at-risk behaviors that make people vulnerable to errors. A better understanding of what is behind the first principle of human performance, “people are fallible, even the best make mistakes,” will help us better compensate for human error through more rigorous use of error-reduction tools and by improving controls. Certain sections of this chapter are considered essential reading and will be flagged as such at the beginning of the section.

HUMAN FALLIBILITY (Essential Reading)

Human nature encompasses all the physical, biological, social, mental, and emotional characteristics that define human tendencies, abilities, and limitations. One of the innate characteristics of human nature is imprecision. Unlike a machine that is precise—each time, every time—people are imprecise, especially in certain situations. For instance, humans tend to perform poorly under high stress and time pressure. Because of “fallibility,” human beings are vulnerable to external conditions that cause them to exceed the limitations of human nature. Vulnerability to such conditions makes people susceptible to error. Susceptibility to error is augmented when people work within complex systems (hardware or administrative) that have concealed weaknesses—latent conditions that either provoke error or weaken controls against the consequences of error.

Common Traps of Human Nature

People tend to overestimate their ability to maintain control when they are doing work. Maintaining control means that everything happens that is supposed to happen during performance of a task and nothing else. There are two reasons for this overestimation of ability. First, consequential error is rare. Most of the time when errors occur, little or nothing happens. So, people reason that errors will be caught or won’t be consequential. Second, there is a general lack of appreciation of the limits of human capabilities. For instance, many people have learned to function on insufficient rest or to work in the presence of enormous distractions or wretched environmental conditions (extreme heat, cold, noise, vibration, and so on). These conditions become normalized and accepted by the individual. But, when the limits of human capabilities are exceeded (fatigue or loss of situational awareness, for example), the likelihood of error increases. The common characteristics of human nature addressed below are especially accentuated when work is performed in a complex work environment.

Stress. Stress in itself is not a bad thing. Some stress is normal and healthy. Stress may result in more focused attention, which in some situations can actually benefit performance. The problem with stress is that it can accumulate and overpower a person, becoming detrimental to performance. Stress can be seen as the body’s mental and physical response to a perceived threat in the environment. The important word is perceived: the perception one has of his or her ability to cope with the threat. Stress increases as familiarity with a situation decreases. It can result in panic, inhibiting the ability to effectively sense, perceive, recall, think, or act. Anxiety and fear usually follow when an individual feels unable to respond successfully. Along with anxiety and fear, memory lapses are among the first symptoms to appear. The inability to think critically or to perform physical acts with accuracy soon follows.

Avoidance of Mental Strain. Humans are reluctant to engage in lengthy concentrated thinking, as it requires high levels of attention for extended periods. Thinking is a slow, laborious process that requires great effort. Consequently, people tend to look for familiar patterns and apply well-tried solutions to a problem. There is the temptation to settle for satisfactory solutions rather than continue seeking a better solution. The mental biases, or shortcuts, often used to reduce mental effort and expedite decision-making include:

  • Assumptions – A condition taken for granted or accepted as true without verification of the facts.
  • Habit – An unconscious pattern of behavior acquired through frequent repetition.
  • Confirmation bias – The reluctance to abandon a current solution—to change one’s mind—in light of conflicting information due to the investment of time and effort in the current solution. This bias orients the mind to “see” evidence that supports the original supposition and to ignore or rationalize away conflicting data.
  • Similarity bias – The tendency to recall solutions from situations that appear similar to those that have proved useful from past experience.
  • Frequency bias – A gamble that a frequently used solution will work; giving greater weight to information that occurs more frequently or is more recent.
  • Availability bias – The tendency to settle on solutions or courses of action that readily come to mind and appear satisfactory; more weight is placed on information that is available (even though it could be wrong). This is related to a tendency to assign a cause-effect relationship between two events because they occur almost at the same time.

Limited Working Memory. The mind’s short-term memory is the “workbench” for problem-solving and decision-making. This temporary, attention-demanding storeroom is used to remember new information and is actively involved during learning, storing, and recalling information. Most people can reliably remember only a limited number of items at a time, often expressed as seven plus or minus two (7 ± 2). The limitations of short-term memory are at the root of forgetfulness, and forgetfulness leads to omissions when performing tasks. Applying place-keeping techniques while using complex procedures compensates for this human limitation.

Limited Attention Resources. The limited ability to concentrate on two or more activities challenges the ability to process information needed to solve problems. Studies have shown that the mind can concentrate on, at most, two or three things simultaneously. Attention is a limited commodity—if it is strongly drawn to one particular thing it is necessarily withdrawn from other competing concerns. Humans can only attend to a very small proportion of the available sensory data. Also, preoccupation with some demanding sensory input or distraction by some current thoughts or worries can capture attention. Attention focus (concentration) is hard to sustain for extended periods of time. The ability to concentrate depends very much upon the intrinsic interest of the current object of attention. Self-checking (Stop, Think, Act, Review) is an effective tool for helping individuals maintain attention.

Mind-Set. People tend to focus more on what they want to accomplish (a goal) and less on what needs to be avoided, because human beings are primarily goal-oriented by nature. As such, people tend to “see” only what the mind expects, or wants, to see. The human mind seeks order, and, once a mental model is established, it ignores anything outside that model. Information that does not fit a mind-set may not be noticed; hence, people tend to miss conditions and circumstances that are not expected. Likewise, because they expect certain conditions and circumstances, they tend to see things that are not really present. A focus on the goal tends to conceal hazards, leading to an inaccurate perception of risks. Errors, hazards, and consequences usually result from either incomplete information or assumptions. Pre-job briefings, if done mindfully, help people recognize what needs to be avoided as well as what needs to be accomplished.

Difficulty Seeing One’s Own Error. Individuals, especially when working alone, are particularly susceptible to missing errors. People who are too close to a task, or are preoccupied with other things, may fail to detect abnormalities. People are encouraged to “focus on the task at hand.” However, this is a two-edged sword. Because of our tendency for mind-set and our limited perspective, something may be missed. Peer-checking, as well as concurrent and independent verification techniques, help detect errors that an individual can miss. Engineers and some knowledge workers, by the nature of their focus on producing detailed information, can be especially susceptible to not being appropriately self-critical.

Limited Perspective. Humans cannot see all there is to see. The inability of the human mind to perceive all facts pertinent to a decision challenges problem-solving. This is similar to attempting to see all the objects in a locked room through the door’s keyhole. It is technically known as “bounded rationality.” Only parts of a problem receive one’s attention while other parts remain hidden to the individual. This limitation causes an inaccurate mental picture, or model, of a problem and leads to underestimating the risk. A well-practiced problem-solving methodology is a key element to effective operating team performance during a facility abnormality and also for the management team during meetings to address the problems of operating and maintaining the facility.

Susceptibility To Emotional/Social Factors. Anger and embarrassment adversely influence team and individual performance. Problem-solving ability, especially in a group, may be weakened by these and other emotional obstacles. Pride, embarrassment, and group pressure may inhibit critical evaluation of proposed solutions, possibly resulting in team errors.

Fatigue. People get tired. In general, Americans are working longer hours now than a generation ago and are sleeping less. Physical, emotional, and mental fatigue can lead to error and poor judgment. Fatigue is affected by both on-the-job demands (production pressures, environment, and reduced staffing) and off-duty lifestyle (diet and sleep habits). Fatigue leads to impaired reasoning and decision-making, impaired vigilance and attention, slowed mental functioning and reaction time, loss of situational awareness, and adoption of shortcuts. Acquiring adequate rest is an important factor in reducing individual error rate.

Presenteeism. Some employees, driven by the need to belong, will be present in the workplace despite a diminished capacity to perform their jobs due to illness or injury. The tendency of people to continue working with minor health problems can be exacerbated by a lack of sick leave, a backlog of work, or poor access to medical care, and can lead to employees working with significant impairments. Extreme cases include individuals who fail to seek care for chronic and disabling physical and mental health problems in order to keep working.

Unsafe Attitudes and At-Risk Behaviors

An attitude is a state of mind, or feeling, toward an object or subject. Attitudes are influenced by many factors. They are formulated by one’s experiences, by examples and guidance from others, through acquired beliefs, and the like. Attitudes can develop as a result of educational experiences, and, in such cases, it can be said that attitudes may be chosen. Attitudes can also be acculturated—formulated by one’s experiences and influences from beliefs and behaviors within one’s peer group. For example, the Mohawk Indians, often referred to as “skywalkers,” are renowned for their extraordinary ability to walk high steel beams with balance and grace, seemingly without any fear. It is commonly thought that this absence of fear of heights was inborn among the woodland Indians. It seems more likely that the trait was learned.

Anyone can possess an unsafe attitude. Unsafe attitudes are derived from beliefs and assumptions about workplace hazards. Hazards are threats of harm. Harm includes physical damage to equipment, personal injury, and even simple human error. Unsafe attitudes blind people to the precursors to harm (exposure to danger). Notice that hazards are not confined to the industrial facility; they exist in the office facility as well. The unsafe attitudes that are described below are detrimental to excellent human performance and to the physical facility and are usually driven by one’s perception of risk.

People in general are poor judges of risk and commonly underestimate it.

Examples of Risk Behaviors

  • Before the Park Service made it unlawful to feed the bears at Yellowstone National Park, in the summertime a long line of automobiles would be stopped at the side of the road where the bears foraged for food in garbage cans. Tourists eagerly fed the animals through open windows—a very risky business. Every day of the year in our larger cities, television news cameras chronicle tragedies that have resulted from someone taking undue risks: trying to beat the train at a railroad crossing; playing with a loaded firearm; swimming in dangerous waters; binge drinking following the “big game”; and the like.

Each individual “decides” what to be afraid of and how afraid he or she should be. People often think of risk in terms of probability, or likelihood, without adequately considering the possible consequences or severity of the outcome. For instance, a mountain climber presumes he will not slip or fall because most people don’t slip and fall when they climb. The climber gives little thought to the consequences of a fall should one occur (broken bones, immobility, unconsciousness, no quick emergency response). People take the following factors into consideration, in varying degrees, when assessing the risk of a situation. People are less afraid of risks or situations:

  • that they feel they have “control” over;
  • that provide some benefit(s) they want;
  • that they know about and “live” with;
  • that they choose to take rather than those imposed on them;
  • that are “routine” in contrast to those that are new or novel;
  • that come from people, places, or organizations they trust;
  • when they are unaware of the hazard(s);
  • that are natural versus those that are man-made; and
  • that affect others.

It has been said that risk perception tends to be guided more by our heart than our head. What feels safe may, in fact, be dangerous. The following unsafe attitudes create danger in the work place. Awareness of these unsafe, detrimental attitudes among the workforce is a first step toward applying error-prevention methods.

  • Pride. An excessively high opinion of one’s ability; arrogance. Being self-focused, pride tends to blind us to the value of what others can provide, hindering teamwork. People with foolish pride think their competence is being called into question when they are corrected about not adhering to expectations. The issue is human fallibility, not their competence. This attitude is evident when someone responds, “Don’t tell me what to do!” As commander of the U.N. forces in Korea in 1950, General Douglas MacArthur (contrary to the President’s strategy) sought to broaden the war against North Korea and China. President Truman and the Joint Chiefs were fearful that MacArthur’s strategy, in opposition to the administration’s “limited” war, could bring the Soviet Union into the war and lead to a possible third world war. In April 1951, Truman fired MacArthur for insubordination. At the Senate Foreign Relations and Armed Services committee hearing on MacArthur’s dismissal, the General would admit to no mistakes, no errors of judgment, and belittled the danger of a larger conflict.
  • Heroic. An exaggerated sense of courage and boldness, like that of General George Armstrong Custer. At Little Big Horn, he was so impetuous and eager for another victory that he ignored advice from his scouts and fellow officers and failed to wait for reinforcements that were forthcoming. He rode straight into battle against an overwhelming force—the Sioux and Cheyenne braves—and to his death with over 200 of his men. Heroic reactions are usually impulsive. The thinking is that something has to be done fast or all is lost. This perspective is characterized by an extreme focus on the goal without consideration of the hazards to avoid.
  • Fatalistic. A defeatist belief that all events are predetermined and inevitable and that nothing can be done to avert fate: “que será, será” (what will be will be) or “let the chips fall as they may.” The long drawn-out trench warfare that held millions of men on the battlefields of northern France in World War I caused excessive fatalistic responses among the ranks of soldiers on both sides of the fighting. “Week after week, month after month, year after year, the same failed offensive strategy prevailed. Attacking infantry forces always faced a protected enemy and devastating machine gun fire. Millions of men killed and wounded, yet the Generals persisted. The cycle continued—over the top, early success, then overwhelming losses and retreat.”
  • Invulnerability. A sense of immunity to error, failure, or injury. Most people do not believe they will err in the next few moments: “That can’t happen to me.” Error is always a surprise when it happens. This is an outcome of the human limitation to accurately estimate risk. On the Titanic, the failure to secure enough lifeboats for all passengers and to train the seamen to launch and load them resulted in one of the largest maritime losses of civilian lives in history. A sense of invulnerability was so ingrained in the ship owners’ belief that the vessel was unsinkable that supplying lifeboats for all passengers was deemed foolhardy, since it might somehow leave the impression that the ship could sink. Hence, only half the needed lifeboats were brought on board for the maiden voyage. When disaster struck, seamen struggled to launch the available craft. In the panic and confusion, numerous boats floated away from the mother ship only partially loaded.
  • Pollyanna – All is well. People tend to presume that all is normal and perfect in their immediate surroundings. Humans seek order in their environment, not disorder. They tend to fill in gaps in perception and to see wholes instead of portions. Consequently, people unconsciously believe that everything will go as planned. This is particularly true when people perform routine activities, unconsciously thinking nothing will go wrong. This belief is characterized by quotes such as “What can go wrong?” or “It’s routine.” This attitude promotes an inaccurate perception of risk and can lead individuals to ignore unusual situations or hazards, potentially causing them to react either too late or not at all.
  • Bald Tire. A belief that past performance is justification for not changing (improving) existing practices or conditions: “I’ve got 60,000 miles on this set of tires and haven’t had a flat yet.” A history of success can promote complacency and overconfidence. Evidence of this attitude is characterized with quotes such as, “We haven’t had any problems in the past,” or “We’ve always done it this way.” Managers can be tempted to ignore recommendations for improvement if results have been good. What happened with the Columbia space shuttle is a good example. Over the course of 22 years, on every flight, some foam covering the outer skin of the external fuel tank fell away during launch and struck the shuttle. Foam strikes were normalized to the point where they were simply viewed as a “maintenance” issue—a concern that did not threaten a mission’s success. In 2003, even after it became clear from the launch videos that foam had struck the Orbiter in a manner never before seen, Space Shuttle Program managers were not unduly alarmed. They could not imagine why anyone would want a photo of something that could be fixed after landing. Learned attitudes about foam strikes diminished management’s wariness of their danger.

At-risk behaviors are actions that involve shortcuts, violations of error-prevention expectations, or simple actions intended to improve the efficiency of a task, usually at some expense of safety. At-risk practices involve a move from safety toward danger. These acts have a higher probability, or potential, of a bad outcome. This does not mean such actions are “dangerous” or that they should never be performed. However, workers and management should be aware of which at-risk practices occur, under what circumstances, and on which systems. At-risk behavior usually involves taking the path of least effort and is rarely penalized with an event, a personal injury, or even correction from peers or a supervisor. Instead, it is consistently reinforced with convenience, comfort, time savings, and, in rare cases, with fun.

Examples of At-Risk Behaviors on the Job

  • hurrying through an activity;
  • following procedures cookbook-style (blind or unthinking compliance);
  • removing several danger tags quickly without annotating their removal on the clearance sheet;
  • reading an unrelated document while controlling an unstable system in manual;
  • having one person perform actions at critical steps without peer checking or performing concurrent verification;
  • not following a procedure as required when a task is perceived to be “routine”;
  • attempting to lift too much weight to reduce the number of trips;
  • trying to listen simultaneously to someone on the telephone and to someone else standing nearby (multitasking);
  • signing off several steps of a procedure before performing the actions; or
  • working in an adverse physical environment without adequate protection (such as working on energized equipment near standing water—progress would be slowed by cleaning up the water or getting a rubber floor mat).

Risky behaviors at worksites have contributed to events, some causing injury and death, including:

  • working on a hot electrical panel without wearing proper protective clothing;
  • carrying heavy materials on an unstable surface while not using fall protection;
  • failing to adhere to safety precautions when using a laser;
  • operating a forklift in a reckless manner;
  • opening a hazardous materials storage tank without knowing the contents; and
  • failing to follow procedures for safeguarding sensitive technical information.

Persistent use of at-risk behaviors builds overconfidence and trust in personal skills and ability. This is a slippery slope, since people foolishly presume they will not err. Without correction, at-risk behaviors can become automatic (skill-based), such as rolling through stop signs at residential intersections. Over the long term, people will begin to underestimate the risk of hazards and the possibility of error and will consider danger (or error) as more remote. People will become so used to the practice that, under the right circumstances, an event occurs. Managers and supervisors must provide specific feedback when at-risk behavior is observed.

Workers are more likely to avoid at-risk behavior if they know it is unacceptable. Without correction, uneasiness toward equipment manipulations or intolerance of error traps will wane.

Slips, Lapses, Mistakes, Errors, and Violations

Error. People do not err intentionally. Error is a human action that unintentionally departs from an expected behavior. Error is behavior without malice or forethought; it is not a result. Human error is provoked by a mismatch between human limitations and environmental conditions at the job site, including inappropriate management and leadership practices and organizational weaknesses that set up the conditions for performance.

Slips occur when the physical action fails to achieve the immediate objective. Lapses involve a failure of one’s memory or recall. Slips and lapses that occur during physical manipulation of facility equipment can be classified by the type of behavior involved. The following categories describe the ways an incorrect or erroneous action can physically manifest itself (a simple tagging sketch follows the list):

  • timing – too early, too late, omission;
  • duration – too long, too short;
  • sequence – reversal, repetition, intrusion;
  • object – wrong action on correct object, correct action on wrong object;
  • force – too little or too much force;
  • direction – incorrect direction;
  • speed – too fast or too slow; and
  • distance – too far, too short.
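
Facilities that trend errors can tag observed slips and lapses against a taxonomy like the one above. The following is a minimal, hypothetical sketch; the tagging scheme and the use of Python are purely illustrative and are not part of any DOE toolset:

    from enum import Enum

    # Hypothetical tagging scheme mirroring the slip/lapse categories above.
    class SlipMode(Enum):
        TIMING = "too early, too late, omission"
        DURATION = "too long, too short"
        SEQUENCE = "reversal, repetition, intrusion"
        OBJECT = "wrong action on correct object, correct action on wrong object"
        FORCE = "too little or too much force"
        DIRECTION = "incorrect direction"
        SPEED = "too fast or too slow"
        DISTANCE = "too far, too short"

    # Example: logging a valve stroked in the wrong direction.
    observed = SlipMode.DIRECTION
    print(f"{observed.name}: {observed.value}")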

Mistakes, by contrast, occur when a person uses an inadequate plan to achieve the intended outcome. Mistakes usually involve misinterpretations or lack of knowledge.

Active Errors

Active errors are observable, physical actions that change equipment, system, or facility state, resulting in immediate undesired consequences. The key characteristic that makes the error active is the immediate unfavorable result to facility equipment and/or personnel. Front-line workers commit most of the active errors because they touch equipment. Most errors are trivial in nature, resulting in little or no consequence, and may go unnoticed or are easily recovered from. However, grievous errors may result in loss of life, major personal injury, or severe consequences to the physical facility, such as equipment damage.

Latent Errors

Latent errors result in hidden organization-related weaknesses or equipment flaws that lie dormant. Such errors go unnoticed at the time they occur and have no immediate apparent outcome to the facility or to personnel. Latent conditions include actions, directives, and decisions that either create the preconditions for error or fail to prevent, catch, or mitigate the effects of error on the physical facility. Latent errors typically manifest themselves as degradations in defense mechanisms, such as weaknesses in processes, inefficiencies, and undesirable changes in values and practices. Latent conditions include design defects, manufacturing defects, maintenance failures, clumsy automation, defective tools, training shortcomings, and so on. Managers, supervisors, and technical staff, as well as front-line workers, are capable of creating latent conditions. Inaccuracies become embedded in paper-based directives, such as procedures, policies, drawings, and design basis documentation. Workers unknowingly alter the integrity of physical facility equipment by, for example, installing an incorrect gasket, mispositioning a valve, hanging a danger tag on the wrong component, or attaching an incorrect label.

Usually, there is no immediate feedback that an error has been made. Engineers have performed key calculations incorrectly, and the errors slipped past subsequent reviews, invalidating the design basis for safety-related equipment. Craft personnel have undermined equipment performance by installing a sealing mechanism incorrectly, a flaw not discovered until the equipment is called upon to perform its function. The general characteristics of each kind of error are summarized below.

  • Active errors – observable physical actions; immediate, unwanted consequences; committed mostly by front-line workers, who touch the equipment.
  • Latent errors – hidden weaknesses in organizational processes, documents, or equipment; no immediate apparent outcome; lie dormant until revealed by later demands; created by managers, supervisors, technical staff, and front-line workers alike.

As this summary suggests, latent errors are more subtle and threatening than active errors, making the facility more vulnerable to events triggered by occasional active errors. A study sponsored by the Nuclear Regulatory Commission (NRC) focused on the human contribution to 35 events that occurred over a 6-year period in the nuclear power industry. Of the 270 errors identified in those events, 81 percent (roughly 219) were latent, and 19 percent (roughly 51) were active. The NRC study determined that design and design change errors and maintenance errors were the most significant contributors to latent conditions. These latent conditions or errors contributed most often to facility events and caused the greatest increases in risk.

Violations

Violations are characterized as the intentional (with forethought) circumvention of known rules or policy. A violation involves the deliberate deviation or departure from an expected behavior, policy, or procedure. Most violations are well intentioned, arising from a genuine desire to get a job done according to management’s wishes. Such actions may be either acts of omission (not doing something that should be done) or commission (doing something wrong). Usually adverse consequences are unintended—violations are rarely acts of sabotage. The deliberate decision to violate a rule is a motivational or cultural issue. The willingness to violate known rules is generally a function of the accepted practices and values of the immediate work group and its leadership, the individual’s character, or both. In some cases, individuals achieve the results their managers want while knowingly violating expectations. Workers, supervisors, managers, engineers, and even executives can be guilty of violations.

Violations are usually adopted for convenience, expedience, or comfort. Events become more likely when someone disregards a safety rule or expectation. Two situations that strongly tempt a person to do something other than what is expected are conflicting goals and recovery from a previous mistake. The individual typically underestimates the risk, unconsciously assuming he or she will not err, especially in the next few moments. People are generally overconfident about their ability to maintain control.

Examples: When People Commit Violations

Research has found that the following circumstances, in order of influence, prompt a person to violate expectations.

  • low potential for detection
  • absence of authority in the immediate vicinity
  • peer pressure by team or work group
  • emulation of role models (according to the individual concerned)
  • individual’s perception that he or she possesses the authority to change the standard
  • standard is unimportant to management
  • unawareness of potential consequences; perceived low risk
  • competition with other individuals or work groups
  • interferences or obstacles to achieving the work goal
  • conflicting demands or goals forcing the individual to make a choice
  • precedent: “We’ve always done it this way” (tacitly acceptable to authority)

This discussion of violations is intended to clarify the difference between the willful, intentional decision to deviate that characterizes violations and the unintended deviation from expected behavior that characterizes error. This course focuses on managing human error.

Dependency and Team Errors

For controls to be reliable, they must be independent; that is, the failure of one does not lead to the failure of another. If the strength of one barrier can be unfavorably influenced by another barrier or condition, they are said to be dependent. Dependency increases the likelihood of human error due to the person’s interaction or relationship with other seemingly independent defense mechanisms. For example, in the rail transportation industry, although a train engineer monitors railway signals during transit, automatic warning signals are built into the transportation system as a backup to the engineer. However, the engineer can become less vigilant by relying on an automatic warning signal to alert him/her to danger on the track ahead. What if the automatic signal fails as a result of improper maintenance intervals? Instead of one barrier left (an alert engineer), no barriers are left to detect a dangerous situation. There are three situations that can cause an unhealthy dependency, potentially defeating the integrity of overlapping controls:

  • Equipment Dependencies – Lack of vigilance due to the assumption that hardware controls or physical safety devices will always work.
  • Team Errors – Lack of vigilance created by the social (interpersonal) interaction between two or more people working together.
  • Personal Dependencies – Unsafe attitudes and traps of human nature leading to complacency and overconfidence.

Equipment Dependencies

When individuals believe that equipment is reliable, they may reduce their level of vigilance or even suspend monitoring of the equipment during operation. Automation, such as level and pressure controls, has the potential to produce such a dependency. Boring tasks and highly repetitive monitoring of equipment over long periods can degrade vigilance or even tempt a person to violate inspection requirements, possibly leading to the falsification of logs or related records. Monitoring tasks completed by a computer can also lead to complacency. In some cases, the worker becomes a “common mode failure” for otherwise independent facility systems, making the same error or assumption about all redundant trains of equipment or components.

People’s dependence on equipment can be diminished by:

  • applying forcing functions and interlocks;
  • eliminating repetitive monitoring of equipment through design modifications;
  • alerting personnel to the failure of warning systems;
  • staggering work activities on redundant equipment at different times or assigning different persons to perform the same task;
  • diversifying types of equipment or components, thereby forcing the use of different practices; for example, for turbine-driven and motor-driven pumps;
  • training people on failure modes of automatic systems and how they are detected;
  • informing people on equipment failure rates; and
  • minimizing the complexity of procedures, tools, instrumentation, and controls.

Team Errors

Just because two or more people are performing a task does not ensure that it will be done correctly. Shortcomings in performance can be triggered by the social interaction between group members. In team situations, workers may not be fully attentive to the task or action because of the influence of coworkers. This condition may increase the likelihood of error in some situations. A team error is a breakdown of one or more members of a work group that allows other individual members of the same group to err—due to either a mistaken perception of another’s abilities or a lack of accountability within the individual’s group.

Consider the mathematical impact of such a dependency, using the example of a supervisor (or peer) checking the performance of a maintenance technician. Assuming complete independence between the technician and the supervisor, the overall likelihood for error is one in a million; the overall task reliability is 99.9999 percent. However, should the supervisor (or peer) assume the technician is competent for the task and not closely check the technician’s work, the overall likelihood for error increases to one in a thousand, the same likelihood as that for the technician alone. Overall task reliability is now 99.9 percent. System reliability is only as good as the weakest link, especially when human beings become part of the system during work activities. The perception of another’s capabilities influenced the supervisor’s decision not to check the technician’s performance—a team error.
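
The arithmetic behind these figures can be sketched in a few lines; the one-in-a-thousand error rates are the illustrative values used in the example above, not measured data:

    # Illustrative probabilities from the example above: the technician errs
    # about once per thousand task performances, and an attentive, truly
    # independent checker misses an error about once per thousand checks.
    p_technician = 1.0 / 1000
    p_checker = 1.0 / 1000

    # Independent check: the error survives only if both people fail.
    p_independent = p_technician * p_checker  # 1e-6 -> 99.9999% reliability

    # Dependent check: the checker trusts the technician and does not really
    # look, so the check adds nothing to the technician's own reliability.
    p_dependent = p_technician                # 1e-3 -> 99.9% reliability

    print(f"independent: {p_independent:.0e}, dependent: {p_dependent:.0e}")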

Several socially related factors influence the interpersonal dynamics among individuals on a team. Because individuals are usually not held personally responsible for a group’s performance, some individuals in a group may not actively participate. Some people refrain from becoming involved, believing that they can avoid accountability for their actions, or they “loaf” in group activities. Team errors are stimulated by, but are not limited to, one or more of the following social situations.

  • Halo Effect – Blind trust in the competence of specific individuals because of their experience or education. Consequently, other personnel drop their guard against error by the competent individual, and vigilance to check the respected person’s actions weakens or ceases altogether. This dynamic is prevalent in hospital operating rooms, where members of the operating team often fail to stay vigilant and check the procedures and actions in progress because a renowned surgeon is leading the team and there are several other sets of eyes on the task at hand. Each year, an estimated 44,000 to 98,000 deaths are attributable to medical errors in hospitals alone. Never mind the transfusions of mismatched blood, amputations of the wrong limbs, administration of the wrong anesthesia, or issuance of the wrong prescriptions; it is the medical instruments, sponges, towels, and the like left in patients’ bodies following surgery that are hard for laymen to understand.
  • Pilot/Co-Pilot – Reluctance of a subordinate person (co-pilot) to challenge the opinions, decisions, or actions of a senior person (pilot) because of the person’s position in a group or an organization. Subordinates may express “excessive professional courtesy” when interacting with senior managers, unwittingly accepting something the boss says without critically thinking about it or challenging the person’s actions or conclusions.

Example of Pilot/Co-Pilot Error

A classic example of this dynamic occurred between the pilot and co-pilot of Air Florida Flight 90 at Washington’s National Airport in January 1982. The temperature was about 25 degrees. It had snowed hard for some time while the plane sat on the ground during an airport closure because of weather. The aircraft had not been properly de-iced, and there was snow on the leading edges of the wings as the flight crew prepared for takeoff. During the after-start checklist procedure, the co-pilot called out “engine anti-ice system,” and the captain reported, “engine anti-ice system off,” and then failed to turn it on. The system should have been on. Consequently, ice interfered with the engine pressure ratio (EPR) system, the primary indication of thrust being developed by the engines. The co-pilot called the captain’s attention to the anomalous engine indications at least five times in the last moments before the plane rotated off the runway, but he did not oppose the captain’s decision to continue the takeoff. Given the engine indications, he should have insisted the takeoff be aborted. (All other engine parameters were later found to be well below limit values.) The pilot thought the EPR settings were at the indicated limits when he took off; in reality, the aircraft had only three-fourths of the necessary thrust in both engines. The plane failed to achieve adequate lift. It hit the 14th Street Bridge and plunged nose down into the freezing Potomac River, killing 74 of the 79 people on board.

  • Free Riding – The tendency to “tag along” without actively scrutinizing the intent and actions of the person(s) doing the work or taking the initiative. The other person takes initiative to perform the task, while the free-riding individual takes a passive role in the activity.

Example of Free Riding Error

The water flushing of compound salts inside the transfer piping at the fertilizer and pesticide plant in Bhopal, central India, was a routine task. The flushing operation was normally carried out under the direction of a shift maintenance supervisor. On December 2, 1984, the maintenance supervisor was called to another assignment, and the flush was carried out under the direction of the operations supervisor. A new compound, methyl isocyanate (MIC), was used to produce the pesticide Sevin at the plant. MIC is unstable and highly reactive to water. The procedure to ensure isolation of water from a MIC tank during piping flushes was to close the valve to the tank and then insert a slip blind (blank flange) into the piping to make sure that water did not leak through the valve and enter a MIC storage tank. During the investigation of the accident, an operator testified that he noticed the closed valve had not been sealed with a slip blind, but he said, “It was not my job to do anything about it.”

  • Groupthink – Cohesiveness, loyalty, consensus, and commitment to the team are all worthy attributes of a team. However, at times, these characteristics can work against the quality of team decisions. There can be a reluctance to share contradictory information about a problem for the sake of maintaining the harmony of the work group. This is detrimental to critical problem-solving. This dynamic can be made worse by one or more dominant team members exerting considerable influence on the group’s thinking (pilot/co-pilot or halo effect). Consequently, critical information known within the group may remain hidden from other team members. Groupthink can also result from subordinates passing on only “good news” or “sugar-coating” bad news so as to not displease their bosses or higher level managers. The symptoms of groupthink are as follows:
    • Illusion of invulnerability – Creates excessive optimism and encourages extreme risk taking.
    • Collective rationalization – Discounts warnings that might lead to reconsidering assumptions before recommitting to past decisions.
    • Unquestioned morality – Inclines members to ignore the ethical or moral consequences of decisions because of unquestioned belief in the group’s inherent morality.
    • Stereotyped view – Characterizes the opposition as too evil for genuine negotiation or too weak and stupid to effectively oppose the group’s purposes.
    • Direct pressure – Discourages dissent by pressuring any member who expresses strong arguments against any of the group’s stereotypes, illusions, or commitments, making clear that this type of dissent is contrary to what is expected of loyal members.
    • Self-censorship – Reduces deviations from the apparent group consensus, reflecting each member’s inclination to minimize to himself the importance of his doubts and counterarguments.
    • Illusion of unanimity – Shared by members with respect to the majority view (partly resulting from self-censorship of deviations, augmented by a false assumption that silence means consent).
    • Self-appointed mind-guards – Emerge from the members to protect the membership from adverse information that might shatter their shared complacency about the effectiveness and morality of their decisions.
  • Diffusion of Responsibility – Often causes a “risky shift” in decision-making and problem resolution. It involves the tendency to gamble more with decisions as a group than if each group member were making the decision individually—responsibility is diffused in a group. As the saying goes, “there is safety in numbers.” If two or more people agree together that they know a better way to do something, they will likely take the risk and disregard established procedure or policy. This has been referred to as a “herd mentality.”

Example of Diffusion of Responsibility Error

At a DOE production facility in the late 1980s, a shift manager in the operating contractor organization, along with a small group of shift supervisors, planned and carried out the replacement of a faulty pump over a weekend. This undertaking was performed to support the startup of a system that had been shut down for an unusually long time. Operating within the work control system to get the job done had not been successful. Continued reliance on that system, the supervisory group reasoned, would not get a new pump in place, and the stream would continue to be unusable. Faced with pressures to meet a “startup” schedule, and frustrated with their inability to get work done through routine channels, the men took matters into their own hands and did the work themselves. In so doing, the team violated numerous procedures governing the work control system, in-process quality inspections, the worker certification program, and the union labor rules governing work assignments and responsibilities. No single salaried supervisor would have considered doing a union mechanic’s job on his own. In a group situation, given the urgency, it seemed to make good sense. The outcome for these men included days off without pay and a demotion for the shift manager.

The following strategies tend to reduce the occurrence of team errors.

  • Maintain freedom of thought from other team members.
  • Challenge actions and decisions of others to uncover underlying assumptions.
  • Train people on team errors, their causes, and intervention methods.
  • Participate in formal team-development training.
  • Practice questioning attitude/situational awareness on the job and during training.
  • Designate a devil’s advocate for problem-solving situations.
  • Call “timeouts” to help the team achieve a shared understanding of plant or product status.
  • Perform a thorough and independent task preview before the pre-job briefing.

Personal Dependencies

An unsafe personal dependency exists when an individual relies on his or her personal experience, proficiency, or qualifications to maintain control. Because past practices have not led to a problem, the individual becomes indifferent toward the need for care and attention. Competence does not guarantee positive control. At the beginning of this chapter, “Traps of Human Nature” and “Unsafe Attitudes” were discussed regarding their impact on human fallibility. Such psychological and physiological factors can create unsafe personal dependencies and lead to error. Of particular concern is overconfidence in one’s own ability at a critical step, inhibiting the rigorous use of human performance tools. Overcoming personal dependencies usually involves:

  • training that addresses the limitations of human nature;
  • promoting a culture that supports situational awareness and a questioning attitude;
  • reinforcing and coaching the proper application of human performance tools during in-field observations; and
  • improving the knowledge of risk-important equipment and critical steps.

PERFORMANCE MODES (Essential Reading)

Information Processing, Memory, and Attention

Cognition is the mental process of knowing. It is our mental activity encompassing perception, mental imagery, thinking, remembering, problem solving, decision-making, learning, language, and conscious direction of motor activities. Cognitive psychology is the study of how we process information from our environment; how we attend to, perceive, process, and store information; and how we retrieve and act on information from memory. To better anticipate and prevent error, we need to better understand how people process information. Psychologists have explained memory in terms of three basic components:

  • Sensory Memory – Each sensory system (sight, touch, smell, taste, and hearing) has a corresponding sensory memory, also called a sensory register or store. Each sensory memory briefly stores and transforms the stimuli it receives into a form that can be processed by short-term memory. Not all incoming information is processed. Information that is not “attended to” decays or is “overwritten” by new incoming stimuli.
  • Short-Term Memory – Short-term memory (STM) receives, holds, and processes information from the sensory memory. Processing in STM is necessary before information can be transferred and retained in long-term memory. Short-term or “working” memory has limited storage capacity, as the name implies. Information entering short-term memory “decays” after about 12 to 30 seconds unless it is “rehearsed” or otherwise consciously attended to and encoded for transfer into long-term memory. STM also retrieves information from long-term memory when needed.
  • Long-Term Memory – Long-term memory (LTM) receives information from short-term memory and stores it indefinitely. LTM capacity is considered unlimited for practical purposes. LTM holds all of the learning and memory of our life experience. Information that is stored in long-term memory is retrieved by short-term memory to support recall and recognition.

The shared attention resources depicted in Wickens’ information-processing model enable the mind to attend to information while performing one or more tasks (such as driving a car and talking with a passenger at the same time). How much attention is required to perform satisfactorily defines the mental workload for an individual, as some tasks require more attention than others. Knowledge, skill, and experience with a task decrease the demand for attention.

Humans control their actions through various combinations of two control modes—the conscious and the automatic. The conscious mode is restricted in capacity, slow, sequential, laborious, and error-prone, but potentially very smart. This is the mode we use for “paying attention” to something. It is needed for handling entirely novel problems, ones we have not been trained for and for which no procedures have been written.

The automatic mode of control is the opposite in all respects. It is largely unconscious. The automatic mode is seemingly limitless in its capacity. It is very fast and operates in parallel; that is, it does many things at once rather than one thing after another. It is effortless and essential for handling the recurrences of everyday life—the highly familiar, everyday situations. But it is a highly specialized community of knowledge structures. It knows only what it knows; it is not a general problem-solver, like consciousness.

We do not experience reality exactly as it exists, but as our experience and memories cause us to perceive it. Our sensory systems detect and take in stimuli from the environment in the form of physical energy. Each sensory receptor type is sensitive to only one form of energy. These receptors convert this energy into electrochemical energy that can be processed by the brain. However, our perception involves more than the receipt of sensory information. We must attend to, select, organize, and interpret this information to meaningfully recognize objects and events in our environment. Our interpretation of sensory information requires retrieval from long-term memory. Our prior experience and knowledge, emotional state, and value system (including prejudices) determine our perceptions.

In summary, the information-processing model depicts sensory stimuli entering short-term sensory store, where they are transformed into a form that the perceptual processes within the brain can understand. Processed stimuli are transferred to working memory. Working memory draws upon and interacts with long-term memory to develop our perception of the world and to determine our response to these perceptions. The retrieval and processing of long-term memories by STM enable us to function in the world.

Although the brain is designed for information transfer, sometimes it fails. Error is a function of how the brain processes information related to the performance of an activity. When people err, there is typically a fault with one or more of the following stages of information processing.

  • Sensing – Visual, audible, and other means to perceive information in one’s immediate vicinity (displays, signals, spoken word, or cues from the immediate environment). Recognition of information is critical to error-free performance.
  • Thinking – Mental activities involving decisions on what to do with information. This stage of information-processing involves interaction between one’s working memory and long-term memory (capabilities, knowledge, experiences, opinions, attitudes).
  • Acting – Physical human action (know-how) to change the state of a component using controls, tools, and computers; includes verbal statements to inform or direct others.
  • Attention – Determines what information is transmitted to the mind’s working (short-term) memory. The amount of stimuli that can be taken in by our sensory systems is considered to be unlimited. However, the amount of information that can be held in working memory is limited to about 7 ± 2 items; working memory, therefore, creates a “bottleneck” for incoming information (illustrated in the sketch following this list). In a sense, it is a bottleneck with a purpose—otherwise we would be inundated with irrelevant stimuli.
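
A toy illustration of that working-memory bottleneck, as a minimal sketch: the seven-item capacity comes from the text, while the stimuli and the use of Python are purely illustrative:

    from collections import deque

    # Working memory holds only about 7 +/- 2 items; attending to new
    # stimuli displaces older ones. Model the store as a bounded queue.
    working_memory = deque(maxlen=7)

    stimuli = ["alarm", "gauge A", "gauge B", "radio call", "colleague",
               "procedure step", "valve label", "phone ringing"]
    for stimulus in stimuli:
        working_memory.append(stimulus)  # the 8th item pushes out the 1st

    print(list(working_memory))  # "alarm" has been displaced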

Attention is also influenced by the following:

  • Expectancy – We direct our sensory receptors (eyes, ears, nose, fingers) to where we anticipate locating information within our environment. Surprise occurs when events differ from our expectations.
  • Relevance – We seek information/stimuli relevant to our immediate tasks and our goals.

Our attention constantly shifts as a result of voluntary direction (internal) or automatically as a result of attention attracting stimuli (external) in the environment. Our focus of attention results from whether a stimulus activates top-down (internal) or bottom-up (external) processes.

  • Top-Down – Attention control is conscious direction, using information residing in memory stores. It is also termed concept-driven or effortful attention. Top-down attention is purposefully directed and is influenced by expectancy and relevance, as well as prior knowledge and experience. Examples are a search task, such as when looking for the face of a friend in the crowd, seeking a specific item on a control display, or conducting a parts inspection. Top-down attention is slower than bottom-up attention.
  • Bottom-Up – Attention is captured by external stimuli, usually unexpected events or salience. This is also termed data-driven or automatic attention. Examples are a bright flash of light, a loud sound, loss of balance due to slippery conditions, or impact by an object. Bottom-up attention is very rapid, reaching its maximum 100-200 milliseconds after stimulus perception.

Inattention to detail is an often-cited cause of human performance problems. Avoiding error is not as simple as telling someone to “pay attention.” First, attention is a limited commodity; second, we can attend to only a very small proportion of the available sense data; and third, unrelated matters can capture our attention. There are three attention modes: attention can be focused, divided, or selective. If attention is focused, something has to be ignored; if attention is strongly drawn to one particular thing, it is necessarily withdrawn from other competing concerns. Divided attention involves paying attention to two or more sources of information on a time-share basis, similar to using a flashlight in a dark room to see two different items, moving the flashlight back and forth. Divided attention can be dangerous; for example, a driver’s attention is significantly degraded while using a cell phone. Selective attention means an individual gives preference to distinct information, such as one’s name in a noisy meeting room. It is impossible for humans to pay attention to everything all the time, and this can lead to the occasional error. The likelihood of error increases when someone attempts to do more than one activity in a single stage of information processing (sensing, thinking, acting), such as listening to the radio and a passenger simultaneously while driving an automobile. This is why it is so important to control the environment in which people work by minimizing interruptions, distractions, and other stimuli that can negatively affect a performer’s attention. Trained, experienced operators can consciously attend to a maximum of two or three channels of information (such as flow, temperature, and pressure) and still be effective. Beyond that, error is likely because of the limited attention resources of human nature.

Jens Rasmussen developed a classification of the different types of information processing involved in industrial tasks. This influential classification system is known as the skill-, rule-, and knowledge-based (SRK) approach. Rasmussen’s scheme suggests a useful framework for identifying the types of error likely to occur in different operational situations, or within different aspects of the same task, where different information-processing demands may be placed on the individual. The terms skill-, rule-, and knowledge-based information processing refer to the degree of conscious control exercised by the individual over his or her activities. The tasks individuals perform every day on the job vary from doing a lot and thinking a little to thinking a lot and doing a little. Depending on the situation, as perceived by the individual, he or she will conduct work according to the level of performance that seems adequate to control the situation. The level of performance is a function of the familiarity an individual has with a specific task and the level of attention (information processing) the person applies to the activity.

Example Uses of Performance Levels

The three performance levels can be readily applied to a familiar activity like driving an automobile. For an experienced driver, the control of speed and direction of the vehicle occurs almost entirely at the skill-based level (an automatic mode of control). How the driver relates to other drivers on the road is covered by rules (speed limit, distance from other cars, right of way, etc.) of the kind “if (situation X occurs) do—or don’t do—(action Y).” Here the driver is at the rule-based level of performance. While traveling at a good clip along a main highway, the driver hears on the radio that there is a traffic jam up ahead. To continue will result in long delays, so the driver has to use his or her knowledge of directions, road connections, and access routes to find an alternative route. This problem-solving “mind-work” occurs at the knowledge-based level (conscious mode).

Generic Error Model System (GEMS)

The GEMS model illustrates how humans make use of information processing for a particular task and how they move from one performance level to another as they complete a task. The model draws the distinctions between the three levels of performance. How GEMS is applied can be illustrated by an example.

Example Application of GEMS

A process worker is monitoring a control panel in a batch processing plant. The worker executes a series of routine operations such as opening and closing valves and turning on agitators and heaters. Since the worker is highly practiced, he is carrying out the valve operations in an automatic skill-based manner, only occasionally monitoring the situation at the points indicated by the “OK?” boxes at the skill-based level. If one of these checks indicates that a problem has occurred, perhaps indicated by an alarm, the worker then enters the rule-based level to determine the nature of the problem. This may involve gathering information from various sources such as dials, chart recorders, and VDU screens, which is then used as input to a diagnostic rule of the following form: <IF> symptoms are X <THEN> cause of the problem is Y. Having established a plausible cause of the problem on the basis of the pattern of indications, an action rule may then be invoked of the following form: <IF> the cause of the problem is Y <THEN> do Z. If, as a result of applying the action rule, the problem is solved, the worker will then return to the original skill-based sequence. If the problem is not resolved, then further information may be gathered in order to try to identify a pattern of symptoms corresponding to a known cause. If the cause of the problem cannot be established by applying any available rule, the worker may then have to revert to the knowledge-based level. It may become necessary to utilize chemical or engineering knowledge to handle the situation.
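
The rule-based step in this example behaves like a simple lookup, as the following minimal sketch suggests; the symptom patterns, causes, and responses here are invented for illustration and are not drawn from any real procedure:

    # Hypothetical diagnostic rules: <IF> symptoms are X <THEN> cause is Y.
    DIAGNOSTIC_RULES = {
        frozenset({"high_pressure_alarm", "agitator_stopped"}): "agitator trip",
        frozenset({"low_level", "feed_valve_open"}): "feed line blockage",
    }

    # Hypothetical action rules: <IF> cause is Y <THEN> do Z.
    ACTION_RULES = {
        "agitator trip": "restart agitator per alarm-response procedure",
        "feed line blockage": "isolate feed and flush the line",
    }

    def diagnose(symptoms):
        """Rule-based level: match observed symptoms to a known cause."""
        for pattern, cause in DIAGNOSTIC_RULES.items():
            if pattern <= symptoms:  # every symptom in the pattern is present
                return cause
        return None  # no rule fits; the worker reverts to the knowledge-based level

    observed = {"high_pressure_alarm", "agitator_stopped", "batch_in_progress"}
    cause = diagnose(observed)
    action = ACTION_RULES.get(cause, "escalate: knowledge-based problem-solving")
    print(cause, "->", action)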

As shown in the above example, uncertainty declines as knowledge about a situation improves (learning and practice). Consequently, familiarity (knowledge, skill, and experience) with a task will establish the level of attention or mental functions the individual chooses to perform an activity. As uncertainty increases, people tend to focus their attention to better detect critical information needed for the situation. People want to boost their understanding of a situation in order to respond correctly.

Skill-Based Performance

Skill-based performance involves highly practiced, largely physical actions in very familiar situations in which there is little conscious monitoring. Such actions are usually executed from memory without significant conscious thought or attention. Behavior is governed by preprogrammed instructions developed by either training or experience and is less dependent upon external conditions.

Information that can be processed with little or no allocation of attention resources is called automatic processing. When skills are learned to the point of being automatic, the load on working memory typically is reduced by 90 percent. This occurs after extensive practice of a task so that, literally, it can be performed “without thought.” Many actions in a typical day are controlled unconsciously, such as keyboarding, writing one’s signature, taking a shower, or driving a car. In the skill-based mode, the individual is able to function very effectively by using pre-programmed sequences of behavior that do not require much conscious control. It is only occasionally necessary to check on progress at particular points when operating in this mode.

Examples of Skill-Based Activities

Examples of skill-based activities for well-trained and practiced individuals include:

  • mowing the lawn;
  • using a hammer or other hand tool;
  • controlling various processes manually (such as pressure and level);
  • hanging a tag;
  • analyzing chemical composition of a routine sample;
  • performing repetitive calculations;
  • using measure and test equipment;
  • opening a valve;
  • taking logs; and
  • replacing parts during maintenance.

Error Modes are the prevalent ways, not the only ways, people err for the particular performance mode. Error modes are generalities that aid in anticipating and managing error-likely situations aggravated by inattention, misinterpretation, and inaccurate mental models.

Skill-Based Error Mode – Inattention

The error mode for skill-based performance is inattention. Skill-based errors are primarily execution errors, involving action slips and lapses in attention or concentration. Errors involve inadvertent slips and unintentional omissions triggered by simple human variability or by not recognizing changes in task requirements, system response, or facility conditions related to the task. Some examples of errors committed while in the skill-based performance level follow.

  • When addressing an envelope, he put his old address in the return box instead of his new (correct) address.
  • She forgot to drop off shoes at the shoe shop to be repaired, and instead drove right past the shoe shop and straight to her home.
  • An electrician had been asked to change a light bulb that indicated whether a hydraulic on/off switch was selected. The hydraulic system was being worked on, and the electrician was aware that it would be unsafe to activate the system. Nevertheless, after changing the bulb, and before he had realized what he was doing, he had followed his usual routine and pushed the switch to the “on” position to test whether the light was now working.
  • Intending to shut down lines A and B, the operator also pressed the “shut-off” control buttons for lines C and D.

Under ideal conditions, the chance for error is less than 1 in 10,000, according to a study in the nuclear power industry. People most often possess an accurate understanding of the task and have correct intentions. Roughly 90 percent of a person’s daily activities are performed in the skill-based performance mode. However, only 25 percent of all errors in the nuclear power industry are attributable to skill-based errors. Potentially, a person can be so focused on a skill-based task that important information in the workplace is not detected. Another concern for skill-based tasks is familiarity: the greater the familiarity, the less likely perceived risk will match actual risk. People become comfortable with risk and eventually grow insensitive to hazards. Several tools in the Engineering Human Performance Optimization in the Workplace Part 2 course are designed to help anticipate, prevent, or catch skill-based errors (task preview, job-site review, questioning attitude, stop when unsure, self-checking, pre-job briefing, place-keeping, peer check, and concurrent verification).

Rule-Based Performance

People switch to the rule-based performance level when they notice a need to modify their largely pre-programmed behavior because they have to take account of some change in the situation. The work situation has changed such that the previous activity (skill) no longer applies. This problem is likely to be one that they have encountered before, or have been trained to deal with, or which is covered by the procedures. It is called the rule-based level because people apply memorized or written rules. These rules may have been learned as a result of interaction with the facility, through formal training, or by working with experienced workers. The level of conscious control is between that of the knowledge- and skill-based modes. The rule-based level follows an IF (symptom X), THEN (response Y) logic. In applying these rules, we operate by automatically matching the signs and symptoms of the problem to some stored knowledge structure. So, typically, when the appropriate rule is applied, the worker exhibits pre-packaged units of behavior. He/she may then use conscious thinking to verify whether or not this solution is appropriate.

The goal in rule-based performance is to improve one’s interpretation of the work situation so that the appropriate response is selected and used. This is why procedures are prepared for situations that can be anticipated. Procedures are pre-determined solutions to possible work situations that require specific responses. Rules are necessary for those less familiar, less practiced work activities for which a particular person or group is not highly skilled. Not all activities guided by a procedure are necessarily rule-based performance. In normal work situations, such activities are commonly skill-based for the experienced user.

Examples of Rule-Based Activities

Examples of rule-based activities include:

  • deciding whether to replace a ball bearing inspected during preventive maintenance;
  • responding to a control board alarm;
  • estimating the change in tank level based on a temperature change (thumb rules);
  • feeling equipment for excessive vibration or temperature on operator rounds;
  • performing radiological surveys;
  • using emergency operating procedures; and
  • developing work packages and procedures.

Rule-Based Error Mode

Since rule-based activities require interpretation using an if-then logic, the prevalent error mode is misinterpretation. People may not fully understand or detect the equipment or facility conditions calling for a particular response. Errors involve deviating from an approved procedure, applying the wrong response to a work situation, or applying the correct procedure to the wrong situation. Examples of errors committed while working in the rule-based performance level include the following.

  • A driver was about to pull out into the traffic flow following a stop at the side of the road. He checked the side-view mirror and saw a small green car approaching. He briefly checked his rear-view mirror (which generally gives a more realistic impression of distance) and noted a small green car some distance away. He then pulled out from the shoulder of the road and was nearly hit by a small green car. There were two of them, one behind the other. The driver assumed they were one and the same car. The first car had been positioned so that it was only visible in the side-view mirror.
  • The technician knew that normal pressure in automobile tires is 32-35 psi. So, when he was required for the first time to air up a smaller, temporary-use spare tire, he filled it to the customary 35 psi. In actuality, small-diameter, temporary-use tires are inflated to 55-60 psi.
  • A northbound commuter train in London in 1988 ran into the back of a stationary train after having passed a green signal. Thirty-five people died and 500 were injured. The signal light had given the wrong signal because the old signal wires had come into contact with nearby equipment for the new signal system that caused a wrong-side signal failure. The light should have shown red, for stop. The electrician who had wired the signal on the new system just the day before the accident had never been properly trained. He failed to cut off or tie back, and then insulate, old wires as he wired in the new signal system. He merely bent old wires back out of the way. The untrained technician had learned bad habits on his own that became his “strong but wrong rules.” His application of a bad rule went uncorrected.

The chance for error increases when people make choices or decisions, especially in the field. Rule-based and knowledge-based performance modes involve making choices. With less familiarity with the activity, the chance for error increases to roughly 1 in 1,000. In terms of reliability, this is still very good (99.9 percent). In the nuclear power industry, studies have shown that roughly 60 percent of all errors are rule-based. Engineering Human Performance Optimization in the Workplace Part 2 includes tools to help anticipate, prevent, or catch rule-based errors. They include, for example, task preview, procedure adherence, pre-job briefing, questioning attitude, peer-checking, and concurrent verification, among others.

Knowledge-Based Performance

Warning: the terminology of knowledge-based performance can be confusing! It is tempting to think that much of the engineering design work and scientific research at DOE laboratories falls into the knowledge-based category, simply because such work is performed by highly knowledgeable people. We must, however, avoid the temptation to shrug off the essential nuances and simply argue that since we do research or one-of-a-kind work, and our people are highly educated and skilled, our work is knowledge-based. The truth is quite the opposite. The situation described as the “knowledge-based mode” might better be called the “lack of knowledge” mode.

Knowledge-based work, as defined by Rasmussen, generally means that we don’t really understand what we are doing. Clearly, that is not the case with most organizations’ work. Even in the most cutting-edge science, the ability to develop and conduct controlled experiments depends on control: keeping the uncontrolled variables as few as possible so that we may observe the results of the experiment in order to hypothesize, test theories, and ultimately develop new knowledge. In fact, it might be argued that the accomplished researcher has highly refined abilities to work in the skill and rule modes precisely in order to be able to work in the knowledge mode, since working in the knowledge mode is so difficult.

Not all hazards, dangers, and possible scenarios can be anticipated in order to develop appropriate procedures. Even training is unable to anticipate all possible situations that can be encountered. There are some situations in which no procedure guidance exists and no skill applies. Dr. James Reason concludes that the knowledge-based level of performance is something we come to very reluctantly. Humans only resort to the slow and effortful business of thinking things through on the spot after they have repeatedly failed to find some pre-existing solution.

Hence, knowledge-based behavior is a response to a totally unfamiliar situation (no skill or rule is recognizable to the individual). The person must rely on his or her prior understanding and knowledge, perceptions of present circumstances, similarities between the present situation and circumstances encountered before, and the scientific principles and fundamental theory related to the perceived situation at hand. People enter a knowledge-based situation when they realize they are uncertain about what to do. If uncertainty is high, then the need for information becomes paramount. To effectively gain information about what we are doing or are about to do, our attention must become more focused.

Knowledge-based situations are puzzling and unusual to the individual. Often our understanding of the problem is patchy, inaccurate, or both. In many cases, information sources contain conflicting data, too much data, or not enough data, amplifying the difficulty of problem-solving. Additionally, consciousness is very limited in its capacity to hold information, storing no more than two or three distinct items at a time. Consciousness tends to behave like a leaky sieve, allowing things to be lost as we turn our attention from one aspect of the problem to another. Because uncertainty is high, knowledge-based tasks are usually stressful situations.

Examples of Knowledge-Based Activities

Knowledge-based activities involve problem-solving. Such situations require the use of fundamental knowledge of processes, systems, and so on—“thinking on your feet.” Examples of common problem-solving situations include the following:

  • troubleshooting;
  • performing an engineering evaluation of a new design;
  • reviewing a procedure for ‘intent of change’;
  • resolving conflicting control board indications;
  • holding meetings to address problems;
  • conducting scientific experiments;
  • resolving human performance problems;
  • planning business strategies, goals, and objectives;
  • performing root cause analysis of events;
  • conducting trend analyses;
  • designing equipment modifications;
  • making budget allocation decisions;
  • allocating resources;
  • changing policies and expectations; and
  • performing an engineering calculation.

Knowledge-Based Error Mode

Knowledge-based activities require diagnosis and problem-solving. Evaluating a situation from first principles places considerable demands on the information-processing capabilities of the individual. It is not surprising that humans do not perform well in high-stress, unfamiliar situations where they are required to ‘think on their feet’ in the absence of rules, routines, and procedures. People tend to use only information that is readily available to evaluate the situation. Problem solvers also often become overconfident in the correctness of their knowledge, an “I know I’m right” effect, and they can become enmeshed in one aspect of the problem to the exclusion of all other considerations.82 Decision-making is erroneous if problem-solving is based on inaccurate information, and decisions are often made with limited information and faulty assumptions. Consequently, the prevalent error mode is an inaccurate mental model of the system, process, or facility status. Under such circumstances, the chance for error is particularly high, approximately one in two (50 percent) to one in ten.83 In the nuclear power industry, studies indicate that roughly 15 percent of all errors are knowledge-based. Engineering Human Performance Optimization in the Workplace Part 2 provides several tools to help anticipate, prevent, or catch knowledge-based errors, including technical task pre-job briefings, project planning, problem-solving, decision-making, and peer review.
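
To put these rates in perspective, a simple illustrative calculation helps (the numbers and the assumption of independent decisions are for illustration only, not drawn from the studies cited above). Even at the more favorable one-in-ten rate, a string of ten knowledge-based decisions carries a probability of at least one error of

    1 − (0.9)^10 ≈ 1 − 0.35 = 0.65,

or roughly 65 percent, which is one reason the tools in Part 2 emphasize a deliberate pace and the involvement of other people rather than reliance on individual judgment alone.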

How Performance Modes Can be Used

A better contextual understanding of individuals’ conscious and automatic behaviors, as described in the skill, rule, and knowledge performance modes, and of the kinds of errors individuals tend to make while working in those modes, can be extremely useful. Managers responsible for establishing and maintaining effective controls can make good use of this information. Workers need accurate, complete, and unambiguous procedures and guides for reference when doing rule-based work. They may also need access to a subject matter expert when choosing which rules to select and how to apply them correctly. Workers performing skill-based work need adequate tools to minimize action slips, and they need to be free from interruptions and distractions that break their concentration, divide their attention, and contribute to the lapses in memory that cause error. When working in the skill-based performance mode, workers may benefit from simple job aids and reminders. On the other hand, for individuals working in the knowledge-based mode, where understanding of the problem is often patchy, inaccurate, or both, and where the slow and effortful business of thinking things through is required, collaboration with a small team of thoughtful, committed, and experienced individuals is needed to help in problem-solving and decision-making. Individuals performing work in any of the performance modes can benefit from the error-reduction tools addressed later in this chapter.

When errors and mistakes of consequence occur and some corrective action is needed to minimize recurrence, knowing the performance mode the individual was working in is instructive. All too often, workers involved in skill-based performance who err are scheduled for retraining as the seemingly logical solution. But retraining workers to do work that is already memorized and automatic, performed with little conscious thought because of the nature of the work, is a waste of time and an insult to the worker. It is very hard to train a worker not to repeat something he or she did not intend to do in the first place. Training is not the solution in these instances. Observations of work can be very beneficial; people do not always know why something went wrong.

Observations are used to gather data about worker behaviors, job-site conditions, and organizational support that may have been wanting. Inadequate tools, incomplete work packages, scheduling conflicts, poorly written procedures, excessive noise, extreme heat or cold, poor lighting, and so on may be contributing factors to poor performance. Some one-on-one time with the individual may also be in order. The purpose is to learn the circumstances surrounding the slip, trip, or lapse and what, if anything, can be changed in the work environment or with the individual to prevent a recurrence. The error may have been provoked by fatigue and stress; the worker may have lost sleep worrying about a teenager who left home. It may be that the worker had become complacent and was careless. Distractions and interruptions may have disrupted the worker’s concentration, leading to the error. Those conditions can be controlled.
Errors that occur when working in the rule-based performance mode may be corrected through retraining. Generally, the worker has misinterpreted a requirement or a “rule”: he or she has applied a bad rule to a given application or, conversely, has used a good rule in the wrong application. In these instances, understanding requirements and knowing where and under what circumstances those requirements apply is cognitive in nature and must be learned or acquired in some way. Rule-based errors can be caught or mitigated by individuals exhibiting a questioning attitude, calling a time-out, or stopping work when they are unsure. Peer checks can also be used to stop someone from committing a consequential error.

Corrective action to reduce knowledge-based mistakes is more complicated. An analysis of what went wrong will need to be carried out to formulate a corrective action. It may be that the person’s understanding and knowledge of the system, and of the scientific principles and fundamental theory related to it, were inadequate; training or retraining could help. It may be that technical knowledge was adequate, but that the individuals working on the problem lacked problem-solving skills, fell victim to team errors, or failed to communicate effectively with one another in order to solve the problem. Perhaps the team could not make good decisions in an emergency. Coaching is a proactive way of helping individuals eliminate error when working in any performance mode, but it is particularly well suited to the knowledge-based mode. Peer evaluations are also effective in this instance.

Mental Models

A person handles a complex situation by simplifying the real system into a mental image he or she can remember (such as a simple one-line drawing). A mental model is the structured understanding of knowledge (facts or assumptions) a person has in his or her mind about how something works or operates (for example, facility systems). Mental models are used in all performance modes. In fact, mental models give humans the ability to detect skill-based slips and lapses. They aid in detecting deviations between desired and undesired system states, such as when manually controlling tank water level. A mental model organizes knowledge about the following.

  • what a system contains
  • why it works that way
  • fundamental laws of nature
  • how components work as a system
  • current state of a system

An individual’s mental model may reflect (1) the true state of the system, (2) a perceived state of the system, or (3) the expected state of the system, developed through training, experience, and recent interactions with the system. Note that all mental models are inaccurate to some extent because of the limitations of human nature.

It is important to remember that knowledge-based performance involves problem-solving, and mental models should be considered explicitly when a team works on a problem. Team members should agree on the model they intend to use to diagnose and solve a problem; otherwise, misunderstandings and unexamined assumptions may occur. Frequent time-outs can help teams keep their mental models up to date.

Assumptions

Knowledge-based situations can be stressful and anxiety-producing. Assumptions reduce the strain on the mind, allowing a person to think without excessive effort, and they are at times necessary to help constrain a problem. Consequently, assumptions tend to occur more often when people experience uncertainty, leading to trial-and-error and cause-and-effect problem-solving approaches. Assumptions also occur as an outgrowth of unsafe attitudes and inaccurate mental models. Statements such as “I think …,” “We’ve always …,” or “I believe …” are hints that an assumption has been or is being made; these phrases are known as “danger words.” Inaccurate mental models, in turn, can promote erroneous assumptions that may lead to errors.

Often, assumptions are treated as fact. Challenging assumptions is important in improving mental models, solving problems, and optimizing team performance. Assigning a devil’s advocate in a critical problem-solving situation may be worthwhile to achieve a better solution. Also, challenging assumptions helps detect unsafe attitudes and inaccurate mental models. A devil’s advocate can challenge assumptions using the following process.

  • Identify conclusion(s) being made by another person or yourself.
  • Ask for or identify the data that leads to the conclusion(s). “How did you get that data?” “What is the source of your concern?”
  • Ask for the reasoning (mental model) that connects the data with the conclusion. “Do you mean…?” “Why do you feel that way?”
  • Infer possible beliefs or assumptions.
  • Test the assumption with the other person. “What I hear you saying is…”

Mental Biases – Shortcuts

Humans tend to seek order in ambiguous situations and to look for patterns they recognize. Mental biases, or mental shortcuts, offer the human mind several unconscious methods of creating order and simplicity amid uncertainty, reducing mental effort. Personnel should be aware of the potential for error that mental biases and shortcuts create during problem-solving and decision-making, such as troubleshooting and diagnosis during emergency operations. More will be said about underlying unconscious assumptions and taken-for-granted beliefs in the opening pages of Chapter 5 on organizational culture. In some form or another, all humans use mental biases. Biases were discussed earlier in this chapter with respect to the limitations of human nature and include the following, among others:

  • confirmation bias;
  • similarity bias;
  • frequency bias;
  • availability bias;
  • representative bias; and
  • framing bias.

Conservative Decisions

To be conservative means to be cautious and protective of what is truly important—safety, reliability, quality, security, and so on. It is an attitude that operational and personnel safety must be protected regardless of current schedule and production pressures. In light of the limitations of human nature, it makes sense to be conservative, especially when a decision potentially affects operational or personnel safety. Who knows what information is missing or what data were not considered? A systematic, team-based approach is called for so that safety considerations are not compromised. Several INPO documents related to conservative decision-making repeatedly cite the following factors as important to success.

  • Recognize conditions that could challenge safety and reliability.
  • Place structures, systems, and components in a known safe condition when uncertain.
  • Seek prompt assistance from persons with relevant expertise.
  • Avoid hasty decisions and hurried actions.
  • Assign roles and responsibilities.
  • Explore and evaluate alternatives rigorously, asking challenging questions to confirm technical assumptions.
  • Understand the potential consequences to safety and reliability of various alternatives.
  • Adopt a deliberate and carefully controlled approach.
  • Make a deliberate decision, providing clear direction, roles and responsibilities, contingencies, and abort criteria.
  • Do not proceed in the face of uncertainty.

ERROR-LIKELY SITUATIONS (Essential Reading)

The second principle of human performance states: “error-likely situations are predictable, manageable, and preventable.” An error-likely situation comes into play when task-related factors exceed the capabilities of the individual, creating a mismatch at the point when the individual is “touching” either the physical plant or the paper plant. The mere presence of adverse conditions is not error-likely unless a specific action is to occur within that set of adverse conditions. The elements of error-likely situations are described below.

Error Precursors

Error precursors are unfavorable conditions embedded in the job site that create mismatches between a task and the individual. Error precursors interfere with successful performance and increase the probability for error. Simply stated, they are conditions that provoke error. They can be organized into one or more of the following four categories.

  • Task Demands – Specific mental, physical, and team requirements to perform an activity that may exceed the capabilities, or challenge the limitations, of the individual assigned to the task. Task demands include physical demands, task difficulty, and complexity. Examples include excessive workload, hurrying, concurrent actions, unclear roles and responsibilities, and vague standards.
  • Individual Capabilities – Unique mental, physical, and emotional characteristics of a particular person that fail to match the demands of the specific task. This involves cognitive and physical limitations. Examples are unfamiliarity with the task, unsafe attitudes, level of education, lack of knowledge, unpracticed skills, personality, inexperience, health and fitness, poor communication practices, fatigue, and low self-esteem.
  • Work Environment – General influences of the workplace, organizational, and cultural conditions that affect individual behavior. These include distractions, awkward equipment layout, complex tagout procedures, at-risk norms and values, work group attitudes toward various hazards, work control processes, and temperature, lighting, and noise.
  • Human Nature – Generic traits, dispositions, and limitations that may incline individuals to err under unfavorable conditions such as habit, short-term memory, stress, complacency, inaccurate risk perception, mind-set, and mental shortcuts.

Error precursors are, by definition, prerequisite conditions for error and, therefore, exist before an error occurs. If discovered and removed, job-site conditions can be changed to minimize the chance for error. This is more likely if people possess an intolerance for error precursors or error traps. Examples include reporting an improperly marked valve or a malfunctioning gauge in a safety system, taking a broken ladder out of service, immediately cleaning up an oil spill, stopping work until a change can be made to the procedure, calling in a replacement to relieve a worker who has become ill, seeking technical help when unsure, asking for a peer review on engineering calculations, routinely performing safety self-assessments, and so on.
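
The relationship described above—that precursors alone do not create an error-likely situation until a task coincides with them—can be expressed in a short illustrative sketch. The Python sketch below is a hypothetical, minimal model: the category names mirror the four groupings above, but the function, condition names, and threshold logic are invented for illustration and are not part of any DOE tool.

    # Hypothetical sketch: an error-likely situation exists only when a task is
    # planned or under way concurrent with one or more error precursors.
    PRECURSOR_CATEGORIES = {
        "task_demands": {"time pressure", "high workload", "concurrent actions"},
        "individual_capabilities": {"unfamiliarity with task", "fatigue", "inexperience"},
        "work_environment": {"distractions", "poor lighting", "at-risk norms"},
        "human_nature": {"complacency", "inaccurate risk perception", "habit"},
    }

    def is_error_likely(task_planned_or_in_progress: bool, observed_conditions: set) -> bool:
        # Adverse conditions alone are not error-likely; a task must coincide with them.
        if not task_planned_or_in_progress:
            return False
        known_precursors = set().union(*PRECURSOR_CATEGORIES.values())
        return bool(observed_conditions & known_precursors)

    # Pre-job check for a planned task performed under time pressure with distractions:
    print(is_error_likely(True, {"time pressure", "distractions"}))   # True
    # The same conditions with no task planned do not define an error-likely situation:
    print(is_error_likely(False, {"time pressure", "distractions"}))  # False

In spirit, this is the same check the pre-job briefing performs when workers scan a precursor card against the upcoming task, as described in the next subsection.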

Common Error Precursors

Error precursors are not mysterious or obscure. To the contrary, they are noticeable, even obvious, if people look for them. The error precursors listed below (in order of impact) were compiled from a study of INPO’s event database and from human performance, ergonomics, and human factors sources. These are the more common conditions associated with events triggered by human error. Some organizations distribute a plastic-coated error-precursor card for their front-line workers to carry with them on the job. Workers refer to these cards during pre-job briefings to help identify precursors related to the upcoming task. A more extensive list of error precursors and their descriptions is provided in Attachments A and B of this chapter.

Remember, by themselves, error precursors do not define an error-likely situation. A human act or task must be either planned or occurring concurrent with error precursors to be considered error-likely. Recall that an error is an unintended action.

Many different factors can affect performance. Considering the number and variety of factors involved in a specific job, many things can change, even with simple, repetitive tasks. Consequently, no work should be considered routine. When people believe a job is routine, they subconsciously think that “nothing can go wrong,” and they expect only success. This mind-set leads to complacency and overconfidence. Then, when something does go wrong, people tend to rationalize the situation away, inhibiting a proper response in time to avert the consequences.95 Most events originate during routine activities. A sub-principle of human performance is that there are no routine tasks.

ERROR-PREVENTION TOOLS

There are two ways to prevent human error from disturbing the facility or harming other important assets: either keep people from making errors (error prevention) or prevent the errors from harming the facility (controls). The design of systems, structures, and components aids the latter through engineered controls such as physical barriers, interlocks, keyed parts, shaped and color-coded controls, automation, and alarms. However, the prevention and detection of errors also depend on people, either the performer or others. For example, self-checking and procedures provide individuals with the means of avoiding or detecting mistakes, while peer-checking and three-way communication engage another person. Human performance tools are designed to help people anticipate, prevent, and catch active errors.
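
As a rough illustration of how the two defenses complement each other, consider the minimal sketch below. It is hypothetical: the function names, the matching logic, and the interlock rule are invented for illustration, not drawn from any actual plant system or DOE guidance.

    # Hypothetical sketch of layered defenses: a human performance tool
    # (self-checking) may catch an error before the act, and an engineered
    # control (an interlock) may keep an erroneous act from harming the plant.
    def self_check(intended_component: str, selected_component: str) -> bool:
        # The performer pauses to verify the component in hand is the one intended.
        return intended_component == selected_component

    def interlock_permits(component: str, plant_mode: str) -> bool:
        # Engineered control: blocks operation of safety-related components at power.
        return not (component.startswith("safety-") and plant_mode == "operating")

    def perform(intended: str, selected: str, plant_mode: str) -> str:
        if not self_check(intended, selected):
            return "error caught by self-checking (error prevention)"
        if not interlock_permits(selected, plant_mode):
            return "act blocked by interlock (engineered control)"
        return "action carried out"

    # The operator intends to open a drain valve but selects a safety-injection valve:
    print(perform("drain-valve-12", "safety-injection-valve-3", "operating"))
    # If self-checking is skipped or fooled, the interlock still blocks the act:
    print(perform("safety-injection-valve-3", "safety-injection-valve-3", "operating"))

Neither layer is perfect on its own; together they reduce the chance that an active error becomes an event.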

Methods of controlling latent errors are designed more to catch them than to prevent them because, by definition, people are usually unaware when latent errors occur.

Engineering Human Performance Optimization in the Workplace Part 2 provides an explanation of numerous tools that individuals and work teams can employ to reduce errors. The fundamental purpose of human performance tools is to help the worker maintain positive control of a work situation; that is, what is intended to happen is what happens, and that is all that happens. Every person wants to do good work, to be 100 percent accurate, 100 percent complete, and to meet 100 percent of the requirements. However, error is a normal characteristic of being human. Regardless of one’s intention to do a job well, errors still occur because of the inherent fallibility and variability of all human beings, and on occasion people still err no matter how rigorously they use human performance tools. For this reason, we take the dual approach of reducing error as well as managing controls (Re + Mc = ØE; that is, reducing error, Re, combined with managing controls, Mc, drives toward zero events, ØE).

System Changes

Although this course focuses on what people can do to reduce human error, it is recognized that there is another whole dimension of error reduction: improvements or changes in the engineered systems so that machines and working conditions better support human needs. The location of instruments and controls on operating control panels, the accessibility and positioning of monitoring equipment, the lighting in passageways, the sound of warning alarms, the heights of working surfaces, the distance from communication sources, the number of work-arounds present, and numerous other conditions can either enhance or hinder human performance. Human error is more likely when tools and equipment, procedures, work processes, or technical support are inadequate. Human factors professionals study and report on adverse engineered and management systems within an organization and recommend modifications or improvements to eliminate these and other conditions. Implementing such recommendations improves worker performance and reduces human error.

Reporting errors and error precursors is an essential behavior for acquiring feedback from the field about flawed engineered or management systems. Managers and supervisors should encourage workers to report adverse system-related conditions that promote error (error precursors) whenever they are encountered. With input from worker reporting, management can direct needed engineering and system changes. Reporting should be carried out in accordance with the organization’s reporting policies, procedures, and practices. More will be said about how to encourage a reporting culture later in this course.

ATTACHMENT A – ERROR PRECURSORS

The conditions listed below were derived from an in-depth study of INPO’s event database and several highly regarded technical references on the topic of error. Many references refer to error precursors as behavior-shaping factors or performance-shaping factors. The bolded error precursors are more prevalent and are listed in order of impact. Other error precursors are not listed in any particular order.

ATTACHMENT B – COMMON ERROR-PRECURSOR DESCRIPTIONS

The first eight error precursors from the table on the previous pages are described below. These tend to be the more commonly encountered conditions that provoke errors. The error precursors for each category are arranged in order of influence.
