Problem-solving, and particularly the part that focuses on root cause analysis (RCA), has always been a topic of special interest to me. Two questions in particular have long slumbered in my head: (1) whether we can speak of one root cause or should speak of multiple (root) causes; and (2) whether we should speak of the root cause or rather the root condition. In a series of six posts, I will try to explain how rigorous problem-solving logic (using an example) can help us answer these questions. At the same time, I hope the example and the logic will be of use in your problem-solving efforts or your coaching thereof.
In the first post, I set the stage by defining the starting points and some basic initial concepts (like problem, cause, agent, target, event and Tripod Beta’s causal diagramming technique). I also introduced an example that I will use throughout the series. In the second post, I added some more concepts (necessary condition, defensive and control barriers). The third post traced back the causal event chain while introducing and applying concepts like causing events, the initial causing event, and the initial active cause.
This fourth post will take us to the systemic level of our problem. I will further explore the concept of barriers and make the link to standards in Lean thinking. At this point in the logic, I will also introduce the problem of occurrence and the problem of non-detection, and show how these can help us find the root of the problem.
From Initial Active Cause to Barriers
When we arrive at the initial causing event and have determined the initial active cause, it is time to turn our attention to the barriers related to the initial causing event. As a reminder, a barrier prevents causing events from happening and may stop the whole chain of events.
In our initial causing event, the pipes could have been prevented from rolling by securing the stack of pipes with stakes in the stake pockets, and/or with straps (or chains) for each individual pipe or each level of pipes, and/or with wedges under the pipes. These are all examples of defensive barriers.
Similarly, the event could have been prevented by controls surrounding the loading activity. For instance, if personnel had checked the required vehicle in the loading standard (assuming this was available), they could have stopped the loading, as they would have found out that the vehicle did not have any means to secure the pipes. Of course, we assume here that they would have respected such a standard procedure.
Available, Adhered to and Adequate Barriers
The example shows that, to give rise to the initial active cause or problem (i.e., the rolling pipes), the barriers that could have prevented the initial problem must have been either missing (not available), inadequate (i.e., not fit-for-purpose) or not respected.
As discussed before, in Lean thinking, our actions are guided by our standards: our system of work. When we are confronted with a gap-type problem and have arrived at the initial causing event, we should, therefore, consider the standards related to this event. The standard can be seen as a barrier (a control or defense) that prevents a problem. In our investigation, we should check whether the standards for the initial causing event were available, adhered to and adequate.
In RCA these are sometimes referred to as the three A’s (which I described in my 2013 blog post here: http://dumontis.com/2013/10/3-as-structured-problem-solving/). Similarly, you can also come across the acronym MIN (meaning missing, incomplete or not followed, proposed by Ivan Fantin in his 2014 book Applied Problem-Solving). Be careful though, as MIN is not exactly the same as the three A’s. Inadequate means that a standard for a specific element may well be available, but wrong (ineffective). Incomplete (partially missing) is different from inadequate.
When no standard was available, it was only a matter of time before such a problem happened. We then first need to define and agree upon a standard. When a standard was available and adhered to, but the initial problem still happened, the standard was inadequate and should be improved. When a standard was available, but not respected during execution, we should first go back to our standard, as otherwise we don’t know whether it is adequate. And we should, of course, find out why the standard was not followed.
In our initial causing event, there was a standard prescribing the use of defensive barriers, specifically the use of a vehicle (a flatbed) with stake pockets and stakes. In reality, however, another vehicle was used, one without stake pockets and stakes. This is an example of a standard not being adhered to. (Had a flatbed with stakes been used, and had the stakes failed to prevent the pipes from rolling, it would have been an example of an inadequate standard.)
At the same time, there was no specific prescribed pre-loading check with a related checklist. This is an example of a missing (or incomplete) standard.
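For readers who like to see such logic written out, the three A’s check can be sketched in a few lines of Python. This is only my illustration of the reasoning above, not part of Tripod Beta or any existing tool; the names Barrier, Finding and assess are made up for this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class Finding(Enum):
    # Hypothetical labels for the outcome of the three A's check.
    OK = "available, adhered to and adequate"
    UNAVAILABLE = "no standard: define and agree upon one"
    NOT_ADHERED_TO = "not followed: find out why, adequacy still unknown"
    INADEQUATE = "followed, but the problem still happened: improve it"


@dataclass
class Barrier:
    name: str
    available: bool
    adhered_to: bool = False  # only meaningful when the standard exists


def assess(barrier: Barrier, problem_occurred: bool) -> Finding:
    """Apply the three A's in order: available, adhered to, adequate."""
    if not barrier.available:
        return Finding.UNAVAILABLE
    if not barrier.adhered_to:
        # Adequacy cannot be judged while the standard is not followed.
        return Finding.NOT_ADHERED_TO
    if problem_occurred:
        return Finding.INADEQUATE
    return Finding.OK


# The two barriers from the pipe example.
flatbed = Barrier("flatbed with stake pockets and stakes",
                  available=True, adhered_to=False)
pre_loading_check = Barrier("pre-loading check with checklist",
                            available=False)

flatbed_finding = assess(flatbed, problem_occurred=True)
check_finding = assess(pre_loading_check, problem_occurred=True)
print(flatbed.name, "->", flatbed_finding.name)
print(pre_loading_check.name, "->", check_finding.name)
```

Note that the order of the checks matters: adherence is tested before adequacy, because a standard that was not followed tells us nothing yet about whether it would have worked.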
Again, extending Tripod Beta’s causal diagramming technique a bit, we will use the following symbols for barriers (i.e., standards) that respectively were (1) available and adequate, (2) available but inadequate, (3) not adhered to and (4) (partially) unavailable.
Figure: depicting problems related to standards (barriers).
Systemic Problems
In Tripod Beta, all three of the mentioned cases (unavailable, not adhered to or inadequate) are called “immediate causes” (or “active failures”). As may be clear, I’m not very satisfied with this wording, as we already introduced the concept of an “initial active cause” before.
As a problem was defined as an unwanted or undesirable situation, all three situations described by the three A’s can also be considered problems. And we could, of course, ask ourselves the same questions now. What are the respective causing events of the unavailable standard, the inadequate standard, or the standard not being adhered to? The unavailable, missing standard was never created when it should have been. The inadequate standard, although it was created, was not fit-for-purpose.
The problems of the (partially) unavailable or inadequate standard are at another level than the specific and actual problem that triggered the investigation. They are all related to standards, and standards materialize the system of work that we are trying to improve. So, through the actual problem, we have now arrived at what I will refer to as systemic problems. Systemic problems are related to (partially) unavailable or inadequate standards. And they are conditions or states, not events.
The case of non-adherence to a standard, however, is different. This is not a condition or a state, but an event that happens at the specific level and is not systemic. This problem requires more investigation at the specific level.
The observation that a vehicle other than the standard vehicle was used is an example of a standard not being adhered to. It is at the specific level and requires further investigation into the causes of this non-adherence.
The absence of a prescribed pre-loading check and checklist can be considered a systemic problem. In case of an inadequate standard, we would also have been dealing with a systemic problem.
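The distinction between the two levels can be written down as a small mapping. Again, this is just a sketch of the logic in this section; the names Finding, Level and problem_level are my own and purely illustrative.

```python
from enum import Enum


class Finding(Enum):
    UNAVAILABLE = "(partially) unavailable standard"
    INADEQUATE = "inadequate standard"
    NOT_ADHERED_TO = "standard not adhered to"


class Level(Enum):
    SYSTEMIC = "systemic: a condition or state of the system of work"
    SPECIFIC = "specific: an event that needs further investigation"


def problem_level(finding: Finding) -> Level:
    """Non-adherence is an event at the specific level; the rest is systemic."""
    if finding is Finding.NOT_ADHERED_TO:
        return Level.SPECIFIC
    return Level.SYSTEMIC


# The two findings from the pipe example.
wrong_vehicle = problem_level(Finding.NOT_ADHERED_TO)
missing_check = problem_level(Finding.UNAVAILABLE)
print("wrong vehicle used:", wrong_vehicle.name)
print("no pre-loading check:", missing_check.name)
```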
The Problems of Occurrence and Non-Detection
In our example, as in all other problems, we should distinguish between two types of barriers. One is the barrier that (when adhered to) removes the actual possibility of the occurrence. I will refer to this as the standard for non-occurrence. The other is the barrier that prevents the occurrence not because it cannot happen, but because the risk of the occurrence is detected before it actually happens. I will refer to this as the standard for detection.
The fact that the wrong vehicle, one without stake pockets and stakes, was used is an example of a problem of occurrence. The absence of a proper prescribed pre-loading check can be considered a problem of non-detection.
Because checks can be considered a necessary evil, or even waste, and because our ambition is to do things right the first time, the problem of occurrence is the first problem we should try to solve. Visual control should then help us detect a deviation from the (available and adequate) standard. In a good problem-solving effort, both the problem of occurrence and the problem of non-detection should be eradicated.
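To make the priority explicit, the two problems from the example can be tagged with the type of barrier that failed and then ordered, occurrence first. This is a minimal sketch under my own assumptions; BarrierType, BarrierProblem and work_order are hypothetical names, and the numeric values only encode the order argued for above.

```python
from dataclasses import dataclass
from enum import Enum


class BarrierType(Enum):
    # Lower value = address first (assumption taken from the text above).
    NON_OCCURRENCE = 1  # standard that removes the possibility of occurrence
    DETECTION = 2       # standard that detects the risk before it happens


@dataclass
class BarrierProblem:
    description: str
    barrier_type: BarrierType


def work_order(problems):
    """Return the problems with problems of occurrence first.

    sorted() is stable, so problems of the same type keep their order.
    Both types remain in the list: both should eventually be eradicated.
    """
    return sorted(problems, key=lambda p: p.barrier_type.value)


problems = [
    BarrierProblem("no prescribed pre-loading check", BarrierType.DETECTION),
    BarrierProblem("vehicle without stake pockets and stakes used",
                   BarrierType.NON_OCCURRENCE),
]

ordered = work_order(problems)
for problem in ordered:
    print(problem.barrier_type.name, "-", problem.description)
```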
Next Post: Continuing the Logic
This fourth post took us to the systemic level of our problem. We further explored the concept of barriers and made the link to standards in Lean thinking. I also introduced the problems of occurrence and non-detection and showed how these point us in the direction in which to find the root of the problem.
In the next post, I will again dive into the causal event chain, but now at the systemic level. And I will discuss the problem of people not adhering to the standard.