Problem-solving, and particularly the part that focuses on root cause analysis (RCA), has always been of special interest to me. Specifically, two questions have always lingered in my mind, namely (1) whether you can speak of one root cause or should speak of multiple (root) causes; and (2) whether you should speak of the root cause or rather the root condition. In a series of six posts, I will try to explain how rigorous problem-solving logic (using an example) can help us answer these questions. At the same time, I hope the example and the logic will be of use in your problem-solving efforts or your coaching thereof. This fifth post will again dive into the causal event chain, but now at the systemic level. And I will discuss the problem of people not adhering to the standard.
In the first post, I set the stage by defining the starting points and some basic initial concepts (like problem, cause, agent, target, event and Tripod Beta’s causal diagramming technique). I also introduced an example that I will use throughout the series. In the second post, I added some more concepts (necessary condition, defensive and control barriers). The third post traced back the causal event chain while introducing and applying concepts like causing events, the initial causing event, and the initial active cause. In the fourth post, we descended to the systemic level of our problem. We further explored the concept of barriers and made the link to standards in Lean thinking. And I introduced the problems of occurrence and non-detection and how this shows us the direction in which to find the root of the problem.
In this post, I will again dive into the causal event chain, but now at the systemic level. And I will discuss the problem of people not adhering to the standard.
Further Tracing Back the Causal Event Chain
So, we have seen that you trace back the event chain until you do not find any further initial active cause upstream. You then check the barriers (controls and defenses) at that level of the chain. When you only find (partially) unavailable or inadequate standards, you move to the systemic level. When you observe any non-adherence problems, you first stay at the specific level and continue your investigation.
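The routing rule in the previous paragraph can be sketched in a few lines of Python. This is only an illustration: the `BarrierFinding` labels and the `next_level` function are hypothetical names I made up for this sketch, not Tripod Beta terminology.

```python
from enum import Enum


class BarrierFinding(Enum):
    """How a barrier (control or defense) failed; hypothetical labels."""
    UNAVAILABLE = "unavailable standard"
    INADEQUATE = "inadequate standard"
    NOT_ADHERED = "standard not adhered to"


def next_level(findings):
    """Return the level at which the investigation continues.

    Any non-adherence keeps us at the specific level first; only when all
    findings are unavailable or inadequate standards do we move to the
    systemic level.
    """
    if not findings:
        raise ValueError("no barrier findings to evaluate")
    if BarrierFinding.NOT_ADHERED in findings:
        return "specific"
    return "systemic"
```

For example, a non-adhered standard combined with an unavailable detection standard would keep the investigation at the specific level first, while a set of only unavailable or inadequate standards would send it to the systemic level.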
The problem with which we restart our investigation is the problematic situation where the wrong flatbed was in use during the loading of the pipes. The causing event that led to this situation was the taking of the vehicle. The active agent in this event was the team leader of the loading team; the passive object was the vehicle.
The cause of the wrong vehicle being taken was the team leader, who took it as part of his normal preparation. The investigation found that he was working off a schedule that listed the wrong vehicle for this job.
Looking further upstream, it was found that the scheduler selected the wrong vehicle when he was scheduling the loading activity for the day. This led to the wrong vehicle being in the schedule. It was concluded that this was the initial causing event in this causal event chain as there was no problem with either scheduler or schedule before the selection event.
In terms of barriers (focusing on the initial causing event for now), it quickly became clear that there was a correct standard for the vehicle selection, but that it was not respected this time (the standard was not adhered to).
There was also a problem of detection. There was no input check in the scheduling system (a system poka-yoke or error-proofing mechanism) that could have detected the wrong vehicle for the job. This is a case of an unavailable standard (a systemic problem).
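To make the idea of such an input check concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the job types, the vehicle types, the `ALLOWED_VEHICLES` table and the `schedule_job` function are invented, since the actual scheduling system and its data are not described in this case.

```python
# Hypothetical mapping from job type to the vehicle types the standard allows.
ALLOWED_VEHICLES = {
    "pipe_loading": {"flatbed_with_side_stakes"},
    "pallet_loading": {"flatbed_standard", "flatbed_with_side_stakes"},
}


class WrongVehicleError(ValueError):
    """Raised when a vehicle does not match the standard for the job."""


def schedule_job(schedule, job_type, vehicle_type):
    """Add a job to the schedule only if the vehicle passes the input check."""
    allowed = ALLOWED_VEHICLES.get(job_type)
    if allowed is None:
        raise WrongVehicleError(f"no vehicle standard defined for {job_type!r}")
    if vehicle_type not in allowed:
        raise WrongVehicleError(
            f"{vehicle_type!r} is not allowed for {job_type!r}"
        )
    # Only a vehicle that matches the standard ever reaches the schedule.
    schedule.append((job_type, vehicle_type))
    return schedule
```

The design choice here is that the check refuses the entry instead of merely warning about it, so a wrong vehicle cannot end up in the schedule unnoticed.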
This gives us the following causal diagram for the wrong vehicle being in use, which ultimately (triggered by the initial active cause of the rolling pipes) led to the head injury:
Figure: causal diagram for the wrong vehicle being in use during loading.
In this causal diagram, we would call the schedule with the wrong vehicle the initial active cause. However, it would be confusing to have two initial active causes for the same problem at a different level. Furthermore, upstream of the first initial active cause (the rolling pipes), there was no cause, only a necessary condition (the wrong vehicle). Also from this viewpoint, it would be strange to have an initial active cause further upstream. I, therefore, propose the name necessary condition cause for an initial active cause leading to a necessary condition in a later part of the causal chain.
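The chain traced in this section can also be written down as plain data, which may help when you document an investigation. The sketch below is in Python; the `Event` class, its field names and the `trace_back` function are my own hypothetical choices, and treating the schedule as the passive object of the selection event is my reading of the case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """One causing event: an agent acts on an object, producing an outcome."""
    agent: str
    obj: str
    outcome: str
    caused_by: Optional["Event"] = None  # upstream causing event, if any


def trace_back(event):
    """Walk upstream and return the chain, initial causing event first."""
    chain = []
    while event is not None:
        chain.append(event)
        event = event.caused_by
    return list(reversed(chain))


selection = Event("scheduler", "schedule", "wrong vehicle listed in schedule")
taking = Event("team leader", "vehicle", "wrong flatbed in use during loading",
               caused_by=selection)

chain = trace_back(taking)
initial_causing_event = chain[0]  # the event with nothing further upstream
```

Here `initial_causing_event` is the scheduler's selection, because no further causing event is recorded upstream of it.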
Non-Adherence to Standards
Before we move on to the next step in our logic, let’s first explore the non-adherence problem. Non-adherence has been characterized as an event at the specific level, and we would like to understand whether there are any underlying systemic problems that need our attention.
Why did the scheduler not follow the standard?
(A) Was it because of an issue related to the standard? For instance, is it too difficult to select a vehicle when creating the schedule? Does it take too much time? Is the system user-friendly and clear? Is the standard displayed or readily accessible?
(B) Or was it related to the conception and introduction of the standard? For instance, was the scheduler involved in defining the standard? Was the scheduler trained and coached in the vehicle scheduling standard? Were the consequences of scheduling a wrong vehicle explained, and was the scheduler aware of the risks?
(C) Or could it be an issue related to the way standards are managed and how this could impact the scheduler? For instance, is the adherence to standards checked? Are there any management routines to verify the respect for standards? And when checked, is there any feedback?
(D) Or was it because of more environmental, situational or even personal conditions? For instance, did it happen at the end of the day, or under time pressure? Was the scheduler distracted by something else? Or did he confidently assume he knew the standard? Did any personal circumstances lead to a lack of concentration? Or did the person even deliberately select the wrong vehicle?
After the team discussed the event with the scheduler involved, it was found that although he was trained in the standard, the standard was not visibly displayed or actively used when scheduling. After a while, most schedulers work based upon experience.
One could argue that all but a few of these are still issues related to the company’s system of work (i.e., the standard for vehicle selection). You could even state that they are also a sign of an inadequate standard (not so much in the sense of incorrect, but in the sense of impractical or without any guarantee of preventing non-adherence). Tripod Beta refers to problems of this level as a pre-condition. A pre-condition is a state that influences the agent not to perform a task, or to perform it poorly. What other environmental, situational or psychological pre-conditions may have existed at the time of the event that made the scheduler make this decision? An inadequate standard can very well be interpreted as such a pre-condition.
Now, just as we asked how it could happen that a wrong vehicle was in place during loading, we can ask how it could happen that an inadequate standard was in place during the vehicle selection. If you continue that line of thought, you arrive at yet another level of abstraction: the system with which we plan, deploy, check and follow up on the use of standards. Here we might identify problems related to how we define and document standards, whether and how we test them, how we involve personnel in developing them, how we train and coach our people in them, how we check compliance with them (e.g., through gemba walks or management routines, or through self-, team- and hierarchical audits or process confirmations), and whether and how we continually improve them (including properly managing changes and ensuring everyone is updated on these changes).
I would not refer to this as our system of work, though. This level is not unimportant, but the potential counter-measures related to it become too general and less related to the specific problem with which we started our investigation. We do, however, need to draw our lessons from these cases and act accordingly (re-think the way we arrive at and manage our system of work).
Still, we could and should act upon the possible findings that are related to the specific standard that played a role in the non-adherence event that you identified. For instance, you could try to improve the clarity, visibility and practicality of the standard, re-train and coach people and raise awareness, start or intensify the process confirmation checks, etc., so as to minimize the risk of non-adherence. There can, however, never be a guarantee unless we thoroughly error-proof the standard through a control poka-yoke (and still…).
Next and Final Post: Continuing the Logic
In this fifth post, we again went into the causal event chain, but now at the systemic level. And we discussed the problem of people not adhering to the standard.
In the sixth and final blog post, I will summarize the analysis, say something about prioritizing counter-measures, draw conclusions on the two questions, and apply these to the case we are analyzing. I will end the series with some words on how this can help you in your problem-solving efforts and your coaching of problem-solving teams.