Pinpointing the Root of the Problem
"My system did what it was told to do. Go talk to the other vendors" is not the answer you expect to hear from your suppliers when something goes horribly wrong. Finger-pointing each other is not going to get you any closer to resolving the problem. But if you already accepted delivery as-is, you are in fact headed to this deadlock.
A few years ago I was sent to an offshore facility in Eastern Asia to investigate an incident that had occurred a couple of days prior. The culprit system had been recently upgraded, and even though I was not an expert in this system at the time, my knowledge and expertise in integrated industrial control systems were exactly what the client needed.
Upon arrival at the facility, I established contact with the Chief Electronic Technician (Chief ET) onboard, with whom I would be teaming up as the investigation ran its course. As we interviewed personnel present during the failure, it appeared that as the torque applied by the rotating machine reached its limit, the column stopped rotating. However, instead of being held in its current position, the column began to slowly spin in reverse. Without changing any of the parameters, the operator pressed the torque release button, which was supposed to slowly drop torque. Shortly after the button was pressed, however, the column began to spin uncontrollably in reverse at about 270 RPM before coming to a stop.
Luckily, the column did not come apart, and no injuries were reported. However, interviews with personnel revealed this was not the first time the issue had occurred. In fact, the issue had occurred shortly after the upgrade was commissioned, and was thought to have been addressed then. The most recent event (and the most alarming one) was the fifth recorded occurrence.
When investigating the equipment configuration, the first thing we noted was that the configuration was not standard. Instead of upgrading the torque machine and the variable frequency drive (VFD) as a package from a single vendor, the client had procured the items from separate vendors independently to reduce cost. Moreover, the client had appointed neither an internal nor an independent resource to oversee the installation and integration of the two components, and instead relied on the vendors to "figure it out" once onboard.
This information directed us to dive deeper into the configuration and parameter settings of the devices. When we extracted the trend curves for each of the measurements at the time of the incident, we could clearly see that the torque provided by the VFD reached its peak, but the VFD did not adjust its output to prevent the column from spinning in reverse (presumed to be caused by stored torque in the flexible column). Once the slow reverse spin began, the VFD detected that it was unable to rotate in the direction it was commanded to turn, and so it released the column to prevent an overload (based on the parameter settings), causing the uncontrolled spin in reverse.
It was evident from the findings that the incident was caused by inadequate configuration and parameter settings during integration of the systems onboard. However, this was not the root cause. When vendor representatives were called to a meeting to discuss the findings, the root cause emerged.
The two vendors did not normally work together, because their respective products were usually packaged as integrated systems with other vendors. The issue is not that the packages are different, but rather that the functional mechanization of the integrated systems is radically different. The torque machine vendor was used to relying on the VFD to intelligently adjust its settings to provide the output it is commanded to provide (i.e., provide the torque and hold at max until commanded to perform a controlled release). The VFD vendor, however, claimed their VFD is "dumb", in that it relies on commands from the torque controller to adjust its output.
The root cause of this incident (and of the earlier occurrences) was that neither vendor understood how the other's system was designed to operate. Instead of coming together to verify that the equipment was integrated properly once installed, each assumed the other's equipment worked the way they were accustomed to with their respective suppliers.
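One way to picture the check that was never performed is a simple responsibility comparison: for each control function, write down which device each vendor believes performs it, and flag every disagreement before commissioning. The sketch below is only an illustration under stated assumptions. The function names, the device labels, and the "issue_torque_setpoint" entry are hypothetical and do not come from either vendor's documentation; only the disagreement over who holds the column at the torque limit reflects what the vendors told us.

```python
# Illustrative sketch only: names and entries are hypothetical,
# not taken from any vendor's actual documentation.

def find_ownership_conflicts(assumptions):
    """Return {function: {vendor: assumed_owner}} for every control
    function whose owner the vendors disagree on or leave unspecified."""
    functions = set()
    for beliefs in assumptions.values():
        functions.update(beliefs)

    conflicts = {}
    for function in sorted(functions):
        views = {
            vendor: beliefs.get(function, "unspecified")
            for vendor, beliefs in assumptions.items()
        }
        owners = set(views.values())
        if len(owners) > 1 or "unspecified" in owners:
            conflicts[function] = views
    return conflicts


# Each vendor's belief about which device performs each function.
assumptions = {
    "torque_machine_vendor": {
        "issue_torque_setpoint": "torque_controller",  # assumed agreed
        "hold_at_torque_limit": "vfd",  # expects an "intelligent" drive
    },
    "vfd_vendor": {
        "issue_torque_setpoint": "torque_controller",  # assumed agreed
        "hold_at_torque_limit": "torque_controller",  # drive is "dumb"
    },
}

conflicts = find_ownership_conflicts(assumptions)
for function, views in conflicts.items():
    print(f"Unresolved owner for {function}: {views}")
```

Any function that shows up in the result has no agreed owner, which is exactly the kind of gap an integration review should close before the equipment is accepted.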
An industrial facility is not a collection of independent systems, but rather an integrated system of systems. How each of these systems interfaces with the others is just as important as (if not more important than) how each individual system is designed. You should not accept, and cannot afford, preventable failures during critical operations. You can afford, and should encourage, the assurance that comes from adequate oversight during integration, whether via internal resources or an independent third party.