To access the document, please open the attached PDF file titled "Failure Finding Tasks" (66 KB). The file was created using Microsoft Office Document Image Writer, which may not be the best format for viewing, and you may have difficulty opening it if Adobe Acrobat Reader is not installed on your computer.
- 21-10-2024
- Shawn Thompson
In the event outlined in the Failure Finding Task report, it is important to illustrate how the malfunction linked to pump B becomes apparent. When pump B is in operation and unexpectedly ceases to function, pump C seamlessly takes over without any operational disruptions. This could potentially lead to the operator overlooking the failure of pump B due to the seamless automatic transition between the two pumps.
In a real-life situation, a battery charging room employs two redundant exhaust fans to remove explosive fumes. One fan runs constantly, and the other is intended to start if the duty fan fails. These belt-driven exhaust fans are mounted in the ceiling and have no scheduled maintenance because of the redundancy. During a routine inspection, I noticed a noisy belt and requested further investigation. The mechanic discovered that the standby fan belt was deteriorating, while the duty exhaust fan belt had already failed. The duty fan belt failure is an example of a "hidden failure": the standby fan had taken over, so nothing revealed the failure until someone went looking.
Detecting hidden failures necessitates implementing preventive maintenance measures focused on identifying, diagnosing, and addressing issues as needed (fault identification PMs).
For seamless duty and stand-by operations with automatic switchover, it is advisable to include an indicator or alarm system that notifies the operator when the stand-by equipment is activated, indicating a failure in the duty equipment. Thank you, Shelley.
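As a rough illustration of the suggestion above, here is a minimal Python sketch of a duty/standby pair whose automatic switchover also latches an alarm. The class name `DutyStandbyPair` and its methods are hypothetical, invented for this example; a real installation would implement this in the control system or a PLC rather than in Python.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DutyStandbyPair:
    """Hypothetical model of a duty/standby pair with automatic switchover."""
    duty_running: bool = True
    standby_running: bool = False
    alarm_latched: bool = False
    events: List[str] = field(default_factory=list)

    def duty_trips(self) -> None:
        """Duty unit stops; standby starts automatically and the alarm latches."""
        self.duty_running = False
        self.standby_running = True  # seamless takeover, no process upset
        # Without the next two lines the duty failure would stay hidden.
        self.alarm_latched = True
        self.events.append("standby started: duty unit has failed")

    def repair_duty(self) -> None:
        """Duty unit is repaired and returned to service; the alarm is reset."""
        self.duty_running = True
        self.standby_running = False
        self.alarm_latched = False
        self.events.append("duty unit restored")

    def process_ok(self) -> bool:
        """The process is served as long as either unit is running."""
        return self.duty_running or self.standby_running


pair = DutyStandbyPair()
pair.duty_trips()
print(pair.process_ok(), pair.alarm_latched)  # process still served, alarm raised
```

The point of the sketch is only that `process_ok()` stays true after the trip, so the alarm is the one thing telling the operator that the redundancy has been used up. The alarm is itself a component that can fail, which is the concern raised later in the thread.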
quote: The lack of preventive maintenance in this situation is concerning due to the redundancy. It appears that the fans are left running constantly without any oversight. Opening the door to a room filled with gas could potentially be dangerous as it may allow oxygen to enter or release harmful substances. This oversight could have serious consequences, and I find it to be more reckless than redundant.
When it comes to the run-to-failure approach in equipment maintenance, it's important to note that it doesn't mean neglecting regular checks until something catastrophic happens. Preventive maintenance, such as oiling and cleaning, should still be performed periodically. This strategy involves keeping minimal parts and consumables in stock and inspecting the equipment every few months to ensure it's still operational. This proactive approach allows for timely repairs or replacements in case of a breakdown. While run-to-failure may work for certain components like wiper blades, it's not advisable for critical parts like brakes or the engine of a car. In many places, there is a legal requirement for annual vehicle inspections conducted by authorized entities.
Erik, if your setup is like mine, I agree with Shelley that it's important to make any system failures noticeable. Otherwise, as Eugene mentioned, regular maintenance is necessary.
In the event that the B pump stops while in operation, is there a visible indication for the operator to determine if it is running? Time is not a factor when determining if a failure is apparent. If the operator can determine whether a failure has occurred at any point after the event, then it is considered an evident failure. Hidden failures, on the other hand, cannot be detected by the operator's senses or instruments without a subsequent event triggering a reveal, such as placing a demand on a pressure relief valve or gas detector.
Shelley recommends incorporating an indicator or alarm for automatic switch over duty/stand-by arrangements to prevent potential hidden failures. It is crucial to be aware of the possibility of the alarm or indicator failing to operate effectively.
svannels raised concerns about the possibility of theft, prompting the need to revise the failure codes in SAP's dropdown list. This action is crucial for maintaining data security and integrity.
quote: As highlighted by Vee, it's important to consider the potential for the alarm or indicator system itself failing to function properly. By incorporating an alarm along with regular PM or Calibration inspections, you can potentially save money in the long run compared to dealing with the consequences of a malfunction in the main system. It's crucial to stay proactive in maintaining all components of your system to prevent failures and ensure optimal performance.
The clock is the one piece of equipment that remains untouched because it sets the pace for everyone, constantly being monitored by all.
- 23-10-2024
- Rebecca Murphy
I am astounded by how similar plants are worldwide and across different industries.
- 23-10-2024
- Jessica Freeman
According to Rod, the exhaust fans in the ceiling are belt driven, posing potential hazards such as loose belts catching fire. While there is no preventive maintenance due to redundancy, it is essential to consider the safety implications. Belts typically become slack over time and require periodic tightening. The decision to forgo preventive maintenance may seem unusual, especially in light of safety concerns. Redundancy can streamline maintenance efforts, but it is crucial to reassess the decision if safety implications are at stake, particularly in explosive environments where belt drives may not be the most suitable solution.
According to Darth, in order to prove its economical value, it is important to understand that the issue goes beyond just economics. Relying solely on an alarm to alert us to problems can be risky, as the alarm itself may fail unnoticed unless we consistently test and maintain it. This hidden failure could have disastrous consequences if left unchecked.
Vee, after considering your perspective, I appreciate the fact that adding another element introduces another potential point of failure to address. Ultimately, the key factors to consider are: which configuration is superior in terms of efficiency, cost-effectiveness, reliability, and ease of maintenance planning? One option includes equipment A, redundant equipment B, and an alarm system to alert when A fails and B takes over; while the other option only includes equipment A and redundant equipment B without an alarm system. Is it justified to add an extra component to the system for early detection of equipment A failure? And does adding an alarm system ultimately decrease the overall reliability of the system?
Darth once said, "Is early detection of equipment failures enough reason to add another component to the system? Will adding more alarms decrease overall reliability?" These are important questions, and the answer is actually NO. Before we consider adding new equipment, let's explore other options.
1. A toothed belt drive is more reliable than a V belt drive as it prevents slipping and potential fires. It also reduces bearing loads.
2. Performing a belt tension check as part of a preventive maintenance routine can help identify potential safety hazards. Additionally, greasing bearings, cleaning motor cooling fins, tightening foundation bolts, and inspecting the area can help determine when belts need replacement. These are cost-effective measures with potentially high benefits (such as avoiding fires).
My concern lies with offshore subsea installations, where equipment is often neglected under the assumption that if it's out of sight, it's out of mind. This mindset can be dangerous, whether it's neglecting exhaust fans on ceilings or pipelines in water-filled trenches. Each new alarm added to the system requires a failure-finding process in the CMMS and has the potential to generate false alarms. Operators may become complacent if they rely too heavily on alarms.
In situations where there are numerous false alarms, it is crucial to focus on basic maintenance tasks before jumping to complex solutions. Following the logic of Reliability Centered Maintenance (RCM), alarms should only be added once all other options have been exhausted. I am not against alarms; however, they should be implemented thoughtfully as each solution could create new issues.
Thank you, Vee. So, to clarify, you are suggesting that the first option to explore should be redundant equipment with basic preventive maintenance and no alarm system?
Hey Darth, I want to emphasize that there is no one-size-fits-all solution in this situation. In Rod's scenario, it's difficult to determine if both fans are working properly. There are no clear indications, such as an ammeter or alarm light, for the operator to know the status of Fan A, Fan B, or both. This is different from easily accessible equipment like pumps. In Rod's case, any malfunction remains hidden until the fumes themselves reveal the issue. It's vital to not overlook the potential secondary effects of failures, especially when safety is at stake, despite the belief that a backup system will automatically take over. For instance, a fan belt failure may not disrupt production immediately if the backup kicks in, but it could lead to a fire risk if the V-belt slips unnoticed. This safety hazard is unacceptable. Furthermore, if we can't determine which fan is operational or if any fan is running at all, there's a risk of explosive vapor accumulation. Therefore, installing indicator lights is a cost-effective and practical solution to track fan operations, but it's not foolproof. Regular maintenance, like tightening V-belts, is still necessary to prevent potential accidents. These concerns wouldn't exist if the fans were only handling non-explosive mixtures. I hope this clarifies the distinction for you.
Implementing redundant systems is essential for effective preventive and predictive maintenance. Neglecting equipment until it fails is negligent. Redundant systems eliminate excuses for not properly maintaining operating equipment. By utilizing these redundancies, significant resources can be saved quickly. Non-life safety systems should require manual recovery from failures to avoid potential dangers. It is imperative to promptly address all failures and take necessary actions for recovery and repair.
Having an excessive number of duplicate systems can lead to another issue: a false sense of security. When essential parts are not readily available, people may rely on back-up systems that were reported as inoperative months ago without realizing the potential consequences.
Steve makes a valid point about the importance of focusing on critical equipment in plant design. It's common to see plants with excessive redundancy and bypasses that actually hinder performance. In one plant, despite all the additional features, it couldn't operate at more than 80% of its capacity because no equipment was considered critical. This approach led to a lack of accountability and measurement of key metrics. However, after implementing changes to hold people accountable and track losses, the plant's performance improved significantly. This highlights the importance of avoiding over-engineering and focusing on essential equipment in plant design. The Japanese success in manufacturing can also be attributed to their lean manufacturing principles, such as eliminating waste and adopting a just-in-time approach. This strategy helped identify and address issues without relying on buffer stocks. Ultimately, a shift in mindset towards prioritizing critical equipment and efficiency is crucial for driving success in plant operations.
It appears there are too many alarms being designed and installed, which calls for an alarm rationalization exercise to be conducted. This will help streamline the alarm system and ensure optimal efficiency.
- 23-10-2024
- Heather Coleman
Hey Josh, I want to clarify that the items I mentioned in my post weren't alarms, but rather physical equipment put in place as backups. In many cases, two pumps were installed instead of one when only one was needed, and three pumps were installed when two were needed. - Steve
Although the scenarios differ, they share one common theme: overdesign. Didn't the case you mentioned involve an overly cautious approach on top of the excessive redundancy and bypasses?
There was no sense of urgency in response to the situation, highlighting an accountability issue and a lack of proactive attitude. Even with multiple alarms sounding, there was a failure for anyone to take action.
Cases of over instrumentation have been reported in our facility, specifically with our kiln that operates within a specified temperature range. We utilize eight temperature probes that provide data to the control center regarding the kiln temperature. These probes are calibrated to 0.1% accuracy every three months. Interestingly, the operator manually checks the product at the outfeed conveyor to determine its moisture level. If the product is too dry, the operator decreases the kiln temperature, and if it is too wet, the temperature is increased. The parameters for the infeed are not fixed, leading to variability in the process. Despite this, the operator does not prioritize the kiln temperature accuracy within 5%, even though the probes are calibrated to a much higher precision. The situation makes one question, "Beam me up Scotty Doh?"
I came across a fascinating article by "Vee" that suggests staggering duty-standby equipment in a 7-weeks on and 1-week off schedule can lead to significant improvements. By implementing this strategy, there could be a notable increase of 10-12% in Mean Time Between Failures (MTBF) and a corresponding decrease in maintenance costs. Many redundant systems in industrial plants usually allocate time evenly between duty and standby. However, after reading the article, I am inclined to agree with "Vee" that this unconventional approach may offer valuable benefits.
Performing a visual inspection during each switch over provides six chances annually to maintain and care for the units.
Hello! I came across an intriguing article by "Vee" discussing the benefits of staggering duty-standby equipment cycles. By alternating between 7 weeks and 1 week of operation, it could potentially increase Mean Time Between Failures (MTBF) by 10-12% and reduce maintenance costs.
I recommend implementing a strict duty-standby schedule whenever possible, such as 8 weeks on and 1 day off. The specific interval can vary depending on factors like the failure rate of standby equipment and system availability requirements. In cases where a pure duty-standby setup is challenging, consider uneven operation like 7 weeks on and 1 week off.
By adjusting the operating schedule, the number of starts can be reduced significantly. This can lead to a 50% decrease in seal failure rates, particularly in industries like oil and gas where seal failures contribute to a significant portion of pump failures. According to OREDA data, seal failures make up about 22% of pump failures in the industry.
By reducing the number of starts, it is possible to see a decrease in overall pump failure rates. In theory, halving the roughly 22% of failures attributable to seals would remove about 11% of all pump failures, so the pump failure rate could drop to around 89% of its baseline with a properly implemented duty-standby regime. This theory has shown promising results in practice as well.
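The arithmetic behind the 89% figure, and how it lines up with the 10-12% MTBF improvement mentioned earlier, can be checked with a few lines of Python. This is only a sketch: the function names are made up for the example, and it assumes a constant failure rate so that MTBF is simply the inverse of the failure rate.

```python
def relative_failure_rate(seal_share: float, seal_reduction: float) -> float:
    """Pump failure rate relative to baseline after cutting seal failures.

    seal_share:     fraction of pump failures caused by seals (0.22 per the OREDA figure quoted)
    seal_reduction: fractional reduction in seal failures (0.50 as claimed above)
    """
    return 1.0 - seal_share * seal_reduction


def mtbf_gain(relative_rate: float) -> float:
    """Fractional MTBF increase, assuming MTBF is the inverse of the failure rate."""
    return 1.0 / relative_rate - 1.0


rate = relative_failure_rate(0.22, 0.50)
print(f"failure rate vs baseline: {rate:.0%}")       # 89%
print(f"MTBF increase: {mtbf_gain(rate):.1%}")       # 12.4%
```

Under that assumption, an 11% cut in failure rate works out to roughly a 12% gain in MTBF, which is at the upper end of the 10-12% range quoted from the article.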
When incorporating the FMEA approach and RCM philosophy, it is essential to ensure that the analysis does not compromise the reliability of operations. Despite the plant having ample redundancy, the requirement for continuous operation signifies the importance of prioritizing reliability efforts. Leveraging the FMEA process to generate proactive maintenance tasks for equipment is crucial for enhancing reliability. It is imperative to implement predictive maintenance techniques extensively and deeply. Conducting RCM analyses can reveal redundancies and identify potential weak points in reliability, especially for operations that are critical. Balancing wear across equipment families is vital, as is avoiding a run-to-failure approach. Improving reliability at the component level through thorough FMEAs and actionable tasks strengthens overall plant reliability. Consider reducing preventive maintenance frequency after implementing vibration, thermography, and ultrasonics maintenance programs. Best of luck in your efforts.
skm, you recommend considering if the operation is evenly distributing wear across all equipment in a fleet. Are you proposing a 50:50 running ratio?
- 23-10-2024
- Vanessa Carter