Drilling into the Data Helps a Steel Company Reduce Downtime and Improve OEE
A steel producer was incurring significant unscheduled downtime due to equipment and processing issues. This was particularly problematic because the company was struggling to meet market demands.
The sold-out mill had above-average data capture. Its systems recorded lost-time events to the minute, along with rate and yield losses. The mill calculated Asset Utilization (AU) and Overall Equipment Effectiveness (OEE) daily and published weekly data trends. Its organizational structure included a reliability engineering group composed of mechanical and electrical engineers who handled both maintenance and reliability engineering tasks.
While the mill had plenty of data, it wasn’t being analyzed in a way that supported continuous improvement. Life Cycle Engineering (LCE) worked with mill management and the reliability engineering group to improve OEE by focusing on loss elimination. The mill’s objective was to cut its unscheduled equipment downtime, which was running at 9%, in half within three years. Because the mill was sold out, every minute of reduced downtime would correspond to an increase in OEE and production volume.
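To make the link between downtime and OEE concrete, here is a minimal sketch using the standard definition of OEE as the product of availability (time), performance (rate), and quality (yield). The 9% downtime and 50% reduction target come from the case study. The performance and quality values are illustrative assumptions only, and the sketch assumes unscheduled downtime is the only time loss.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Standard OEE: product of the time, rate, and yield factors."""
    return availability * performance * quality


# From the case study: 9% unscheduled downtime, target of a 50% reduction.
baseline_downtime = 0.09
target_downtime = baseline_downtime * (1 - 0.50)

# Illustrative assumptions only; these figures are not from the case study.
performance = 0.95
quality = 0.97

# Assumes unscheduled downtime is the only availability loss.
baseline_oee = oee(1 - baseline_downtime, performance, quality)
target_oee = oee(1 - target_downtime, performance, quality)

# Improvement expressed in OEE percentage points.
oee_gain_points = (target_oee - baseline_oee) * 100

print(f"Target downtime: {target_downtime:.1%}")
print(f"OEE gain: {oee_gain_points:.2f} points")
```

Under these assumptions, halving downtime from 9% to 4.5% lifts OEE by a little over four percentage points. With different rate and yield figures the gain would change, but it always scales directly with the recovered availability.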
Analyzing Historical Data Reveals Bad Actor Equipment and Yields Improvement Strategies
The mill’s weekly reports provided an overall result for AU and OEE, the actual production volume, and categorized losses by many different sub-groups associated with time, rate, and yield. The data for the calculations were derived from the process control system data historian. In evaluating the reports, LCE’s team discovered that there was no drill-down capability to determine which mill process section or specific pieces of equipment were contributing the most to unscheduled downtime.
LCE engineers partnered closely with the site’s reliability engineering team to analyze the data and drive targeted strategies to decrease downtime. The effort began with accessing three years of loss data from the data historian. The data included each event, its duration, and the categorization attributes Operations assigned when recording it. One of the key data attributes was equipment description. The majority of events had an equipment description, but none had the CMMS equipment number.
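The case study does not describe how events were tied back to CMMS records. As one possible approach, the sketch below normalizes free-text equipment descriptions and matches them against a register of CMMS equipment numbers. The register contents, the equipment numbers, and the function names are all hypothetical.

```python
import difflib
import re

# Hypothetical CMMS register: normalized description -> equipment number.
CMMS_REGISTER = {
    "entry shear": "EQ-1001",
    "pinch roll 2": "EQ-1002",
    "exit coiler": "EQ-1003",
}


def normalize(description: str) -> str:
    """Lowercase and collapse punctuation and whitespace to single spaces."""
    text = description.lower()
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


def match_equipment(description, register=CMMS_REGISTER, cutoff=0.8):
    """Return the CMMS number for a description, or None if no close match."""
    key = normalize(description)
    if key in register:
        return register[key]
    # Fall back to fuzzy matching for misspelled operator entries.
    close = difflib.get_close_matches(key, list(register), n=1, cutoff=cutoff)
    return register[close[0]] if close else None


print(match_equipment("Entry  Shear"))
print(match_equipment("Exit Coilr"))
print(match_equipment("Furnace"))
```

Unmatched descriptions return `None` so they can be reviewed by hand rather than silently assigned to the wrong asset.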
Next, the team used the historical data to identify the ‘bad actor’ equipment and systems. The team broke the plant down into several major process sections, based on block diagrams and the equipment hierarchy. Within each major section, the team manually calculated the amount of downtime loss and developed a Pareto analysis to identify the top 10 equipment or system causes of downtime. The event tracking data enabled the team to calculate mean time between failures (MTBF) based on downtime events, and mean time to repair (MTTR) using the average amount of downtime, which includes repair and start-up time.
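The Pareto ranking and the MTBF and MTTR calculations described above can be sketched as follows. The event records are made up for illustration. MTTR follows the text (average downtime per event), while MTBF is assumed here to be uptime in the period divided by the number of downtime events, which is one common convention and not necessarily the one the team used.

```python
from collections import defaultdict

# Hypothetical downtime events: (equipment description, downtime in minutes).
EVENTS = [
    ("entry shear", 120),
    ("exit coiler", 45),
    ("entry shear", 60),
    ("pinch roll", 30),
    ("exit coiler", 15),
    ("entry shear", 90),
]


def pareto(events, top_n=10):
    """Rank equipment by total downtime, with cumulative share of all downtime."""
    totals = defaultdict(float)
    for equipment, minutes in events:
        totals[equipment] += minutes
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0.0
    for equipment, minutes in ranked[:top_n]:
        running += minutes
        rows.append((equipment, minutes, running / grand_total))
    return rows


def mtbf_mttr(events, equipment, period_minutes):
    """Return (MTBF, MTTR) in minutes for one piece of equipment, or None."""
    durations = [m for e, m in events if e == equipment]
    if not durations:
        return None
    downtime = sum(durations)
    mttr = downtime / len(durations)  # average downtime per event
    mtbf = (period_minutes - downtime) / len(durations)  # assumed: uptime / events
    return mtbf, mttr


for equipment, minutes, cumulative in pareto(EVENTS):
    print(f"{equipment:12s} {minutes:6.0f} min  cumulative {cumulative:.0%}")

# One week of calendar time, in minutes.
print(mtbf_mttr(EVENTS, "entry shear", period_minutes=7 * 24 * 60))
```

Running the ranking per process section, rather than across the whole plant, matches the section-by-section breakdown the team used.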
With each mill section’s bad actor list in hand, the reliability engineers performed the difficult work of analyzing why each specific piece of equipment or system was failing, sometimes at alarming rates. As is typical when drilling down to determine root causes, the issues varied widely in scope and complexity. At one end of the spectrum, preventive maintenance tasks (PMs) were incomplete due to human factors. In many cases, equipment was on a default run-to-failure strategy when either a time-based or condition-monitoring approach was more appropriate. Mid-spectrum issues required low-cost equipment modifications to improve equipment life. At the far end of the spectrum were problems requiring significant modifications and capital expenditures. In one case, a $300,000 redesign eliminated several failure modes. That piece of equipment went from the top of the bad actor list to having zero downtime for the following year.
Sharing Accurate Data Builds Team Consensus to Drive Improvements
The team also educated mill leaders, equipment operators and maintenance staff about the bad actor equipment. Mill leaders created small teams for several mill sections, with reliability engineers as team leaders. Each team included a maintenance supervisor, a maintenance planner, and an operations supervisor. The sections conducted monthly status meetings to provide improvement effort updates, educate staff, and address any roadblocks. These meetings prompted two cultural changes. First, the meetings educated staff on the top problems. Second, the meetings relied on data (not opinions) to identify and address problems causing downtime, including looking at data beyond the last seven days.
As the monthly meetings became the norm, reliability engineers were spending significant amounts of time each month manually updating files and calculating losses. To improve efficiency, mill leaders approved a project for LCE to develop an automated monthly analysis report. In addition to reducing the engineers’ transactional work, the automated report is accessible across the mill. Mill staff now have access to the data and Pareto analysis and don’t depend on a reliability engineer for the results.
Two Years of Dedicated Effort Yields a 30% Reduction in Unscheduled Downtime
After two years, the mill had reduced unscheduled downtime by 30%, and was well on its way to its goal of a 50% reduction. The preventive maintenance program was optimized with a focus on 20% of PM tasks that drove the majority of labor hours expended. Condition-monitoring technology was applied more widely and replaced time-directed maintenance activity when applicable. The reliability engineering group’s dedicated focus on improving the mill’s OEE yielded significant financial results for the company.
To learn more about LCE’s Asset Productivity Consulting Services, please visit our website.