Maintenance schedule by using reliability analysis: A case study at Jajarm bauxite mine of Iran

Category Mine
Group GSI.IR
Location 20th WORLD MINING CONGRESS 2005
Author Javad Barabady * -Uday Kumar
Holding Date 14 February 2006
Maintenance represents a significant proportion of the overall operating costs in the mining industry. Optimal maintenance scheduling can reduce the cost of maintenance and extend equipment lifetime. Because the cost of maintenance is very high, the mining industry needs to reduce maintenance costs to a reasonable level while keeping system reliability and availability high. Despite the large cost of maintenance, mine management has given only passing attention to the optimization of the maintenance process. The performance of a mine production system depends on the reliability of the equipment and the effectiveness of its maintenance strategy. An unplanned failure can result in significantly higher repair costs than planned maintenance or repair. Even more important is the loss of production associated with failures of larger equipment. This paper is divided into two parts. The first part introduces a methodology for optimal maintenance scheduling using reliability analysis and maintenance data analysis, in the form of time between failures and time to repair distributions. The second part presents a case study from the Jajarm bauxite mine of Iran to illustrate the effectiveness of the maintenance scheduling model.
Key words: Maintenance, reliability analysis, maintainability, mine.
The probability of equipment failure is influenced by engineering design and operating conditions. The performance level resulting from this operating environment is expressed as inherent reliability. Without maintenance intervention, equipment becomes less reliable and will ultimately fail to perform its intended function. Therefore, maintenance must be managed as the supplier of reliable operational capacity of the system. The productivity of a mine unit depends on the control of the equipment’s operational reliability. Reliability is also an economic factor: it influences the economic lifetime of machines, the number of stand-by units, and the number of operation and maintenance personnel needed for them; in other words, it influences capital and operating costs. There is a need for a method to measure the effectiveness and the weaknesses of the maintenance operation in order to focus the development of maintenance activities towards the enhancement of the business. Maintenance can influence profit:
·         Through the maintenance cost
·         Through the down time losses of production caused by machinery
·         Through the need for capital (stand-by and tied-up capital)
Traditionally, mining organizations have focused on the key measures of plant availability and utilization to measure equipment performance. It has been demonstrated that these measures alone are insufficient for making informed decisions about equipment strategies. In practice, one factor that is often overlooked has a significant impact on equipment performance: equipment reliability. A focus on reliability is critical to improving short-interval scheduling and equipment performance. Figure 1 indicates the different timeframes over which operations and maintenance are planned and scheduled. In the longer term, life-of-mine plans help to determine the quantity and type of equipment required to achieve the plan (and vice versa), thereby maximizing equipment availability and utilization. In the medium term, operations plans interface with the maintenance plan in order to maximize equipment availability and utilization by:
·         Adjusting the planned maintenance start times due to changes in production schedules or shipping schedules.
·         Taking advantage of the maintenance windows as they become available.
·         Ensuring that preventive maintenance on critical equipment is carried out.
·         Ensuring that equipment is available for maintenance when planned.
The main objectives of this paper are:
·         To identify the reliability, availability, and maintainability characteristics of the system.
·         To develop a maintenance strategy based on reliability analysis.
Figure 1: Timeframes of Maintenance and Operations Planning [Sandy, 1997]

f(T);     Probability density function (p.d.f.)
η;        Scale parameter
β;        Shape parameter
T;        Operation time
t;        Repair time
R(T);     Reliability function
A(s);     System availability
λ(T);     Failure rate function
M(t);     Maintainability function
μ(t);     Repair rate function
MTBF;     Mean time between failures
MTTR;     Mean time to repair
ECC;      Expected cost of corrective maintenance
ECP;      Expected cost of preventive maintenance
ECI;      Expected cost of inspection
ECFi;     Expected cost of failure in preventive maintenance strategy
Ci;       Cost of an inspection
Cc;       Cost of consequence of a failure
Cr;       Cost of repair after a failure
Cp.m;     Cost of restoration (of item’s resistance to failure)
Pf;       Probability of failure during one interval
Pp;       Probability that inspection will catch a potential failure
Pe;       Probability that inspection will falsely indicate a potential failure
;  Availability importance measure of component i based on MTBF
;  Availability importance measure of component i based on MTTR
Reliability is a technical characteristic of the system; it depends on the ability of the technical solution to keep the system in an operative state despite the occurrence of some fault.
The reliability of a product is the measure of its ability to perform its function, when required, for a specified time in a particular environment (Leitch 1995). Reliability for given mining equipment can be expressed in terms of mean time between failures. To determine the probability density function in this study, the Weibull distribution was chosen because of its flexibility in representing components with constant, increasing and decreasing failure rates. The Weibull distribution is one of the most commonly used distributions in reliability engineering because of the many shapes that it attains for various values of the shape parameter β. It can therefore model a great variety of data and life characteristics. The 2-parameter Weibull probability density function is given by:
$$f(T) = \frac{\beta}{\eta}\left(\frac{T}{\eta}\right)^{\beta-1}\exp\left[-\left(\frac{T}{\eta}\right)^{\beta}\right] \qquad (1)$$
The Weibull reliability function is given by:
$$R(T) = \exp\left[-\left(\frac{T}{\eta}\right)^{\beta}\right] \qquad (2)$$
The Weibull failure rate function is given by:
$$\lambda(T) = \frac{f(T)}{R(T)} = \frac{\beta}{\eta}\left(\frac{T}{\eta}\right)^{\beta-1} \qquad (3)$$
·         For 0 < β < 1, the failure rate decreases with time,
·         For β = 1, it becomes the exponential distribution, as a special case, and the failure rate is constant,
·         For β > 1, the Weibull assumes wear-out type shapes, i.e. the failure rate increases with time.
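The three regimes of the shape parameter can be checked numerically. The following is a minimal sketch of the standard 2-parameter Weibull functions; the function names and the numerical values of η and β are illustrative assumptions, not data from the case study.

```python
import math


def weibull_pdf(T, beta, eta):
    """Probability density f(T) of the 2-parameter Weibull distribution (T > 0)."""
    return (beta / eta) * (T / eta) ** (beta - 1) * math.exp(-((T / eta) ** beta))


def weibull_reliability(T, beta, eta):
    """Reliability R(T): probability of surviving beyond operation time T."""
    return math.exp(-((T / eta) ** beta))


def weibull_failure_rate(T, beta, eta):
    """Failure rate lambda(T) = f(T) / R(T)."""
    return (beta / eta) * (T / eta) ** (beta - 1)


# Hypothetical scale parameter (hours), chosen only for illustration.
eta = 100.0
for beta in (0.5, 1.0, 2.0):
    early = weibull_failure_rate(10.0, beta, eta)
    late = weibull_failure_rate(50.0, beta, eta)
    if math.isclose(early, late):
        trend = "constant"
    elif late < early:
        trend = "decreasing"
    else:
        trend = "increasing"
    print(f"beta = {beta}: failure rate is {trend}")
```

Comparing the failure rate at two operation times is enough here because the Weibull failure rate is monotonic in T for any fixed β.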
The failure rate of a system is typically high at the beginning of the operating period, then decreases to a constant value and finally increases again. The most general curve for the failure rate of components over time is the bathtub curve (Figure 2). The downward curve on the left depicts the relatively high failure rate typical of the early part of any equipment life cycle. Early life cycle problems are often due to failures in design, incorrect installation, operation by poorly trained operators, etc. This period is often referred to as the ‘burn-in’ phase. The next stage of the cycle is called the ‘useful life’ phase of the system and is often characterized by a constant failure rate. The third stage of the system life cycle is the ‘wear-out’ phase, in which the failure rate typically increases and, in turn, there is an increasing need for service and maintenance. Generally, most engineered items exhibit a definite wear-out pattern, i.e. they mostly fail around some mean operating age, although a few fail sooner and a few later. In practice, therefore, more maintenance and resources are needed in phases 1 and 3 of the life cycle, while comparatively less is required in phase 2.
Figure 2: Bathtub curve of a system
When a piece of equipment has failed, it is important to get it back into operating condition as soon as possible; how readily this can be done is described by its maintainability. The maintainability of a system is defined as the probability that it can be retained in or restored to a specified condition within a given time. The purpose of maintainability engineering is to increase the efficiency and safety and reduce the cost of equipment maintenance, when maintenance is performed under given conditions and using stated procedures and resources. Equivalently, maintainability is the probability of performing a successful repair action within a given time. In other words, maintainability measures the ease and speed with which a system can be restored to operational status after a failure occurs. For example, if a particular component is said to have 90% maintainability in one hour, this means that there is a 90% probability that the component will be repaired within an hour. In maintainability, the random variable is time-to-repair, in the same manner as time-to-failure is the random variable in reliability. For a system in which the repair times follow the Weibull distribution, the maintainability M(t) is given by:
$$M(t) = 1 - \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right] \qquad (4)$$
where η and β are here the scale and shape parameters of the time-to-repair distribution. The mean time to repair (MTTR) is given by:
$$MTTR = \eta\,\Gamma\left(\frac{1}{\beta} + 1\right) \qquad (5)$$
And the Weibull repair rate is given by:
$$\mu(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1} \qquad (6)$$
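As a rough illustration, the standard Weibull forms of the maintainability, the mean time to repair and the repair rate can be evaluated as follows. This is a sketch only: the function names and the parameter values are hypothetical and are not taken from the case study.

```python
import math


def maintainability(t, beta, eta):
    """M(t): probability that a repair is completed within time t."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def mean_time_to_repair(beta, eta):
    """MTTR of a Weibull time-to-repair distribution: eta * Gamma(1/beta + 1)."""
    return eta * math.gamma(1.0 / beta + 1.0)


def repair_rate(t, beta, eta):
    """Weibull repair rate mu(t) (t > 0)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


# Hypothetical time-to-repair parameters (eta in hours).
beta, eta = 1.2, 1.5
print(f"M(1 h) = {maintainability(1.0, beta, eta):.3f}")
print(f"MTTR   = {mean_time_to_repair(beta, eta):.3f} h")
```

With β = 1 the expressions reduce to the exponential case, in which the MTTR equals η and the repair rate is the constant 1/η.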
To calculate the Mean Time to Repair (MTTR) of an item, the time required to perform each anticipated repair task is weighted by the relative frequency with which that task is performed (e.g. number of times per year). MTTR data supplied by manufacturers are purely repair times, which assume that the fault is correctly identified and that the required spares and personnel are available. The MTTR experienced by the user also includes logistic delays and depends on factors such as the skill of the maintenance engineers.
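The frequency weighting described above can be sketched as a weighted average. The normalization by the total frequency is an interpretation of "relative frequency", and the task list below is hypothetical.

```python
def mttr_from_tasks(tasks):
    """Frequency-weighted mean repair time.

    tasks: iterable of (repair_time_hours, occurrences_per_year) pairs.
    """
    tasks = list(tasks)
    total_frequency = sum(freq for _, freq in tasks)
    if total_frequency <= 0:
        raise ValueError("at least one task must have a positive frequency")
    return sum(time * freq for time, freq in tasks) / total_frequency


# Hypothetical repair tasks: (hours per repair, repairs per year).
example_tasks = [(2.0, 10), (5.0, 2), (0.5, 20)]
print(mttr_from_tasks(example_tasks))  # (20 + 10 + 10) / 32 = 1.25 hours
```

Logistic delays, when known, could be added to each repair time before averaging in order to approximate the MTTR seen by the user.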
Maintenance can be defined as the activity required to keep equipment in operating condition so that it continues to have its original productive capacity. What level of service is needed on a particular piece of equipment? What should be done during a maintenance activity? When is the best time to perform the maintenance? How often should maintenance be done on a piece of equipment? These are the questions that are continually being asked by maintenance professionals.
Maintenance represents a significant proportion of the overall operating costs in the mining industry. Despite the large cost of maintenance, mine management has given only passing attention to the optimization of the maintenance process. The focus has remained on the optimization of mine planning and operations, where all the low-hanging fruit was picked years ago. Recent initiatives in the field of equipment maintenance have been in the area of remote condition monitoring. For an advanced maintenance technology to succeed, it must have a strong philosophical basis and the supporting hardware and software infrastructure. Maintenance is the largest controllable cost in the mining industry, and its related costs account for approximately 30 to 50 percent of direct mining costs (Lewis 2001). A significant amount of energy has been spent analyzing and optimizing key processes throughout the mine. However, very little attention has been focused on the optimization of the maintenance process. Significant cost reductions and improvements in equipment reliability and performance can be achieved through the implementation of rationalized proactive maintenance approaches. With small changes in the maintenance strategy, the asset life might be extended with a large return on investment.
Maintenance activity must be guided by a maintenance strategy, which may be divided into Design-out Maintenance, Preventive Maintenance and Corrective Maintenance (see Figure 3 for an illustration of the different maintenance strategies). Design-out Maintenance aims at changing the design of the product or system in order to eliminate, or reduce, the need for maintenance during the life cycle. However, Design-out Maintenance is not an appropriate strategy for mining equipment with its large infrastructural assets.
Preventive maintenance may be seen as the maintenance carried out at predetermined intervals or according to prescribed criteria, intended to reduce the probability of failure or the degradation of the functioning of an item. This means that maintenance is performed before a failure develops. Preventive maintenance can be divided into time-based preventive maintenance and condition-based maintenance. Time-based preventive maintenance is mainly applied to non-repairable items which have a life distribution; its theory was established as a maintenance policy by Barlow (Barlow 1965). Condition-based preventive maintenance, also called predictive maintenance, is applied to items where failure happens accidentally. It is necessary to determine the optimal inspection period as a preventive maintenance policy in order to improve the reliability of facilities, utilizing the mean time between failures based on statistical reliability information. Condition monitoring has the potential to identify problems prior to failure. Early detection of equipment degradation enables repairs to be scheduled, thereby reducing costs and interruptions to production.
Corrective maintenance is the maintenance carried out after fault recognition, intended to bring back an item into a state in which it can perform a required function. This means that maintenance is performed after the fault of an item has been detected, in order to restore the item.

Figure 3: Different maintenance strategies

The point in the failure process at which it is possible to detect that a failure is occurring or about to occur is known as the potential failure. The general process by which failures occur is illustrated in Figure 4 (Moubray 1997). It is called the P–F curve because it shows how a failure starts, deteriorates to the point at which it can be detected (the potential failure point ‘P’) and then, if it is not detected and corrected, continues to deteriorate, usually at an accelerating rate, until it reaches the point of functional failure (‘F’). If a potential failure is detected between point P and point F, there are two possibilities: to prevent the functional failure or to avoid the consequences of the failure. On-condition maintenance is defined as checking items for potential failures so that action can be taken to prevent the functional failure or avoid its consequences. On-condition tasks are so called because the items which are inspected are left in service on the condition that they continue to meet specified performance standards. The interval between on-condition tasks must be less than the P–F interval. On-condition tasks are technically feasible if:
·         It is possible to define a clear potential failure condition.
·         The P–F interval is fairly consistent.
·         It is practical to monitor the item at intervals shorter than the P–F interval.
·         The net P–F interval is long enough to prevent or avoid the consequences of the functional failure.
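The two timing criteria in the list above can be expressed as a simple check. This sketch assumes that the net P–F interval is the P–F interval minus the inspection interval, i.e. the minimum time left to act after a potential failure has been found; the function names, the hours used as units and the example values are hypothetical. The first two criteria (a clear potential failure condition and a consistent P–F interval) are engineering judgments and are not modelled.

```python
def net_pf_interval(pf_interval, inspection_interval):
    """Minimum time remaining between detection and functional failure."""
    return pf_interval - inspection_interval


def timing_criteria_met(pf_interval, inspection_interval, time_needed_to_act):
    """Check the inspection-interval and net P-F interval criteria."""
    if inspection_interval >= pf_interval:
        # Inspections are too infrequent to catch the potential failure reliably.
        return False
    return net_pf_interval(pf_interval, inspection_interval) >= time_needed_to_act


# Hypothetical values in operating hours.
print(timing_criteria_met(pf_interval=200, inspection_interval=50, time_needed_to_act=100))   # True
print(timing_criteria_met(pf_interval=200, inspection_interval=150, time_needed_to_act=100))  # False
```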
Figure 4:  P-F curve (Moubray, 1997)
Many mining organizations are beginning to focus less on the traditional measures of equipment performance, availability and utilization, and are starting structured programs to address reliability issues. Maintenance Planning is the process of acquiring a system commencing with the identification of a need and involving the research, modification and evaluation activities. Poor reliability has a far greater impact on operating efficiency, and therefore unit operating costs, than it does on the traditional measures of availability and utilization.
·         The cost of poor reliability is, for the most part, hidden to most mining operations today.
·         When measured, the true cost of poor reliability in most mining operations is very significant.
·         Poor reliability also has an adverse impact on our ability to provide accurate short-term forecasts for equipment operating hours. This can lead to:
·         Equipment services being performed unnecessarily early, with resulting increases in maintenance costs and downtime, or
·         Equipment services being performed late, leading to the risk of in-service failures and reduced equipment life, or
·         Equipment services being performed at short notice, in an unplanned manner, increasing the downtime associated with these services.
A detailed decision diagram is proposed in figure 5, which shows the maintenance selection method for a component or subsystem based on its reliability characteristics. The first step in selecting a better maintenance plan for each component or subsystem is to find its failure rate from the available data (maintenance reports, failure observations, daily reports, etc.), because the choice of maintenance strategy depends on the position of the component on the bathtub curve. Fixed Time Maintenance (F.T.M) is used when the failure rate is constant, which for a Weibull distribution of time between failures means β = 1. Preventive maintenance is used when the failure rate is increasing and the Expected Cost of Preventive Maintenance (ECP) is less than the Expected Cost of Corrective Maintenance (ECC). Component failure frequency, presented in a Pareto chart, shows which components contribute most to the failures of the mining machine.
Figure 5: Conceptual model for maintenance decision based on reliability characteristics
In the proposed methodology, availability is influenced by the equipment reliability and the maintenance process. The different times considered in the calculation of availability are given in Figure 6. The fault must be detected; this is not always as simple as it looks. The cause of the fault must be diagnosed before action is taken. The replacement must be handled or the repair must be carried out properly. Finally, the function must be checked after the measure has been taken.

Figure 6: The different states associated with failure occurrence. The timescale is not real (Myrefelt, 2004)

MTD: Mean detection time
MTDE: Mean decision time, i.e. the time for diagnosing the fault, deciding which measure to take and initiating the repair.
MFTT: Mean function test time
Using these definitions, the operational availability and the functional availability of the system can be calculated by equations 7 and 8.
The operational availability is the probability that the system will be in the intended operational state. In contrast to system reliability, it takes maintenance into account. Functional availability is the probability that the system will be in the intended operational state with the correct value of the operational condition variable. The operational availability is the basis of the functional availability, but it has to be complemented by the requirement to meet the function criteria.
Crushing plants are used to reduce the size of ore in a mineral processing plant. A schematic diagram of the crushing plant at the Jajarm bauxite mine is shown in figure 7. The ore is hauled from the mine to the crushing plant by truck. In the first step, the ore passes over a two-level primary screen and is divided into three fractions: i) smaller than 20 mm, ii) between 20 mm and 10 cm, and iii) larger than 10 cm. Ore larger than 10 cm goes to the primary crushing subsystem, which consists of two jaw crushers. The output of this stage, together with fraction ii (between 20 mm and 10 cm), is divided by the secondary screen into two fractions: a) smaller than 20 mm and b) larger than 20 mm. Material larger than 20 mm goes to the secondary crusher (a cone crusher), which works in closed circuit with the secondary screen. Ore smaller than 20 mm from both the primary and secondary screens goes to the end of the process.
Figure 7: Schematic diagram of crushing plant
Assumptions for the calculation of reliability and availability in this case study are:
1.        The system is repairable
2.        The system is subjected to repair and maintenance.
3.        The Weibull distribution is used for time between failures and time to repair.
4.        The studied function is assumed to be independent.
5.        The mean time to repair also includes MTD, MTDE, and MFTT.
6.        The repaired components are as good as new.
7.        The system can be in working state but not functioning.
We have divided the crushing plant into subsystems such as the primary screen, primary crusher, conveyer, secondary crusher and secondary screen. An earlier preliminary study of the reliability characteristics of the crushing plant at the Jajarm bauxite mine showed that the conveyer subsystem and the secondary screen subsystem are the two most critical subsystems. In this part, we have therefore selected these two subsystems for maintenance improvement using reliability analysis. The discussion and the results are based on the analysis of the time between failures and the time to repair of both subsystems over a period of one year. Based on the analysis, maintenance policies are also suggested. The results of applying the model to the data collected in this case study are discussed under the following headings for both subsystems:
  1. the reliability
  2. the maintainability
  3. the availability
  4. the maintenance strategy
The data collected in this study were mostly based on maintenance reports and daily reports. The time between failures and time to repair data for a period of one year were sorted and analysed.
Table 1 shows the result of the reliability assessment, which estimated the parameter of failure time distribution by using ReliaSoft’s Weibull++ 6 software. Table 1 includes failure occurrence trends for the conveyer and screen subsystem of the crushing plant.



Table 1: Parameters of Weibull distribution for TBF
Figure 9 shows the reliability of the conveyer subsystem and the secondary screen subsystem as a function of time. The reliability of the secondary screen subsystem is higher than that of the conveyer subsystem at the different operation times.
Table 2 shows the result of the maintainability analysis based on the Weibull distribution for time to repair. In the case study, as mentioned before, the repair time is the time from the point of failure to the restart of the system, which means that the MTTR includes MTD, MTDE, and MFTT. From Table 2, it is found that the MTTR of the conveyer subsystem is greater than the MTTR of the screen subsystem.


Table 2: Parameters of Weibull distribution for TTR
Figure 9: The reliability as a function of the operational time for both subsystems
The maintainability of both subsystems is shown in figure 10. With a repair time of 1 hour, the maintainability of the secondary screen subsystem and of the conveyer subsystem is 58% and 45% respectively, which means that there is a 45% probability that the conveyer subsystem and a 58% probability that the screen subsystem will be repaired within 1 hour. Based on figure 10, the maintainability of the secondary screen subsystem is higher than that of the conveyer subsystem for all repair times. The repair time depends not only on the technical systems but also on the maintenance crew.
Figure 10: Maintainability versus time of secondary screen and conveyer subsystems
The functional availability of the conveyer subsystem and the secondary screen subsystem can be calculated by using equation 8. The availabilities of the conveyer and secondary screen subsystems are 0.915 and 0.958 respectively. The availability of each subsystem could be increased by reducing MTTR or by increasing MTBF. We define two availability importance measures that can serve as guidelines for decision making in developing an availability improvement strategy. The first is the availability importance measure based on MTBF for subsystem i, and the other is the availability importance measure based on MTTR for component i.



Equations 9 and 10 show which of MTBF and MTTR has the greater effect on the availability of the whole system, and they can also be applied to an individual subsystem.
For the secondary screen and conveyer subsystems of the crushing plant MTBF > MTTR; therefore the importance measure based on MTTR is greater than the one based on MTBF, indicating that decreasing MTTR provides the greater marginal benefit. However, the investment required to decrease the MTTR may be much greater than that required to increase the MTBF. A cost trade-off is essential for making the final decision.
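The comparison of the two importance measures can be illustrated numerically. The sketch below assumes the steady-state availability A = MTBF / (MTBF + MTTR) and takes the importance measures as the magnitudes of its partial derivatives with respect to MTBF and MTTR; this is an assumed form, since equations 8 to 10 are not reproduced here, and the numerical values are hypothetical.

```python
def availability(mtbf, mttr):
    """Assumed steady-state availability of a single subsystem."""
    return mtbf / (mtbf + mttr)


def importance_mtbf(mtbf, mttr):
    """dA/dMTBF: gain in availability per unit increase in MTBF."""
    return mttr / (mtbf + mttr) ** 2


def importance_mttr(mtbf, mttr):
    """-dA/dMTTR: gain in availability per unit decrease in MTTR."""
    return mtbf / (mtbf + mttr) ** 2


# Hypothetical subsystem with MTBF > MTTR (hours).
mtbf, mttr = 10.0, 1.0
print(f"A                  = {availability(mtbf, mttr):.3f}")
print(f"importance of MTBF = {importance_mtbf(mtbf, mttr):.4f}")
print(f"importance of MTTR = {importance_mttr(mtbf, mttr):.4f}")
```

Under this assumption the ratio of the MTTR-based measure to the MTBF-based measure equals MTBF/MTTR, so the MTTR-based measure is the larger one whenever MTBF > MTTR.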
Based on figure 4, two requirements must be met for preventive maintenance of a component to be appropriate:
·         First, preventive maintenance makes sense when the component gets worse with time. In other words, as the component ages, it becomes more susceptible to failure or is subject to wear out. In reliability terms, this means that the component has an increasing failure rate.
·         The second requirement is that the cost of preventive maintenance must be less than the cost of corrective maintenance.
The failure data for both the secondary screen and conveyer subsystems of the crushing plant give a shape parameter greater than one, which means that the failure rate increases with time. To check the second condition, a cost trade-off analysis is essential.
If a component of a subsystem is allowed to run to failure, the expected cost of corrective maintenance (ECC) over a given time interval is the sum of the cost of the consequence of a failure (Cc) and the cost of repair (Cr), multiplied by the probability of the failure occurring during the interval (Pf):
$$ECC = (C_c + C_r)\,P_f$$
When regular inspections are conducted, this expected cost changes: in addition to the above formula, the probability of the potential failure being detected (Pp) is incorporated. The expected cost of failure when a preventive maintenance strategy is applied (ECFi) can be calculated by:
$$ECF_i = (C_c + C_r)\,P_f\,(1 - P_p)$$
In other words, ECFi = ECC × (probability that inspection will not detect the potential failure). Other costs considered in this decision include the cost of the inspections and of timely and untimely restoration of the equipment. The Expected Cost of Inspection (ECI) must therefore be added to the expected cost of failure. It is determined by the following quantities:
  • Ci is the Cost of an Inspection,
  • Cp.m is the Cost of Restoration (of item’s resistance to failure),
  • Pe is the probability that inspection will falsely indicate a potential failure.
The expected cost of preventive maintenance (ECP) can then be calculated by:
$$ECP = ECF_i + ECI$$
If the ECP is more than the ECC, corrective maintenance is chosen for the component; otherwise preventive maintenance is considered. The above method is easy to apply to each type of failure and component in both subsystems of the crushing plant.
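The cost comparison can be sketched as follows. The expressions for ECC and ECFi follow the description above. The expression used for ECI (inspection cost plus restoration cost weighted by the probabilities of a correct and of a false indication) is an assumption made only for this illustration, and all cost and probability values are hypothetical.

```python
def expected_cost_corrective(c_c, c_r, p_f):
    """ECC: (consequence cost + repair cost) x probability of failure."""
    return (c_c + c_r) * p_f


def expected_cost_failure_with_inspection(c_c, c_r, p_f, p_p):
    """ECFi: ECC x probability that inspection misses the potential failure."""
    return expected_cost_corrective(c_c, c_r, p_f) * (1.0 - p_p)


def expected_cost_inspection(c_i, c_pm, p_f, p_p, p_e):
    """ECI under an ASSUMED form: inspection cost plus restoration cost for
    correctly detected potential failures and for false indications."""
    return c_i + c_pm * (p_f * p_p + (1.0 - p_f) * p_e)


def expected_cost_preventive(c_c, c_r, c_i, c_pm, p_f, p_p, p_e):
    """ECP = ECFi + ECI."""
    return (expected_cost_failure_with_inspection(c_c, c_r, p_f, p_p)
            + expected_cost_inspection(c_i, c_pm, p_f, p_p, p_e))


def choose_strategy(ecc, ecp):
    """Corrective maintenance if ECP exceeds ECC, otherwise preventive."""
    return "corrective" if ecp > ecc else "preventive"


# Hypothetical costs (currency units) and probabilities for one interval.
ecc = expected_cost_corrective(c_c=10000.0, c_r=2000.0, p_f=0.2)
ecp = expected_cost_preventive(c_c=10000.0, c_r=2000.0, c_i=100.0, c_pm=500.0,
                               p_f=0.2, p_p=0.9, p_e=0.05)
print(ecc, ecp, choose_strategy(ecc, ecp))
```

With these illustrative numbers the preventive strategy has the lower expected cost, mainly because a high detection probability removes most of the failure consequence cost.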
A critical level of reliability can be defined for both subsystems, meaning that each subsystem must operate with at least this level of reliability; the inspection interval can then be identified from this critical level. Table 3 shows the inspection intervals for the conveyer and screen subsystems. For example, from Table 3, the conveyer subsystem will operate with a reliability of at least 70% if the preventive maintenance inspection is carried out after 6.8 hours.
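One way to obtain such an interval, assuming the Weibull reliability function R(T) = exp[-(T/η)^β] is used for time between failures, is to solve R(T) = R_critical for T. The sketch below does this; the parameter values are hypothetical and are not the fitted values of the case study.

```python
import math


def inspection_interval(critical_reliability, beta, eta):
    """Operation time T at which Weibull reliability falls to the critical level.

    Solves exp(-(T/eta)**beta) = critical_reliability for T.
    """
    if not 0.0 < critical_reliability < 1.0:
        raise ValueError("critical reliability must lie strictly between 0 and 1")
    return eta * (-math.log(critical_reliability)) ** (1.0 / beta)


# Hypothetical Weibull parameters (eta in hours).
beta, eta = 1.5, 20.0
for level in (0.9, 0.8, 0.7):
    print(f"R >= {level:.0%}: inspect within {inspection_interval(level, beta, eta):.1f} h")
```

A higher critical reliability gives a shorter inspection interval, which is the trade-off a table of critical levels against inspection intervals is meant to expose.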


Critical level of reliability

Time interval of inspection (hour)