Unpredictable, unplanned failures of a cooling tower drive shaft were reducing cooling resilience in a data-centre operation.


We undertook a Reliability-Centred Maintenance (RCM) study on the asset to determine the impact of its various failure modes and to develop prioritised control actions.


The failure mode with the highest impact was drive shaft imbalance, which can itself arise from several causes, leading to bearing failure. Newly developed miniature infra-red sensors were installed on the critical bearings, with their data output monitored remotely and coupled with a trend alarm. Maintenance regimes were also amended to manage the causes of shaft imbalance.
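
As an illustration of what a trend alarm on bearing temperature might look like, here is a minimal sketch. It is not the monitoring system used in the study: the class name `TrendAlarm`, the window size, the hourly sampling and the 0.5 °C per hour threshold are all assumptions chosen for the example. The idea is to alarm on a sustained rate of rise rather than on an absolute temperature limit.

```python
from collections import deque


def slope_per_hour(samples):
    """Least-squares slope of (hour, temp_c) pairs, in degrees C per hour."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den if den else 0.0


class TrendAlarm:
    """Hypothetical trend alarm: flags a sustained temperature rise."""

    def __init__(self, window=6, max_slope_c_per_hour=0.5):
        # Both defaults are illustrative assumptions, not values from the study.
        self.samples = deque(maxlen=window)
        self.max_slope = max_slope_c_per_hour

    def add(self, hour, temp_c):
        """Record a reading; return True if the trend exceeds the limit."""
        self.samples.append((hour, temp_c))
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough history to judge a trend yet
        return slope_per_hour(self.samples) > self.max_slope


# Usage: a steady bearing never alarms; one warming by 1 C per hour does.
steady = TrendAlarm(window=4)
steady_flags = [steady.add(h, 40.0) for h in range(6)]

rising = TrendAlarm(window=4)
rising_flags = [rising.add(h, 40.0 + h) for h in range(4)]
```

Fitting a slope over a rolling window smooths out single noisy readings, so the alarm responds to a developing fault rather than to a one-off spike.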


The mean time between failures (MTBF) increased from 2,100 hours to in excess of 20,000 hours, and uptime increased to 99.999%. Planned shutdowns reduced maintenance costs by £15k pa per unit and avoided potential system failure costs of £10k per minute in lost transactions.
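
To put these figures in context, the short calculation below converts the MTBF values into expected failures per year and the uptime percentage into minutes of downtime per year. It is a rough sketch that assumes continuous operation over a 365-day year; the function names are illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: continuous operation, 365-day year


def expected_failures_per_year(mtbf_hours):
    """Average number of failures per year for a given MTBF."""
    return HOURS_PER_YEAR / mtbf_hours


def downtime_minutes_per_year(uptime_percent):
    """Minutes of downtime per year implied by an uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR * 60


before = expected_failures_per_year(2_100)
after = expected_failures_per_year(20_000)
downtime = downtime_minutes_per_year(99.999)

print(f"Failures per year before: {before:.2f}")
print(f"Failures per year after:  {after:.2f}")
print(f"Downtime at 99.999% uptime: {downtime:.2f} minutes per year")
```

Under these assumptions, the MTBF improvement corresponds to a drop from roughly 4.2 expected failures per year to fewer than 0.44, and 99.999% uptime corresponds to about 5.3 minutes of downtime per year.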