Predictive maintenance: a technological breakthrough owing its advent to the development of the data technologies now associated with machine learning…
AUSY is increasingly involved in a growing number of predictive maintenance projects, mainly for the aviation sector.
Applications in this area are numerous and mainly involve:
- aircraft;
- ground equipment;
- resources and equipment for engineering and manufacturing.
In all cases, the primary concern is straightforward: avoid failures. To do so, we need to identify potential weak spots before malfunctions occur. The second major concern is economic: improving the performance and operation of the platforms and equipment.
The main benefits of predictive maintenance thus include both reductions in the costs of engineering, manufacturing and operating, and also significant improvements in reliability and safety.
For instance, the systems currently in use within civil aircraft can significantly reduce fuel consumption (by typically between 1% and 5%), increase the fleets’ operational reliability and availability, and provide greater security with the automation of flight authorizations.
A predictive maintenance system has essentially three major components:
- a data acquisition and transfer solution;
- a “Big Data” infrastructure, including data and application security, for storing data and running compute-intensive calculations;
- combined analytics solutions to analyze data in detail and produce predictions and recommendations.
AUSY is involved at various levels in these three areas.
Data acquisition and transfer
Data acquisition, the first indispensable stage, operates in two distinct ways. For equipment, a network of sensors is associated with a connectivity solution. For aircraft, a specialized computer is coupled with a high-capacity memory board.
For fixed and mobile equipment, the data is transmitted directly as it is acquired to an intermediate IoT platform. Connectivity relies on technologies such as 4G, WiFi or PLC (Power Line Communication), on standard networks (generally TCP/IP), or on specialized ultra-narrow-band, low-power networks, such as LPWANs (Low Power Wide Area Networks) like SIGFOX or LoRa. 5G and WiFi 6 will help ratchet up both performance and security, and are eagerly awaited.
For aircraft, where the acquisition device is on board, recent technological developments mean data can be recovered without removing the component, which used to be necessary.
Two types of transfer are commonly used at the moment:
- in real time (during flight), typically via a Satcom link, which limits the volume of data that can be transmitted, so only samples or partial data are sent;
- after each flight, via a 4G or WiFi connection: all the acquired data is transmitted.
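The difference between the two transfer modes can be illustrated with a minimal sketch. The function name `decimate_for_satcom` and the even-spacing policy are assumptions made for illustration only; they do not describe how any actual acquisition unit selects the samples it sends in flight.

```python
from typing import List, Tuple

Sample = Tuple[int, float]  # (timestamp in nanoseconds, measured value)


def decimate_for_satcom(samples: List[Sample], max_samples: int) -> List[Sample]:
    """Keep at most max_samples evenly spaced samples (hypothetical policy).

    In flight, the link budget only allows a subset of the data to be sent;
    after landing, the full list would be transmitted unchanged.
    """
    if max_samples <= 0:
        return []
    if len(samples) <= max_samples:
        return list(samples)
    step = len(samples) / max_samples
    return [samples[int(i * step)] for i in range(max_samples)]


# Ten samples acquired on board, but only five fit in the in-flight budget.
acquired = [(i, float(i)) for i in range(10)]
in_flight = decimate_for_satcom(acquired, 5)
post_flight = list(acquired)
```

Here the in-flight transfer carries every second sample, while the post-flight transfer carries all ten.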
For most of the solutions in use, operators are equipped with tablets that may be used (relatively) near the aircraft to initiate remote transfer. The data is either collected in intermediate secured boxes or is sent directly to the system that loads and stores the data. Usually, an off-line mode is available so this operation can be deferred.
Over the last few years, acquisition equipment has become much more powerful, allowing more and more parameters to be measured.
The FOMAX (Flight Operation and MAintenance eXchanger) unit, developed by Rockwell Collins and available since 2018, is today the market leader in terms of performance. It manages the mass acquisition and transmission of aircraft data and can record up to 30 gigabytes per flight (at a maximum rate of 10 gigabytes per hour). It incorporates a super router connected to a dictionary of the data produced by the avionics systems. Its measuring frequency varies from 1 to 400 hertz, virtually equivalent to real-time monitoring. Security is ensured by isolating the different interfaces and the different software layers, and by encrypting the data before it is sent.
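To put these figures in perspective, the short calculation below derives two quantities from the numbers quoted above: how long the unit can record at its maximum rate before reaching 30 gigabytes, and what that rate means per second. Decimal gigabytes (10^9 bytes) are assumed, since the source does not say which convention applies.

```python
GB = 10**9  # assumption: decimal gigabytes
MB = 10**6


def hours_to_fill(capacity_gb: float, rate_gb_per_hour: float) -> float:
    """Recording time, in hours, before the per-flight capacity is reached."""
    return capacity_gb / rate_gb_per_hour


def rate_mb_per_second(rate_gb_per_hour: float) -> float:
    """Convert an hourly rate in gigabytes into megabytes per second."""
    return rate_gb_per_hour * GB / 3600 / MB


hours = hours_to_fill(30, 10)        # 3.0 hours at the maximum rate
throughput = rate_mb_per_second(10)  # roughly 2.78 MB per second
```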
FOMAX was designed for the new generations of aircraft that natively supply a large number of parameters, but it can also be adapted to older designs.
Since 2017, it has gradually been retrofitted in A320s and A330s to increase the number of parameters monitored, and the results are impressive. Parameter numbers have risen from 400 to 24,000 and from 1,500 to 40,000 respectively. Figures for the new generations are even higher: 350,000 parameters for the A380 and up to 800,000 for the A350.
Big Data infrastructure
In view of the significant volume of data processed and manipulated by predictive maintenance systems, they must be based on a robust and scalable infrastructure, and all the existing implementations are based on Big Data architectures. They incorporate the three main functions: high-performance integration, storage and computation, in as secure an environment as possible.
At this stage, all options are open, and there seems to be no major emerging trend in terms of choice of architecture. We thus find implementations based on public and private clouds.
When the choice is for open source, Hadoop and Spark are the defaults. Hadoop is now more-or-less systematically associated with Spark for faster distributed computing: today, Spark is unequivocally the most powerful framework in the field. NoSQL (not only SQL) databases are used for storage, notably MongoDB, which is used in several systems currently in operation. For data integration, Talend is the leader for its proven, high-performance solutions.
A large-scale and now well-tried alternative is Amazon AWS, which offers three key services:
- S3 – Simple Storage Service, for data loading and storage;
- EC2 – Elastic Compute Cloud for sizing and configuring computing capacity dynamically. It can be used to build machines and execute applications simply. In particular, server instances can be started in a few minutes, and capacity can be increased or reduced depending on the computing requirements;
- AWS Lambda – A serverless computing service, with automatic capacity sizing and scaling. It enables tasks to be initiated by events.
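As an illustration of event-driven processing, the sketch below shows a Python Lambda handler reacting to S3 "object created" notifications, for example when a flight data file is uploaded. The handler only extracts the bucket and key of each uploaded object; the bucket name, the key layout and the idea of triggering ingestion this way are assumptions, not a description of any system mentioned in this article.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Entry point invoked by AWS Lambda for each S3 notification event."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
    # A real function would start loading or analysing the files here.
    return {"processed": len(uploaded), "objects": uploaded}


# Local check with a hand-written event (hypothetical bucket and key).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "flight-data-demo"},
                "object": {"key": "2020/flight+001.bin"}}}
    ]
}
result = handler(sample_event, None)
```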
Time-series databases have also appeared over the last few years. These solutions are particularly useful for managing data from repetitive measurements: they are DBMSs (DataBase Management Systems) designed specifically for time-series data, with optimized performance. They allow fast loading and querying of data of this type, with high availability, and have time functions to query data structures comprising series of measurements and values.
The best-known solution is InfluxDB, initially released in 2013. It is developed as open source under the MIT license.
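To give an idea of what such data looks like, the sketch below formats one measurement in the general shape of InfluxDB's text-based line protocol: a measurement name, optional tags, one or more fields and a timestamp. It is deliberately simplified: names and values are assumed to contain no spaces, commas or equals signs, so no escaping is applied, and only float fields are handled. The measurement, tag and field names are hypothetical.

```python
from typing import Dict


def to_line_protocol(measurement: str, tags: Dict[str, str],
                     fields: Dict[str, float], timestamp_ns: int) -> str:
    """Format one point as: measurement,tags fields timestamp (simplified)."""
    if not fields:
        raise ValueError("at least one field is required")
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "engine",
    {"aircraft": "demo-001", "engine": "2"},
    {"oil_pressure": 42.5, "oil_temp": 88.0},
    1600000000000000000,  # timestamp in nanoseconds
)
```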
Analytics
To be efficient, a predictive maintenance system relies on three major datasets which may be used to analyze:
- descriptive data for all the parts, components, sub-systems and systems in an aircraft, an item of equipment or a set of equipment (e.g. ground system or production line);
- historical data about parts and components throughout their life cycle;
- so-called “operational data” for aircraft and equipment.
The first two datasets are those normally found in any industrial maintenance system.
The third is of special interest to us in converting standard maintenance into predictive maintenance. It holds time-series data from repetitive measurements and is structured in a particular way. A time series comprises data that is ordered and organized in parameter/value pairs, each with an added timestamp. In the most advanced systems, the time granularity is expressed in nanoseconds, so that we can work on data acquired at a higher measuring frequency (virtually real time). This structure allows us to observe directly how a value has changed over time, and to analyze the trends for future changes.
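A minimal sketch of this structure follows: each record is a parameter/value pair with a nanosecond timestamp, and a least-squares slope gives a first, very rough view of the trend. The class and function names are illustrative assumptions, and a straight-line fit is only one simple way to look at a trend.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Measurement:
    timestamp_ns: int  # acquisition time in nanoseconds
    parameter: str     # e.g. "oil_pressure" (hypothetical name)
    value: float


def series_for(measurements: List[Measurement], parameter: str) -> List[Tuple[int, float]]:
    """Extract one parameter's series, ordered by timestamp."""
    return sorted((m.timestamp_ns, m.value)
                  for m in measurements if m.parameter == parameter)


def slope_per_second(points: List[Tuple[int, float]]) -> float:
    """Least-squares slope of value against time, in units per second."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    t0 = points[0][0]
    ts = [(t - t0) / 1e9 for t, _ in points]
    vs = [v for _, v in points]
    t_mean, v_mean = sum(ts) / len(ts), sum(vs) / len(vs)
    den = sum((t - t_mean) ** 2 for t in ts)
    if den == 0:
        raise ValueError("all timestamps are identical")
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(ts, vs))
    return num / den


data = [
    Measurement(2_000_000_000, "oil_pressure", 14.0),
    Measurement(0, "oil_pressure", 10.0),
    Measurement(1_000_000_000, "oil_temp", 80.0),
    Measurement(1_000_000_000, "oil_pressure", 12.0),
]
trend = slope_per_second(series_for(data, "oil_pressure"))
```

With these sample values, the oil pressure rises by 2 units per second.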
As regards the analytics toolset, there is no miracle solution, and those we have are very specific and customized for each business environment and each application area.
Nevertheless, we can draw up a list of the techniques most frequently used, although they are rarely all found together in any one maintenance system:
- Bayesian networks are probabilistic graphical models used to determine the probability of an event’s occurrence.
- Fault or breakdown trees, generally compiled from practical experience, help us profit from known causes of breakdowns and malfunctions.
- Knowledge bases (or expert systems) coupled with inference engines can link observations and parts or components, systems or sub-systems.
- Neural networks are a route to building predictive models.
- Advanced data visualization systems are also very effective for viewing time curves, observing discrepancies and anomalies, and for visually identifying low-level trends. In this last area, AUSY has developed data-representation models that can reveal clusters of isolated or inconsistent values.
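The probabilistic reasoning behind the first technique in this list can be illustrated with its smallest building block, Bayes' rule applied to a single alert. This is not a full Bayesian network, and all the probabilities below are invented for the example.

```python
def posterior_fault(prior_fault: float, p_alert_given_fault: float,
                    p_alert_given_ok: float) -> float:
    """P(fault | alert) computed with Bayes' rule."""
    p_alert = (p_alert_given_fault * prior_fault
               + p_alert_given_ok * (1.0 - prior_fault))
    if p_alert == 0:
        raise ValueError("the alert has zero probability under these inputs")
    return p_alert_given_fault * prior_fault / p_alert


# Hypothetical figures: 1% of components are faulty, the alert fires for
# 90% of faulty components and for 5% of healthy ones.
p = posterior_fault(prior_fault=0.01, p_alert_given_fault=0.9, p_alert_given_ok=0.05)
```

With these invented figures, an alert raises the probability of a fault from 1% to about 15%, which shows why a single alert is rarely enough to justify an intervention.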
When using neural-network and deep-learning techniques, the first step is to construct a model of “normal” functioning. This presupposes that a large amount of historical data for the largest possible number of cases is already available.
By “normal functioning” we mean a set of patterns for changes in parameters measured over time that are constant for a set of operated flights (for instance for an aircraft) during which no particular anomaly has been detected. This may cover different types of functioning: take-off, ascent, flight, descent, landing, etc.
The basic model is adapted and validated for each time series, i.e. for each parameter.
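As a deliberately simple stand-in for such a model, the sketch below summarizes each parameter by the mean and standard deviation of its values over anomaly-free history. The article describes neural-network models; plain statistics are used here only to make the idea of a per-parameter "normal" baseline concrete. Parameter names and values are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Dict, List, Tuple


def build_baseline(history: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Return {parameter: (mean, standard deviation)} from anomaly-free data."""
    baseline = {}
    for parameter, values in history.items():
        if len(values) < 2:
            raise ValueError(f"not enough history for {parameter}")
        baseline[parameter] = (fmean(values), pstdev(values))
    return baseline


normal_history = {
    "oil_pressure": [10.0, 12.0, 14.0],
    "oil_temp": [80.0, 80.0, 80.0],
}
baseline = build_baseline(normal_history)
```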
The basic model is then extended and made more complex by building co-occurrence matrices for several series where the parameters are interdependent (e.g. oil pressure and temperature). The final result is an analysis of the overall functioning.
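The source does not specify how these matrices are built. As one possible illustration of capturing interdependence between series, the sketch below computes a Pearson correlation matrix across parameters; this is a swapped-in, simpler technique, and the parameter names and values are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(series: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise correlations between all parameters."""
    names = sorted(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


matrix = correlation_matrix({
    "oil_pressure": [40.0, 42.0, 44.0, 46.0],
    "oil_temp": [80.0, 84.0, 88.0, 92.0],
    "vibration": [4.0, 3.0, 2.0, 1.0],
})
```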
During the first stage of analysis, we consider an anomaly to be any deviation from normal functioning. We assess the size of the deviation and hence the potential seriousness of the anomaly simply by calculating the distance.
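The distance calculation can be sketched as follows, assuming normal functioning is summarized by a mean and a standard deviation for the parameter concerned. The distance is then the number of standard deviations separating the observed value from the normal one. The warning and alarm thresholds are hypothetical and would need tuning for each item of equipment.

```python
def deviation(value: float, normal_mean: float, normal_std: float) -> float:
    """Distance from normal functioning, in standard deviations."""
    if normal_std <= 0:
        raise ValueError("normal_std must be positive")
    return abs(value - normal_mean) / normal_std


def severity(distance: float, warn: float = 2.0, alarm: float = 3.0) -> str:
    """Map a distance onto a coarse severity level (hypothetical thresholds)."""
    if distance >= alarm:
        return "alarm"
    if distance >= warn:
        return "warning"
    return "normal"


d = deviation(18.0, normal_mean=12.0, normal_std=2.0)  # 3.0
level = severity(d)
```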
The time taken to build an initial model and generate efficient algorithms is generally between 2 and 3 months. It is important to stress that each model focuses on a very specific range of values and parameters and therefore relates to a particular item of the equipment or platform under analysis (landing gear, rotor or avionics computer, etc.).
During the second stage, explanatory models are created to identify the causes of the anomalies or deviations detected. Here again, a neural network may be used to analyze the different factors that could contribute to a particular problem.
For instance, if the pressure in a fuel pump drops unexpectedly and slightly reduces the supply to an engine, our first analysis will identify a problem with the engine, which we will then correlate with a pump malfunction.
The third stage allows us to make predictions: we run the explanatory model and compare it with the normal model, to identify early signs of a component failure and deduce from it the best time to intervene.
In the case of the pump and the engine, if the problem worsens or is present during several flights, we can probably estimate the actual impact and breaking point, and recommend that the pump be repaired or replaced immediately, even if theoretically it is supposed to work for “some time” longer.
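To make the idea of estimating a breaking point concrete, the sketch below fits a straight line to a degrading indicator recorded once per flight (for instance pump pressure) and extrapolates the number of flights remaining before it falls below a minimum threshold. Linear degradation, the threshold and the sample values are all assumptions; real explanatory and predictive models are far richer.

```python
from typing import List, Optional


def flights_until_threshold(values: List[float], minimum: float) -> Optional[float]:
    """Estimate flights left before the fitted trend drops below `minimum`.

    Returns None when the trend is flat or rising (no crossing predicted).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two flights")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    den = sum((x - x_mean) ** 2 for x in range(n))
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values)) / den
    if slope >= 0:
        return None
    intercept = y_mean - slope * x_mean
    crossing_flight = (minimum - intercept) / slope
    # Count from the most recent flight (index n - 1); never go negative.
    return max(crossing_flight - (n - 1), 0.0)


# Pump pressure over four flights, dropping by 2 units per flight.
remaining = flights_until_threshold([100.0, 98.0, 96.0, 94.0], minimum=80.0)
```

In this example the trend reaches the threshold 7 flights after the last recorded one, which would be the window available for planning the intervention.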
The final stage is to store the projections and re-populate the models so that they are continuously improved. At this point, deep-learning techniques are highly recommended, because they use neural networks that can learn independently (that is, without supervision). Given the volume and variety of the data processed, this is proving essential.
Over the last few years, offers of predictive maintenance services have proliferated on the aeronautics market. Rolls-Royce is probably one of the frontrunners, with Engine Health Management, marketed since the beginning of the 2010s. Today, initiatives and offerings from manufacturers, system providers (OEMs) and operators are multiplying: Airbus with Skywise, Safran Analytics with BOOST, Air France Industries KLM Engineering & Maintenance (AFI KLM E&M) with Prognos, and also Michelin, which now sells tyres that are invoiced based on the number of landings.
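Continuous improvement can be illustrated, in a much simpler form than deep learning, by a baseline that is updated incrementally each time a new anomaly-free value is confirmed. The sketch uses Welford's online algorithm for the running mean and variance; the class name is hypothetical, and this stands in for, rather than reproduces, the model re-population described above.

```python
class RunningBaseline:
    """Mean and variance updated one value at a time (Welford's algorithm)."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of the values seen so far."""
        return self._m2 / self.count if self.count else 0.0


model = RunningBaseline()
for confirmed_normal_value in (10.0, 12.0, 14.0):
    model.update(confirmed_normal_value)
```

The baseline never needs the full history to be reloaded, which matters given the data volumes involved.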
This step change and these investments are more than justified by the results already obtained. A fault is now detected between 10 and 20 flights before the actual failure. The wholesale reduction in unplanned maintenance means that aircraft downtime is minimal.
The other strength is the high reliability of the predictions: in all published cases the fault found was confirmed by the supplier’s diagnosis (of the parts or components concerned).
It is also very effective for remedial maintenance after a failure: a maintenance system based on predictive models allows the causes to be identified in 5 minutes, rather than 6 hours for standard methods.
In addition, the systems already commissioned have proved very valuable in improving engineering and manufacturing. One example is the A350, where a design fault in an equipment item was resolved in 2 weeks rather than the normal 24 months. The costs of poor quality have been reduced by 30%, and production costs and development time are now significantly lower (by 20% to 30%).
Airbus’s projections for future programmes are even more ambitious, aiming to reduce both cycle time and development costs by 50% between 2025 and 2030.
The growing ambition of all those involved in the sector, but above all the operators, is to fly each aircraft for 18 hours per day - or even more.
The Holy Grail for predictive maintenance is to be able to issue automated diagnostics in real time, and activate a search system for low-level trends during flights. We are told this is planned for 2021.