Nowadays, the maintenance of industrial systems relies on the principle of the Digital Twin to predict future issues. Sensors placed on the different parts of the system emit signals that characterize its behavior. Studying these time signals in real time allows us to detect unusual behaviors that precede a breakdown. However, sensor signals can be unreliable because of an incorrectly placed sensor, a malfunctioning sensor, or an issue during signal transmission. The study can consequently be biased and the conclusions wrong. It is therefore important to be able to score the reliability of a signal before studying it.
In this work, we develop a solution that draws on the state of the art of machine learning for outlier detection to detect anomalies in sensor signals. Our method must be unsupervised, meaning that we cannot train models on behaviors labelled as normal or abnormal, because producing such labels would require an expert to meticulously examine all the generated data. It must also be generalizable, which implies being able to deal with many very different types of signals. On top of that, we must be able to distinguish between “normal” system anomalies (which can be seen as causes of breakdowns) and sensor anomalies. This latter constraint should be addressed by jointly studying the various signals of the whole system.
A study of the outlier detection field
An outlier is usually described as “an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism” [Hawkins, 1980]. [Chandola, Banerjee, Kumar, 2007] defined three types of outliers:
- Point anomalies: an individual data instance that is anomalous with respect to the rest of the dataset.
- Contextual anomalies: an individual data instance that is anomalous with respect to the context in which it is observed.
- Collective anomalies: a group of related data instances that is anomalous with respect to the entire dataset, even if the individual instances are not anomalous by themselves.
The paper also described three types of outlier detection methods, depending on the availability of labels for the data:
- Supervised methods, which learn models from data instances labelled at least as normal or abnormal.
- Semi-supervised methods, which learn a model of normality using only normal instances, or sometimes a model of abnormality using only abnormal instances.
- Unsupervised methods, which receive no labels and must learn from the observation of the dataset alone.
The output of such methods can be:
- A score measuring the degree of abnormality of each instance.
- A label (normal or anomalous), which can be obtained, for example, by thresholding the score, as illustrated below.
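As a minimal illustration of the second option, a score can be turned into a label with a fixed threshold; the scores and the threshold value below are made up for the example, not taken from our system:

```python
import numpy as np

# Made-up anomaly scores for five instances.
scores = np.array([0.02, 0.11, 0.97, 0.05, 0.73])

# An arbitrary threshold, e.g. chosen by an analyst on validation data.
threshold = 0.5
labels = np.where(scores > threshold, "anomalous", "normal")
print(list(zip(scores, labels)))
```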
Work done so far
So far, we have mainly worked on data from an industrial case (a luggage conveyor from ALSTEF Automation) and a facility-management case (one of our smart buildings). For the conveyor, the monitored signals include:
- The speed of the belt
- The engine intensity
- The oil temperature
- The engine temperature
Point anomaly detection can be performed with statistical or density-based methods on the signal values (see the sketch below). However, the shape of the signal mainly depends on the operating mode of the conveyor, which is not given.
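As a sketch of one such statistical method (a classical z-score rule, used here for illustration rather than as our exact method), values lying too far from the signal's mean are flagged:

```python
import numpy as np

def zscore_point_anomalies(values, z_max=3.0):
    """Flag values whose z-score exceeds z_max (a simple statistical rule)."""
    mean, std = values.mean(), values.std()
    z = np.abs(values - mean) / std
    return z > z_max

# Toy oil-temperature signal with one injected spike at the end.
signal = np.concatenate([np.random.normal(60.0, 0.5, 500), [95.0]])
print(np.where(zscore_point_anomalies(signal))[0])  # index of the spike
```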
Our solution must be able to learn the normal patterns in order to detect contextual anomalies. To this end, we trained models that learn to predict the values from locally computed temporal context features such as the mean, the standard deviation, or the median (a minimal sketch follows).
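A minimal sketch of this idea, assuming a univariate pandas Series; the window size, the feature set, and the regression model are illustrative choices, not our actual configuration:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic signal standing in for a real sensor stream.
signal = pd.Series(np.sin(np.linspace(0, 20, 2000)) + np.random.normal(0, 0.05, 2000))

window = 50
features = pd.DataFrame({
    "mean": signal.rolling(window).mean(),
    "std": signal.rolling(window).std(),
    "median": signal.rolling(window).median(),
}).shift(1)  # use only past context to predict the current value

mask = features.notna().all(axis=1)
model = RandomForestRegressor(n_estimators=50).fit(features[mask], signal[mask])

# The larger the residual, the more contextually anomalous the value.
residuals = np.abs(signal[mask] - model.predict(features[mask]))
scores = residuals / residuals.max()
```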
To detect collective anomalies, we trained recurrent neural networks to predict sequences of data (sketched below).
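The following is a minimal PyTorch sketch of sequence prediction with an LSTM; the window length, layer sizes, and random training data are placeholders for real sensor windows, not our actual setup:

```python
import torch
import torch.nn as nn

class SequencePredictor(nn.Module):
    """Predict the next value of a univariate sequence from its recent past."""
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):             # x: (batch, seq_len, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # use the last hidden state

model = SequencePredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# One training step on random data, standing in for real sensor windows.
x = torch.randn(64, 30, 1)  # 64 windows of 30 past values
y = torch.randn(64, 1)      # the value following each window
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```

At detection time, a large gap between the predicted and the observed sequence can then be interpreted as a collective anomaly.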
Because we want to give a reliability score for each sensor at every moment, we chose a score as the output of our method.
In the illustrated example, taking the context into account leads to a high score at the beginning and the end of the operating phase, and to the detection of the abnormally low value.
Finally, we developed an algorithm that learns to successively isolate the three types of outliers until the remaining dataset is theoretically normal (one schematic reading is sketched below). We now want to evaluate the performance of our models.
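Our algorithm is only summarized above; as a purely hypothetical sketch of such an iterative isolation loop, assuming generic detector functions that each return a score per instance, it could be structured as follows:

```python
import numpy as np

def iterative_isolation(data, detectors, threshold=0.9, max_rounds=10):
    """Hypothetical sketch: repeatedly apply detectors (e.g. for point,
    contextual and collective anomalies) and remove the flagged instances,
    until nothing scores above the threshold. This is a schematic reading,
    not the exact algorithm described in the text."""
    remaining = np.arange(len(data))
    for _ in range(max_rounds):
        scores = np.max([d(data[remaining]) for d in detectors], axis=0)
        flagged = scores > threshold
        if not flagged.any():
            break  # the remaining data is considered normal
        remaining = remaining[~flagged]
    return remaining  # indices of the theoretically normal instances
```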