How to apply the Christoffel-Darboux kernel to online anomaly detection with little parameterization

September 23, 2021

Artificial Intelligence, BL.Predict, IoT

3 minutes read

Kevin Ducharlet is a Ph.D. candidate in the DRIT team. A year and a half ago, he started his thesis, entitled “Certification and confidence in sensor data: detection of outliers and abnormal values in time series.” Sensor data are generated by devices that measure the behaviour of a physical asset. This information can be fed into another system or used to guide a process. The final objective of his project is to certify the quality of sensor data by developing anomaly detection software that suggests a solution to the software it is paired with.

To certify sensor data, we chose to work on anomaly detection, which makes it possible to assign a normality score to each measurement. We started working on the method presented in this article in order to develop a generic method that is easy to interpret and applicable to any industrial system without specific parameterization, a requirement that is hard to meet in the state of the art. The method uses the Christoffel-Darboux kernel to obtain an envelope around a point cloud. It has not been used much in data analysis until now, although it has great assets for anomaly detection in multivariate time series. (A multivariate time series has more than one time-dependent variable measuring a phenomenon. Each variable depends not only on its own past values but also on the other variables, and this dependency is used to forecast future values.)

The characteristics of this solution are:

The model is built with a single integer parameter, d, a considerable advantage over other methods.
The model is responsive: it can be updated very quickly. With a low number of variables, the model scores an observation before the next one comes in.
Whereas most methods need a contamination rate to set the decision threshold on the score, this method provides a reference threshold that depends on the parameter d and the number of variables. However, the quality of the results obtained with this threshold depends on the application and on the chosen value of d.
The computational complexity does not depend on the number of observations, but on the number of variables and the parameter d. This is a great asset for large datasets with few variables.

(Source: Lasserre, J. B., & Pauwels, E. (2019). The empirical Christoffel function with applications in data analysis. Advances in Computational Mathematics, 45(3), 1439-1468.)
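As an illustration of the characteristics listed above, here is a minimal Python sketch of an empirical Christoffel function score. It is not the team's implementation: the class name `ChristoffelScorer`, the helper `monomial_features`, the plain monomial basis and the synthetic data are all assumptions made for this example. The score of a point x is v(x)^T M^{-1} v(x), where v(x) stacks the monomials of degree at most d and M is the empirical moment matrix of the observations.

```python
from itertools import combinations_with_replacement
from math import comb

import numpy as np


def monomial_features(X, d):
    """Evaluate all monomials of total degree <= d at each row of X."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    n, p = X.shape
    columns = [np.ones(n)]  # degree-0 monomial
    for degree in range(1, d + 1):
        for idx in combinations_with_replacement(range(p), degree):
            columns.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(columns)  # shape (n, s), with s = C(p + d, d)


class ChristoffelScorer:
    """Hypothetical helper: empirical Christoffel score with one parameter d."""

    def __init__(self, d):
        self.d = d
        self.n = 0
        self.moment_sum = None  # running sum of v(x) v(x)^T

    def update(self, X):
        """Add observations without revisiting the ones already seen."""
        V = monomial_features(X, self.d)
        if self.moment_sum is None:
            self.moment_sum = np.zeros((V.shape[1], V.shape[1]))
        self.moment_sum += V.T @ V
        self.n += V.shape[0]
        return self

    def score(self, X):
        """Return v(x)^T M^{-1} v(x) for each row x of X."""
        V = monomial_features(X, self.d)
        M = self.moment_sum / self.n  # empirical moment matrix, size s x s
        return np.einsum("ij,ij->i", V, np.linalg.solve(M, V.T).T)


# Synthetic example (assumed data): 2 variables, d = 3.
rng = np.random.default_rng(0)
train = rng.uniform(-1.0, 1.0, size=(500, 2))
scorer = ChristoffelScorer(d=3).update(train)

n_monomials = comb(2 + 3, 3)  # size of the moment matrix
inlier_score, outlier_score = scorer.score([[0.0, 0.0], [3.0, 3.0]])
```

The model state is a single matrix whose size is the number of monomials, C(p + d, d) for p variables, which is why the cost of an update or a score depends on p and d rather than on the number of past observations. By construction, the average score over the observations used to build the model equals that number of monomials.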

To get around the limits of this threshold, we use another property of the score function. If we build several models with different values of the parameter d, the score grows polynomially with d for regular points and exponentially for anomalies. Based on this, we can build several models and study the growth of the score to make a decision without fixing a threshold. Even though every model has to be updated and every observation scored by each of them, computation remains fast in this application.
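The idea of studying the score growth can be sketched as follows. This is only an assumed illustration on synthetic data: the helper names, the range of degrees and the use of the mean log-growth as a summary are choices made for this example, not the decision rule used in the project.

```python
from itertools import combinations_with_replacement

import numpy as np


def monomial_features(X, d):
    """Evaluate all monomials of total degree <= d at each row of X."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    n, p = X.shape
    columns = [np.ones(n)]
    for degree in range(1, d + 1):
        for idx in combinations_with_replacement(range(p), degree):
            columns.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(columns)


def christoffel_score(train, points, d):
    """Score v(x)^T M^{-1} v(x) of each point for a model of degree d."""
    V_train = monomial_features(train, d)
    M = V_train.T @ V_train / len(train)  # empirical moment matrix
    V = monomial_features(points, d)
    return np.einsum("ij,ij->i", V, np.linalg.solve(M, V.T).T)


# Synthetic data (assumed): a uniform cloud, one regular point, one anomaly.
rng = np.random.default_rng(0)
train = rng.uniform(-1.0, 1.0, size=(2000, 2))
points = np.array([[0.2, -0.1], [1.8, 1.8]])

degrees = [1, 2, 3, 4, 5]
# One row per degree, one column per point.
scores = np.array([christoffel_score(train, points, d) for d in degrees])

# Increase of log(score) between consecutive degrees, per point.
log_growth = np.diff(np.log(scores), axis=0)
mean_log_growth = log_growth.mean(axis=0)  # [regular point, anomaly]
```

For a point inside the cloud, the increments in `log_growth` are expected to shrink as d increases, whereas for a point far outside they stay large. Comparing the two columns gives a way to separate the points without choosing a threshold on the score itself.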

What comes next?

Numerical instabilities may appear depending on how the data are standardized, and we are working on this issue. We will then write a scientific publication on this work and its application to real data, in order to get feedback on the method.
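One plausible source of such instabilities, stated here as an assumption rather than as the team's diagnosis, is the conditioning of the moment matrix when variables have very different scales. The sketch below compares the condition number of a monomial moment matrix on raw and on min-max scaled synthetic data. The helper names and the data are hypothetical.

```python
from itertools import combinations_with_replacement

import numpy as np


def monomial_features(X, d):
    """Evaluate all monomials of total degree <= d at each row of X."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    n, p = X.shape
    columns = [np.ones(n)]
    for degree in range(1, d + 1):
        for idx in combinations_with_replacement(range(p), degree):
            columns.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(columns)


def moment_condition_number(X, d):
    """Condition number of the empirical moment matrix of degree d."""
    V = monomial_features(X, d)
    return np.linalg.cond(V.T @ V / len(X))


# Synthetic data (assumed): two variables with very different ranges.
rng = np.random.default_rng(0)
raw = np.column_stack([
    rng.uniform(0.0, 1000.0, size=1000),
    rng.uniform(0.0, 0.01, size=1000),
])

# Min-max scaling of each variable to [-1, 1].
low, high = raw.min(axis=0), raw.max(axis=0)
scaled = 2.0 * (raw - low) / (high - low) - 1.0

cond_raw = moment_condition_number(raw, d=4)
cond_scaled = moment_condition_number(scaled, d=4)
```

A very large condition number means that inverting the moment matrix amplifies rounding errors, so the resulting scores become unreliable. Rescaling the variables before building the model reduces it considerably in this example.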

To be continued…
