Software Traces for Detecting Frauds in BL.Enfance

June 5, 2020

Traceability

4 minutes read

Fraudulent behavior can be detected from application traces. In this work, we used traces from BL.Enfance, one of our software products dedicated to the billing and management of children’s activities. We focused specifically on detecting changes to variables such as the family quotient (Q-CAF) and the billing period, which can be altered to change invoice amounts.

BL.Enfance Fraud Case & Traces’ Structure

Among the possible fraud cases in the BL.Enfance application, “fraud over the CAF quotient” is the easiest to handle. It consists of altering a CAF quotient over a billing period that has already been completed (for the same payer). This fraud scenario is generally carried out through the following three use cases:

UC1: “giving a Q-CAF value over a specific period”
UC2: “bill calculation over the same period”
UC3: “modification of the given Q-CAF for the previously calculated period”

To detect the execution of each of these three use cases, we relied on four specific events:

CREATION_QUOTIENT: The modification of a CAF quotient
SUPPRESSION_QUOTIENT: The deletion of a CAF quotient
FACTURATION_CALCUL_FACTURE_INDIVIDUELLE: The computation of one individual bill
FACTURATION_SUPPRESSION_FACTURE_INDIVIDUELLE: The deletion of one individual bill

BL.Enfance traces are composed of nine attributes, as depicted here.

Fraud detection as a Time-Series Analysis

Before they are fed to the proposed fraud detection prototype, the traces go through two phases:

Trace extraction phase: we keep the events whose “action” field has one of the following values:

CREATION_QUOTIENT
SUPPRESSION_QUOTIENT
FACTURATION_CALCUL_FACTURE_INDIVIDUELLE
FACTURATION_SUPPRESSION_FACTURE_INDIVIDUELLE

The extraction is performed on the MongoDB dataset.
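The extraction filter can be sketched as follows. This is a minimal illustration on in-memory dictionaries rather than on the MongoDB collection itself. The field name `action` comes from the description above; the function name `extract_traces`, the sample records, and the `CONNEXION` action are hypothetical.

```python
# Actions kept by the extraction phase (names taken from the list above).
FRAUD_RELATED_ACTIONS = {
    "CREATION_QUOTIENT",
    "SUPPRESSION_QUOTIENT",
    "FACTURATION_CALCUL_FACTURE_INDIVIDUELLE",
    "FACTURATION_SUPPRESSION_FACTURE_INDIVIDUELLE",
}


def extract_traces(traces):
    """Keep only the traces whose "action" field is one of the four events."""
    return [t for t in traces if t.get("action") in FRAUD_RELATED_ACTIONS]


# Hypothetical sample records; "CONNEXION" is a made-up unrelated action.
sample = [
    {"action": "CREATION_QUOTIENT", "IdClient": "c1"},
    {"action": "CONNEXION", "IdClient": "c1"},
    {"action": "FACTURATION_CALCUL_FACTURE_INDIVIDUELLE", "IdClient": "c2"},
]
kept = extract_traces(sample)
print([t["action"] for t in kept])
```

On the MongoDB side, the same selection would typically be written as an `$in` filter on the `action` field.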

Anonymization phase: the following fields are anonymized:

Login
IP
ValeurActuelle
IdClient

These fields are hashed with a SHA-256 hash function (SHA-2 family), which is reproducible and non-invertible.
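A minimal sketch of this anonymization, using Python's standard `hashlib` module, could look like the following. The four field names come from the list above; the function name `anonymize` and the sample values are assumptions made for illustration, and the sketch does not claim to reproduce the exact encoding used in the prototype.

```python
import hashlib

# Fields anonymized in the traces (names taken from the list above).
ANONYMIZED_FIELDS = ("Login", "IP", "ValeurActuelle", "IdClient")


def anonymize(trace):
    """Return a copy of the trace with sensitive fields replaced by SHA-256 digests."""
    result = dict(trace)
    for field in ANONYMIZED_FIELDS:
        if result.get(field) is not None:
            value = str(result[field]).encode("utf-8")
            result[field] = hashlib.sha256(value).hexdigest()
    return result


# Hypothetical trace; the values are made up for illustration.
trace = {"Login": "alice", "IP": "192.0.2.10", "action": "CREATION_QUOTIENT"}
anonymized = anonymize(trace)
print(anonymized["Login"])
```

Because the hash is reproducible, two traces with the same original value still share the same digest, so events can be grouped by user or client after anonymization.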

We also need to consider two further elements. The first is the user who initiated the actions, to see whether the alteration of the family quotient and the invoicing in a suspicious case come from the same user. The second is the time elapsed between actions: in a suspicious case, the delay between the alteration of the family quotient and the billing should be short.
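These two criteria can be illustrated with a small check. This is only a sketch: the `Login` field is the (hashed) field mentioned above, whereas the `date` field name, the function name `is_suspicious_pair`, and the one-hour threshold are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def is_suspicious_pair(quotient_event, billing_event, max_delay=timedelta(hours=1)):
    """Flag a quotient change and a billing made by the same user within max_delay."""
    same_user = quotient_event["Login"] == billing_event["Login"]
    delay = abs(billing_event["date"] - quotient_event["date"])
    return same_user and delay <= max_delay


# Hypothetical events: same (hashed) login, 20 minutes apart.
quotient = {"Login": "h1", "date": datetime(2020, 3, 2, 10, 0)}
billing = {"Login": "h1", "date": datetime(2020, 3, 2, 10, 20)}
other_user = {"Login": "h2", "date": datetime(2020, 3, 2, 10, 20)}

print(is_suspicious_pair(quotient, billing))
print(is_suspicious_pair(quotient, other_user))
```

The threshold is a parameter on purpose: what counts as a "short" delay depends on how the software is used in practice.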

A Simple 5-Step Algorithm

Step 1: Read the anonymized data and build a correspondence table: payer => ordered list of events related to this payer.

Step 2: Build the invoiced activity periods from the events FACTURATION_CALCUL_FACTURE_INDIVIDUELLE and FACTURATION_SUPPRESSION_FACTURE_INDIVIDUELLE.

Step 3: Build the periods covered by the Q-CAF quotients from the events CREATION_QUOTIENT and SUPPRESSION_QUOTIENT.

Step 4: Overlay the two sets of periods.

Step 5: Detect suspicious contexts according to the order of actions performed in the software (for the same payer).
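The five steps above can be sketched as follows. This is an illustrative reading of the algorithm, not the prototype's actual code. Since the nine trace attributes are not listed in the text, the field names `payer`, `timestamp`, and `period` (a `(start, end)` pair of dates) are assumptions, as are the function names and the sample data.

```python
from collections import defaultdict
from datetime import date, datetime

BILL = "FACTURATION_CALCUL_FACTURE_INDIVIDUELLE"
UNBILL = "FACTURATION_SUPPRESSION_FACTURE_INDIVIDUELLE"
QUOTIENT_ACTIONS = ("CREATION_QUOTIENT", "SUPPRESSION_QUOTIENT")


def group_by_payer(traces):
    """Step 1: map each payer to its events, ordered by timestamp."""
    table = defaultdict(list)
    for trace in traces:
        table[trace["payer"]].append(trace)
    for events in table.values():
        events.sort(key=lambda e: e["timestamp"])
    return table


def overlaps(a, b):
    """Step 4: two (start, end) periods overlap if each starts before the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def detect_suspicious(traces):
    """Steps 2 to 5: flag quotient events that touch an already billed period."""
    alerts = []
    for payer, events in group_by_payer(traces).items():
        billed = []  # Step 2: periods currently covered by a bill
        for event in events:
            action, period = event["action"], event["period"]
            if action == BILL:
                billed.append(period)
            elif action == UNBILL and period in billed:
                billed.remove(period)
            elif action in QUOTIENT_ACTIONS:
                # Steps 3 to 5: overlay the quotient period on the billed periods;
                # a quotient change arriving after the bill is suspicious.
                if any(overlaps(period, b) for b in billed):
                    alerts.append((payer, event["timestamp"], action))
    return alerts


january = (date(2020, 1, 1), date(2020, 1, 31))
# Hypothetical traces, deliberately unsorted to exercise step 1.
traces = [
    {"payer": "p1", "timestamp": datetime(2020, 2, 3, 9, 10),
     "action": "CREATION_QUOTIENT", "period": january},
    {"payer": "p1", "timestamp": datetime(2020, 2, 3, 9, 0),
     "action": "CREATION_QUOTIENT", "period": january},
    {"payer": "p1", "timestamp": datetime(2020, 2, 3, 9, 5),
     "action": BILL, "period": january},
    {"payer": "p2", "timestamp": datetime(2020, 2, 4, 8, 0),
     "action": "CREATION_QUOTIENT", "period": january},
]
alerts = detect_suspicious(traces)
print(alerts)
```

In this sample, payer `p1` follows the UC1, UC2, UC3 sequence (quotient set, bill computed, quotient changed again on the same period), so only the last quotient event is flagged; payer `p2` never bills and raises no alert.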

Results and statistics

Below are some quick numbers on the results obtained on the BL.Enfance production database.

*Results obtained on the production data of BL.Enfance from early February to 04/23/2020*

Total number of traces: 35466
Number of payers: 25204
Number of activities per payer: 1.41 activities/payer
Frequency of payer appearance: how many payers have X activities
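As a quick sanity check, the average number of activities per payer can be recomputed from the two totals reported above (the variable names are ours):

```python
total_traces = 35466  # total number of traces reported above
payers = 25204        # number of payers reported above

activities_per_payer = total_traces / payers
print(round(activities_per_payer, 2))
```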

Percentage of each action (production data)

Number of actions per payer (production data)

Conclusion and future work

To conclude, we still have to work on another type of billing that BL.Enfance offers: “grouped billing”.
This is the type of billing where more fraud cases are possible in real life.
After this second step, we plan to generalize the fraud detection prototype using machine learning techniques or possibly graph navigation tools.
So far, however, we have not obtained access to the labeled traces that would allow us to continue this work.

While waiting for the needed traces, we investigated other “historical” fields. In the “CAF data” and “additional accounting data” forms, we identified 8 fields that may be candidate variables for fraud detection.
