What features are generated after the Engine processes the data in a transactional customer churn prediction app?

This article explains the features that are generated after the Engine processes the data in a transactional customer churn prediction app.

Features are known attributes used as input by machine learning models to predict the unknown target.

For the transactional option in the customer churn template app, the AI & Analytics Engine automatically generates a number of useful features from the customer transactional dataset and customer information dataset. These features represent various customer behavior statistics over different periods of time.

🎓 To learn more about the datasets needed for the customer churn template, read what datasets are required to use the transactional option in the customer churn prediction template?

Different types of aggregated features are generated over the selected time windows:

Amount-based features: minimum, maximum, standard deviation, and total amount spent and refunded.
Count-based features: number of transactions.
Time-interval-based features: minimum, maximum, standard deviation, and average number of days between transactions.
Recency features: days since last transaction.
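To make these aggregations concrete, here is a minimal sketch of how such statistics could be computed for a single customer with the Python standard library. It is an illustration only, not the Engine's actual implementation. The function names, the (date, amount) input shape, the rule that positive amounts count as spending, the window boundaries, and the use of the population standard deviation are all assumptions made for this example. The feature names follow the example table later in this article.

```python
from datetime import date, timedelta
from statistics import mean, pstdev


def window_features(transactions, snapshot, days):
    """Aggregate one customer's spending over the `days` days before `snapshot`.

    `transactions` is a list of (date, amount) pairs. Treating positive
    amounts as spending is an assumption made for this sketch.
    """
    start = snapshot - timedelta(days=days)
    # Keep spending transactions that fall inside the window (start, snapshot].
    spend = sorted((d, a) for d, a in transactions if start < d <= snapshot and a > 0)
    amounts = [a for _, a in spend]
    # Gaps, in days, between consecutive transactions in the window.
    gaps = [(later - earlier).days for (earlier, _), (later, _) in zip(spend, spend[1:])]

    def stat(fn, values):
        # Statistics are undefined for an empty window, so report None.
        return fn(values) if values else None

    s = f"_last_{days}d"
    return {
        "min_amount_spent" + s: stat(min, amounts),
        "max_amount_spent" + s: stat(max, amounts),
        "total_amount_spent" + s: sum(amounts),
        "stddev_spent_amount" + s: stat(pstdev, amounts),  # population std dev: an assumption
        "number_of_spending_transactions" + s: len(amounts),
        "min_days_btw_transactions" + s: stat(min, gaps),
        "max_days_btw_transactions" + s: stat(max, gaps),
        "avg_days_btw_transactions" + s: stat(mean, gaps),
        "stddev_days_btw_transactions" + s: stat(pstdev, gaps),
    }


def days_since_last_transaction(transactions, snapshot):
    """Recency: days from the latest transaction on or before `snapshot` to `snapshot`."""
    past = [d for d, _ in transactions if d <= snapshot]
    return (snapshot - max(past)).days if past else None


# Made-up transactions for one customer, purely for illustration.
txns = [(date(2024, 1, 5), 10.0), (date(2024, 1, 20), 30.0), (date(2024, 1, 28), 20.0)]
snapshot = date(2024, 1, 31)
features = {**window_features(txns, snapshot, 30), **window_features(txns, snapshot, 15)}
features["days_since_last_transaction"] = days_since_last_transaction(txns, snapshot)
```

Calling the function once per selected time window and merging the results gives one row of features per customer, with the window length encoded in each feature name.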

The user can select the time windows used to generate these aggregated features when they are defining contributing factors.

Apart from aggregated features, additional demographic features from the customer info data are also included, if available.

For more information about contributing factors, read what do "contributing factors" mean in the customer churn prediction template?

Specifying two time windows. One is the most recent 30 days. The other is a 30-day range, 30 days ago.
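As a rough illustration of what two such windows mean in calendar terms, the sketch below converts a window length and an offset into start and end dates relative to a snapshot date. The helper `window_bounds`, its parameters, and the reading of "a 30-day range, 30 days ago" as the 30 days immediately before the most recent 30 days are assumptions made for this example. They do not describe how the Engine represents time windows internally.

```python
from datetime import date, timedelta


def window_bounds(snapshot, length_days, offset_days=0):
    """Return (start, end) dates of a window that ends `offset_days` before `snapshot`."""
    end = snapshot - timedelta(days=offset_days)
    start = end - timedelta(days=length_days)
    return start, end


snapshot = date(2024, 3, 31)  # hypothetical snapshot date
recent = window_bounds(snapshot, length_days=30)                   # most recent 30 days
previous = window_bounds(snapshot, length_days=30, offset_days=30) # the 30 days before that
```

Under these assumptions the two windows are adjacent and do not overlap, so features from each window describe separate periods of customer activity.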

As an example, consider the following credit-card transactions dataset and customer information dataset as input to the template:

Transaction dataset

Customer information dataset

From these inputs, the Engine generates the following features. For the transaction-activity-based features, suffixes such as _last_30d and _last_15d correspond to the time windows chosen when selecting the contributing factors:

Description	Feature names
Amount-based features	min_amount_spent_last_30d, min_amount_spent_last_15d, max_amount_spent_last_30d, max_amount_spent_last_15d, total_amount_spent_last_30d, total_amount_spent_last_15d, stddev_spent_amount_last_30d, stddev_spent_amount_last_15d
Count-based features	number_of_spending_transactions_last_30d, number_of_spending_transactions_last_15d
Time-interval-based features	min_days_btw_transactions_last_30d, min_days_btw_transactions_last_15d, max_days_btw_transactions_last_30d, max_days_btw_transactions_last_15d, avg_days_btw_transactions_last_30d, avg_days_btw_transactions_last_15d, stddev_days_btw_transactions_last_30d, stddev_days_btw_transactions_last_15d
Recency features	days_since_last_transaction
Features from snapshot time	current_month, current_week_number_in_year
Time-based features from customer info data	year_of_dob, month_of_dob, week_in_year_of_dob, weekday_of_dob, days_since_dob
Other features from customer info data	district_id, sex
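The snapshot-time and date-of-birth features in the table can be read as simple calendar decompositions. The sketch below shows one plausible way to derive them with Python's standard library. The function `date_features` is hypothetical, and the choices of ISO week numbering and Monday = 0 for the weekday are assumptions. The Engine's actual conventions may differ.

```python
from datetime import date


def date_features(dob, snapshot):
    """Derive calendar features from a date of birth and the snapshot date."""
    return {
        "current_month": snapshot.month,
        "current_week_number_in_year": snapshot.isocalendar()[1],  # ISO week: an assumption
        "year_of_dob": dob.year,
        "month_of_dob": dob.month,
        "week_in_year_of_dob": dob.isocalendar()[1],  # ISO week: an assumption
        "weekday_of_dob": dob.weekday(),  # Monday = 0: an assumption
        "days_since_dob": (snapshot - dob).days,
    }


# Made-up customer record, purely for illustration.
example = date_features(dob=date(1990, 5, 17), snapshot=date(2024, 1, 31))
```

Columns such as district_id and sex need no derivation in this sketch. They would be carried over from the customer information dataset as they are.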