Feature Importance in Machine Learning - The AI & Analytics Engine

Written by PI.EXCHANGE | May 16, 2024 6:27:02 AM

Feature importance is an important yet often overlooked concept in machine learning. In this blog, we’ll discuss what it is, its significance in understanding predictive machine learning models, and how you can take advantage of it in the AI & Analytics Engine.

What is feature importance in machine learning?

Machine learning models learn a mapping between a set of input features (columns of a training dataset) and the output predictions the model generates. However, not all features contribute equally to a prediction.

Feature importance is a method of calculating an estimated score that shows the relative impact of each feature on the generated predictions. These scores allow features to be ranked by the average influence each has on predictions.
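To make the idea concrete, here is a minimal sketch of permutation importance, one common way to estimate such scores. It is shown for illustration only and is not necessarily the method the Engine uses. The model `toy_model`, the helper names, and the randomly generated data are all hypothetical. The idea is to shuffle one feature column at a time and measure how much the prediction error grows: the bigger the increase, the more the model relies on that feature.

```python
import random

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Per feature, the average increase in error when that column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other column untouched
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            error = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical model: feature 0 matters a lot, feature 1 a little, feature 2 not at all
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]

# Hypothetical data: 200 rows, 3 features, targets taken from the toy model
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [toy_model(row) for row in X]

scores = permutation_importance(toy_model, X, y)
# Feature indices ordered from most to least influential
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(ranking)
```

Because the toy model ignores feature 2 entirely, shuffling it leaves the error unchanged, so it lands at the bottom of the ranking.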

Prediction explanation is a closely related model insight, which explains how specific feature values impact an individual prediction.

Why is feature importance useful?

Understanding feature importance values provides crucial insights into model performance and has two main benefits: model improvement and model interpretability.

Model improvement

Feature importance is a key piece in the model evaluation process, and can be used to iteratively improve the model.

Feature importance values might highlight irrelevant or redundant features that can be removed (dimensionality reduction). Reducing the number of features in this way can improve model accuracy by lowering the risk of overfitting, and can also reduce computation time.
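As a rough sketch of what this pruning step can look like in code, the snippet below keeps only the features whose importance reaches a chosen cutoff and drops the rest from the dataset. The feature names, the scores, and the 1% cutoff are all hypothetical. In practice the cutoff is a judgment call, and the retrained model should be re-evaluated on held-out data to confirm that accuracy has not suffered.

```python
def select_features(importance_pct, min_percent=1.0):
    """Return the names of features whose importance is at least min_percent."""
    return [name for name, pct in importance_pct.items() if pct >= min_percent]

def keep_columns(rows, keep):
    """Return a copy of the dataset containing only the selected columns."""
    return [{name: row[name] for name in keep} for row in rows]

# Hypothetical importance values, expressed as percentages
importance_pct = {"age": 55.0, "income": 44.5, "customer_id": 0.5}

# Hypothetical dataset with one row per record
rows = [
    {"age": 34, "income": 52000, "customer_id": 1001},
    {"age": 58, "income": 61000, "customer_id": 1002},
]

selected = select_features(importance_pct, min_percent=1.0)
reduced_rows = keep_columns(rows, selected)
print(selected)
print(reduced_rows[0])
```

The reduced dataset can then be used to retrain the model and compare its results against the original.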

Model interpretability

Oversight of how an ML model arrives at its predictions, otherwise known as model interpretability, is critical to ensure fairness, accountability and ethical use, especially in sensitive domains.

Feature importance values support the transparency of ML models by letting humans understand which features influence a prediction, and to what degree.

Additionally, having these values and rankings makes it easier to communicate model performance to other stakeholders.

The feature importance tool in the Engine

For any regression, binary classification or multi-class classification model, the feature importance tool sits within the AI & Analytics Engine’s Insights tab.

Simply click Generate, and the importance values will be presented as percentages in descending order from the most influential to the least.
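To illustrate what a percentage view like this represents, the sketch below scales a set of raw importance scores so they sum to 100 and sorts them from most to least influential. This is only an assumption about how such a display could be produced, not a description of the Engine's internals. The feature names and scores are hypothetical.

```python
def to_percentages(scores):
    """Scale scores to sum to 100 and sort from most to least influential."""
    # Treat negative scores (possible with some methods) as zero influence
    clipped = {name: max(score, 0.0) for name, score in scores.items()}
    total = sum(clipped.values())
    if total == 0:
        return [(name, 0.0) for name in clipped]
    percentages = {name: 100.0 * score / total for name, score in clipped.items()}
    return sorted(percentages.items(), key=lambda item: item[1], reverse=True)

# Hypothetical raw importance scores, in no particular order
raw_scores = {"postcode": 0.5, "age": 3.0, "income": 1.5}

report = to_percentages(raw_scores)
for name, pct in report:
    print(f"{name:<10} {pct:5.1f}%")
```

Expressing the scores as shares of a total makes them easier to compare at a glance than raw values, whose scale depends on the method used to compute them.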

For a multi-class classification model, you can specify the feature importance values for each possible class.

From here, you can decide whether to remove specific features by retraining the model with a manually selected set of features.