PI.EXCHANGE | Blog

8½ Questions to Ask if You Want Your AI/ML Project to Deliver Results

Written by Asad Enver | Jul 2, 2021 9:20:08 AM

There has been so much hype and buzz around artificial intelligence and machine learning that many companies jump on the bandwagon, thinking of them as magic spells that will transform their business and take it to previously unimaginable heights. More often than not, the bubble bursts, and companies realize AI/ML was not a good match for them in the first place.

We know it’s easy to get seduced by these technologies thinking they will usher in a golden age of unparalleled growth and prosperity. Just because everyone is talking about it or experimenting with it does not mean you should also invest your time and resources doing the same.

Here is a list of questions that you should have concrete answers to before deciding to proceed with your AI/ML initiatives. It’s always wiser to think along an extra dimension to clear your mind and be sure whether what you are embarking on will be fruitful or just a waste of energy.

Do you have labelled data?

Supervised machine learning algorithms such as gradient-boosted decision trees, support vector machines, and neural networks require labelled data, i.e., we must know both the input and the expected output in advance. That is how the model learns hidden patterns in the data and draws inferences. If you do not have access to labelled data, you can either opt for unsupervised machine learning algorithms such as K-Means clustering or annotate your data manually. Data annotation can be extremely costly and challenging, especially if you want it done by professionals and your data consists of text or images.

 

Your model’s performance ultimately depends on the quality of the data it ingests to learn and make predictions. Therefore, the data annotation process needs to be handled with utmost care and diligence.
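One way to check whether annotation is being handled carefully is to have two people label the same sample and measure how often they agree beyond what chance would produce. The sketch below computes Cohen's kappa for that purpose. It is only an illustration: the function name and the example labels are assumptions, not part of any particular annotation tool.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Share of items on which the two annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labelled at random with their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels from two annotators for the same four images.
annotator_1 = ["cat", "cat", "dog", "dog"]
annotator_2 = ["cat", "dog", "dog", "dog"]
print(cohen_kappa(annotator_1, annotator_2))
```

A value close to 1 suggests the labelling guidelines are clear; a value near 0 suggests the annotators agree no more often than chance, which is worth resolving before any model is trained.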

Do you have suspicions about the data quality?

The output of your machine learning models will only be as good as the quality of the data they are fed as input. There are two main aspects to this problem. The first is finding enough data for the model to learn from: neural networks are data-intensive algorithms and require huge amounts of data to train. With the advent of big data, getting hold of data does not seem to be a major concern anymore. The second is quality: poor-quality data often hinders companies from building robust predictive models.

The data residing in databases and data warehouses is often unstructured or semi-structured, and quality problems manifest themselves in several forms, including inconsistencies, duplicates, outliers, and missing values, making data validation a time-consuming process.

High-quality datasets are the key to building effective AI/ML models.
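A first pass over the problems listed above (missing values, duplicates, outliers) can be automated before any modelling starts. The following is a minimal sketch using only the Python standard library. The record layout, the field names, and the 1.5 × interquartile-range rule for outliers are assumptions chosen for illustration; real projects will need checks specific to their data.

```python
import statistics


def audit_records(records, numeric_field):
    """Count missing values and duplicate rows, and flag outliers in one field."""
    # A value of None stands for a missing entry in this sketch.
    missing = sum(value is None for row in records for value in row.values())
    # Two rows are duplicates when every field matches.
    unique_rows = {tuple(sorted(row.items())) for row in records}
    duplicates = len(records) - len(unique_rows)

    values = [row.get(numeric_field) for row in records]
    values = [v for v in values if v is not None]
    outliers = []
    if len(values) >= 4:
        q1, _, q3 = statistics.quantiles(values, n=4)
        spread = q3 - q1
        low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
        outliers = [v for v in values if v < low or v > high]
    return {"missing": missing, "duplicate_rows": duplicates, "outliers": outliers}


# Hypothetical transaction records.
records = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 12},
    {"id": 3, "amount": 11},
    {"id": 4, "amount": 13},
    {"id": 4, "amount": 13},    # exact duplicate of the previous row
    {"id": 5, "amount": None},  # missing value
    {"id": 6, "amount": 12},
    {"id": 7, "amount": 500},   # suspiciously large
]
print(audit_records(records, "amount"))
```

A report like this does not fix anything by itself, but it gives an early, concrete picture of how much cleaning work the project will involve.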

Do you have the right (business savvy) person to lead the AI/ML project?

Machine learning and especially deep learning are highly complex fields that require a lot of technical knowledge and understanding of the subject to solve real-world problems. However, this does not mean that the person who will manage the project should be a nerd who understands the algorithms inside out (although we love nerds). It is essential that the project is owned and executed by someone who has keen business acumen and who knows exactly the business problem that AI/ML will solve.

Having in-depth industry expertise and domain knowledge is much more important for your project to deliver results than having a background in machine learning. In the end, it’s about execution, and a business person is often (not always) better equipped to talk about the AI use case in plain language without using any technical jargon so the key stakeholders can quickly grasp how AI will drive business growth.

Do you have the right talent pool?

AI/ML is still in its evolutionary stages and there is a dearth of specialists in this field. It is extremely difficult, time-consuming, and costly to assemble a team with the right mix of skills, knowledge, and experience. You may need data engineers, data scientists, machine learning engineers, cloud engineers, and software engineers who can work on different aspects of the project.

Now, it might be your wish to find a person who can wear multiple hats and do all the things for you. But when reality strikes hard, you realize that no such ‘superman’ or ‘wonder woman’ exists and you will require different people who can translate your vision into a specific use case.  

Always try to have specialists in your team rather than generalists.

Do you have the computational resources?

AI is computationally very expensive, not by accident but by design. While deep learning can model diverse phenomena, making things such as self-driving cars and voice recognition possible, it also requires specialized hardware and processing power. The computational needs of deep learning scale rapidly. Training on more data generally helps the model achieve better performance, so you must be prepared to invest additional resources to keep training times manageable as more data becomes available.

Have you considered the ethical challenges?

When you leave the decision-making process to machines, do not expect them to take ethical implications into account. Developing AI applications means humans may have minimal control over the output. This raises serious questions about fairness and transparency.

For example, banks might use machine learning to decide who should be given a loan. The model’s output might not explain to a customer why they have been refused, which can raise doubts about discrimination. Therefore, it is important to incorporate the beliefs, values, and perspectives of all involved stakeholders when building the application to address the ethical concerns raised by the large-scale deployment of AI.
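For the loan example, one simple and far from sufficient check is to compare approval rates across customer groups. The sketch below assumes decisions are available as (group, approved) pairs; the group names and the data are hypothetical. A large gap does not prove discrimination, and a small one does not rule it out, but it flags where a closer look is needed.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the share of approved applications per group.

    `decisions` is an iterable of (group, approved) pairs.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}


def largest_gap(rates):
    """Difference between the highest and lowest approval rate."""
    return max(rates.values()) - min(rates.values())


# Hypothetical model decisions for two customer groups.
decisions = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]
rates = approval_rates(decisions)
print(rates, largest_gap(rates))
```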

Do you have a baseline model to serve as a benchmark?

What are you going to compare the results of your model against? What would constitute a good or satisfactory model? It is very important to have a baseline model that serves as a benchmark against which you can compare the performance of the model you will build. A baseline model is simple to set up, generates decent results, and is easier to interpret. It allows you to iterate and experiment without wasting much time. The more complex models that you build later should beat the performance of the baseline model or else there is no advantage in deploying a more complex model into production. 

It is perfectly fine if your baseline yields poor results, as this may indicate that the problem you are trying to model is complex or that you may need to re-frame it. All you need is a meaningful reference point that you can use to interpret the results of the more sophisticated models.

Linear regression, logistic regression, decision trees and KNN can serve as good baseline algorithms depending on the problem you are trying to solve.
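To make the idea concrete, here is a minimal sketch of a KNN baseline written with the standard library only, together with an accuracy function for comparing it against a later model. The function names, the choice of k=3, and the toy data are assumptions for illustration; in practice you would more likely use an established machine learning library.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the majority label among the k training points nearest to query."""
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(predictions, truths):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truths)) / len(truths)


# Hypothetical two-feature training data with two classes.
train_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["low", "low", "low", "high", "high", "high"]

test_points = [(1.1, 1.0), (5.1, 5.0)]
test_labels = ["low", "high"]

baseline_predictions = [knn_predict(train_points, train_labels, p) for p in test_points]
print(accuracy(baseline_predictions, test_labels))
```

Scoring the more complex model with the same `accuracy` function on the same held-out examples gives a like-for-like comparison: if it does not clearly beat the baseline, the extra complexity is hard to justify.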

Do you have the resources to deploy your models into production?

Building robust and efficient models is just one aspect. Equally important is getting them deployed. Deployment is the process of making your machine learning models available in production environments where they can interact with other software systems and provide predictions to them. It is only after deployment that your model can start adding value. According to VentureBeat, 87% of data science projects never make it to production.

There are multiple technology stacks that you will have to work with to build an AI/ML application that can be put into production and consequently used by customers. Most of these technologies are still young and will continue to evolve and mature. You will be confronted with a breadth of challenges when trying to deploy your model into production. The most common ones are code that fails to compile in different environments, workflows that are difficult to align between cloud and on-premise infrastructure, and poor visibility into model performance when the input data changes.
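The last of these challenges, poor visibility when the input data changes, can be partly addressed with even a crude monitoring check. The sketch below compares the mean of a feature in live traffic with its mean in the training data, measured in training standard deviations. The function names, the threshold of 2.0, and the sample values are assumptions; production monitoring typically uses more robust drift measures.

```python
import statistics


def mean_shift(training_values, live_values):
    """Shift of the live mean from the training mean, in training standard deviations."""
    train_mean = statistics.fmean(training_values)
    train_std = statistics.stdev(training_values)
    if train_std == 0:
        raise ValueError("training values have no spread; shift is undefined")
    return (statistics.fmean(live_values) - train_mean) / train_std


def has_drifted(training_values, live_values, threshold=2.0):
    """Flag a feature whose live mean moved more than `threshold` deviations."""
    return abs(mean_shift(training_values, live_values)) > threshold


# Hypothetical values of one input feature.
training = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]
live_ok = [10, 11, 9, 10]
live_shifted = [15, 16, 14, 15]

print(has_drifted(training, live_ok))
print(has_drifted(training, live_shifted))
```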

Moreover, it is quite difficult to find a full-stack data scientist who can manage the complete life-cycle of the model. Data scientists are well versed in iterating and experimenting but often lack the skills required to deploy their models, since model deployment draws upon skills that are more aligned with software engineering and DevOps. What you will need here is a machine learning engineer.

You cannot leave model deployment till the very end. You have to plan for it at a system level. The initial deployment might not be difficult but the ongoing maintenance and updates can make things challenging. Therefore, your business ecosystem must be able to seamlessly support your ML workflows if you want to extract business value from your models.

Have you aligned your expectations with reality?

AI/ML projects have many hidden layers (I am not referring to the hidden layers in neural networks here) that only become visible once the project is set in motion. There are a lot of uncertainties involved, from data collection to final model deployment, that cannot be anticipated in advance. Then there is the added complexity of explaining the results of the model. While it’s easy to explain how a single prediction might have been made, it becomes increasingly difficult when the feature space gets bigger and the data set grows. This is known as the ‘black box’ problem, where even specialists find it difficult to interpret the model results. You might end up with a model with very high accuracy, but if you cannot explain the results to your customers or other stakeholders involved, it might not be a wise decision to proceed with such an application. Or you might realize at the end that you don’t have enough data or that the data lacks quality. Additionally, it might not be possible to exactly replicate the results of your model due to randomness.
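Part of that randomness is under your control. Steps such as shuffling and splitting the data can be made repeatable by fixing a seed, as in the minimal sketch below. The function name, the default seed, and the split fraction are assumptions for illustration, and a fixed seed does not remove every source of variation (some hardware and parallel computations are not deterministic).

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle rows with a seeded generator and split them into train and test."""
    rng = random.Random(seed)  # private generator; the global random state is untouched
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))
first_train, first_test = train_test_split(rows)
second_train, second_test = train_test_split(rows)

# The same seed reproduces the same split on every run.
print(first_train == second_train and first_test == second_test)
```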

Hence, you have to exhibit patience and acknowledge the inherent challenges that accompany AI/ML projects.

Wrapping Up

Don’t just jump on the bandwagon. Think, think, and think again to understand whether AI and ML are what you really need to take your business to the next level. AI/ML is a high-risk, high-reward scenario. While the tech giants are investing billions in research and development, small to midsize companies are also benefiting from the low-hanging fruit as the AI landscape matures. No-code AI platforms have taken the market by storm and are automating the entire machine learning life-cycle. So if you are crystal clear that you need AI/ML for a specific, well-defined use case, then these no-code machine learning platforms might serve as the ‘secret sauce’ for your business’s success.

 

Not sure where to start with machine learning? Reach out to us with your business problem, and we’ll get back to you on how the Engine can help you specifically.