The Importance of Explainable AI (XAI) in Computer Vision
The 7 benefits of XAI for CV: uncovering systematic bias, explaining edge cases, improving fairness, safety and model efficiency, enhancing user interaction and building trust in machine learning

Like many millennials, I have satisfied my need to nurture with an unreasonably large pot plant collection. So large, in fact, that I have completely lost track of them. In an attempt to remember their names (yes, they have names), I decided to build a machine learning model to identify them.

I started with the four plants you see below — Rudo, Baya, Greg and Yuki. During the process of building this pot plant detector, something went horribly wrong…

I dropped Rudo’s pot!

Thankfully, this was a blessing in disguise. After replacing the pot, I found the model’s accuracy dropped dramatically. Specifically, it went from 94% on validation data to 70% on test data. All of the mistakes were coming from Rudo being misclassified as Baya. Clearly, something had gone wrong during the model training process.

Figure 1: example instances from the Pot Plant dataset. Rudo is in a new pot for the test set after dropping its original pot.

With the help of an XAI method called Grad-CAM [3], I discovered the problem. The model was using pixels from the pots to make classifications. Only Baya was in a blue pot in the training set. As Rudo was now in a blue pot, the model was misclassifying it. This is not what we want from a robust plant detector. It should be able to classify the plant regardless of what pot it is in.

Figure 2: GradCAM heatmap for an instance with an incorrect classification. It indicates that the most important pixels for the classification are from the pot.
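To make this concrete, below is a minimal sketch of how a heatmap like the one in Figure 2 can be produced. It uses the open-source grad-cam package (pytorch-grad-cam) with a pretrained ResNet as a stand-in classifier; the model, target layer and image path are placeholder assumptions, not the actual pot plant detector code.

```python
# Minimal Grad-CAM sketch using the open-source "grad-cam" package
# (pytorch-grad-cam). The model, target layer and image path are
# placeholders, not the actual pot plant detector.
import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget
from pytorch_grad_cam.utils.image import show_cam_on_image

model = models.resnet18(weights="IMAGENET1K_V1").eval()  # stand-in classifier
target_layers = [model.layer4[-1]]                       # last convolutional block

# Load and preprocess an image (e.g. a photo of Rudo in the new blue pot)
img = Image.open("rudo_test.jpg").convert("RGB").resize((224, 224))
rgb = np.asarray(img, dtype=np.float32) / 255.0
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
input_tensor = preprocess(img).unsqueeze(0)

# Which pixels drove the predicted class?
with torch.no_grad():
    pred_class = int(model(input_tensor).argmax(dim=1))
cam = GradCAM(model=model, target_layers=target_layers)
heatmap = cam(input_tensor=input_tensor, targets=[ClassifierOutputTarget(pred_class)])[0]

# Overlay the heatmap on the original photo, as in Figure 2
overlay = show_cam_on_image(rgb, heatmap, use_rgb=True)
Image.fromarray(overlay).save("rudo_gradcam.png")
```

In Figure 2, the heat is concentrated on the pot rather than the plant, which is exactly the kind of hidden bias this technique exposes.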

This is an example of the most important benefit of XAI for computer vision: identifying unknown or hidden bias in machine learning models. We will discuss this along with the other benefits summarized below. Really, this section aims to convince you that XAI goes beyond simply understanding a model.

Figure 3: the 7 benefits of XAI

The insights from XAI methods can reveal issues with model robustness and accuracy. They allow us to be more certain that the models we develop will not cause any undesired harm. They can even help reduce the complexity of our AI systems. Problems like these can be fatal to a project, and identifying them before deployment can save you a lot of time and money.

Identifying systematic bias (mistakes) in machine learning models with XAI

Bias in machine learning is a broad term. Here, we refer to bias as any systematic error that results in the model

  • making incorrect predictions or
  • making correct predictions but with misleading or irrelevant features

XAI can help explain the first problem, but it is the latter where it becomes particularly important. We refer to this as hidden bias, as it can go undetected by evaluation metrics.

Okay, so for our pot plant detector, we did know there was bias. But this was only after we replaced Rudo’s pot. If we hadn’t, the evaluation metrics would have given similar results for all datasets. This is because the same bias would be present in the training, validation and test sets, regardless of whether the test images were taken independently of the other data.

To a seasoned data scientist, the bias from the brightly coloured pots may have been obvious. But bias is sneaky and can be introduced in many ways. Another example comes from the paper that introduced LIME [2]. The researchers found their model was misclassifying some huskies as wolves. Looking at Figure 4, we see it was making those predictions using background pixels. If the image had snow, the animal was always classified as a wolf.

Figure 4: Raw data and explanation of a bad model’s prediction in the “Husky vs Wolf” task. (source:[2])
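For reference, the snippet below sketches how an explanation like Figure 4 can be produced with the lime package. A pretrained ResNet stands in for the husky-vs-wolf classifier; the model, image path and parameter values are assumptions rather than the original paper’s setup.

```python
# Sketch of a LIME image explanation with the "lime" package. A pretrained
# ResNet stands in for the husky-vs-wolf classifier; the image path and
# parameter values are illustrative only.
import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms
from lime import lime_image
from skimage.segmentation import mark_boundaries

model = models.resnet18(weights="IMAGENET1K_V1").eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def predict_fn(images):
    """LIME passes a batch of HxWx3 numpy arrays; return class probabilities."""
    batch = torch.stack([preprocess(Image.fromarray(img.astype(np.uint8))) for img in images])
    with torch.no_grad():
        return torch.softmax(model(batch), dim=1).numpy()

image = np.array(Image.open("husky.jpg").convert("RGB").resize((224, 224)))

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, predict_fn,
    top_labels=2,      # e.g. husky and wolf
    hide_color=0,      # superpixels that are "switched off" are set to black
    num_samples=1000,  # perturbed images used to fit the local surrogate model
)

# Keep only the superpixels that pushed the prediction towards the top class
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
overlay = (mark_boundaries(temp / 255.0, mask) * 255).astype(np.uint8)
Image.fromarray(overlay).save("lime_explanation.png")
```

In the husky-vs-wolf case, the highlighted superpixels were the snowy background rather than the animal itself.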

The problem is that machine learning models only care about associations. The colour of the pots was associated with the plant. Snow is associated with wolves. Models can latch on to this information to make seemingly accurate predictions. This leads to incorrect predictions when the association is absent. The worst-case scenario is when the association only breaks down on new data in production. This is what we aim to avoid with XAI.

Identifying unfairness with XAI

Unfairness is a special case of systematic bias. This is any error that has led to a model that is prejudiced against one person or group. See Figure 5 for an example. The biased model is making classifications using the person’s face/hair. As a result, it will always predict an image of a woman as a nurse. In comparison, the unbiased model uses features that are intrinsic to the classification. These include short sleeves for the nurse and a white coat or stethoscope for the doctor.

Figure 5: Grad-CAM used to explain a biased model (source:[3])

These kinds of errors occur because historical injustice and stereotypes are reflected in our datasets. Like any association, a model can use these to make predictions. But just like an animal is not a wolf because it is in the snow, a person is not a doctor because they have short hair. And just like any systematic bias, we must identify and correct these types of mistakes. This will ensure that all users are treated fairly and that our AI systems do not perpetuate historical injustice.

Explaining edge cases with XAI

Not all incorrect predictions will be systematic. Some will be caused by rare or unusual instances that fall outside the model’s typical experience or training scope. We call these edge cases. An example is shown in Figure 6, where YOLO has falsely detected a truck in the image. The explanation method suggests this is due to “the white rectangular-shaped text on the red poster” [1].

Figure 6: explanations for a false positive detection (source:[1])

Understanding what causes these edge cases will allow us to make changes to correct them. This could involve collecting more data for the specific scenarios reflected in the edge case. We should also remember that a model exists within a wider AI system, which makes a decision or performs an action based on the model’s prediction. This means some edge cases could be patched with post hoc fixes to how the predictions are used.

Improving AI safety

Correcting edge cases is strongly related to improving the safety of models. In general, ML safety refers to any measure taken to ensure that AI systems do not cause unintended harm to users, society, or the environment. It is about ensuring that our models behave predictably under expected conditions as well as unexpected conditions. This can only be done by understanding how models are making predictions.

AI safety is always a concern. This is especially true when the systems interact with our physical world. We go from worrying about social media algorithms serving content that harms mental health to systems that can cause physical harm. Think about how accidents involving automated vehicles are always in the news. It is the edge cases, missed by these systems, that lead to loss of human life.

Improving model efficiency with XAI

Until now, we have focused on how XAI can improve model performance and robustness. When deploying models, we must take a broader view of performance. We need to consider aspects like the speed and complexity of the entire AI system. In short, we are concerned with the model’s resource use. This is referred to as model efficiency.

Using XAI, we can learn things that can help improve efficiency. For example, take the lesson where we used permutation channel importance to explain a coastal image segmentation model. Looking at Figure 7, we found that the model was only using 3 of the 12 spectral bands. We could potentially train a model using only these bands and significantly reduce our data requirements.

Figure 7: permutation channel importance scores for a coastal image segmentation model
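The idea behind permutation channel importance is simple: shuffle one spectral band at a time and measure how much the segmentation metric drops. Below is a minimal sketch under assumed names (model, X_val, y_val, and pixel accuracy as the metric); it is not the original lesson’s code.

```python
# Sketch of permutation channel importance for a multispectral segmentation
# model. Assumes model(X) returns class logits of shape (N, classes, H, W),
# X has shape (N, 12, H, W) and y has shape (N, H, W). All names are
# illustrative, not the original lesson's code.
import numpy as np
import torch

def pixel_accuracy(model, X, y):
    """Placeholder metric; mean IoU would work just as well."""
    with torch.no_grad():
        preds = model(X).argmax(dim=1)          # (N, H, W) predicted classes
    return (preds == y).float().mean().item()

def permutation_channel_importance(model, X, y, n_repeats=5):
    baseline = pixel_accuracy(model, X, y)
    importances = []
    for c in range(X.shape[1]):                 # loop over the spectral bands
        drops = []
        for _ in range(n_repeats):
            X_perm = X.clone()
            # Shuffle band c across images: its marginal distribution is kept
            # but its relationship with the target is broken
            X_perm[:, c] = X[torch.randperm(X.shape[0]), c]
            drops.append(baseline - pixel_accuracy(model, X_perm, y))
        importances.append(float(np.mean(drops)))
    return np.array(importances)                # large drop => important band

# importances = permutation_channel_importance(model, X_val, y_val)
# Bands with near-zero importance are candidates to drop from the pipeline.
```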

Another example comes from the lesson where we used Grad-CAM to explain a model used to classify images of house plants. The model was a CNN with three convolutional layers. Looking at Figure 8, we saw that each layer extracted similar features from the images. This suggests a simpler network, with fewer layers, would provide similar performance.

Figure 8: separate Grad-CAM heatmaps for each of the three convolutional layers in the network.
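Extending the earlier Grad-CAM sketch, comparing layers only requires changing the target layer. The snippet below assumes a small CNN exposing attributes conv1, conv2 and conv3 and reuses model, input_tensor and pred_class from that earlier sketch; adjust the layer names to your own architecture.

```python
# Sketch: compare Grad-CAM heatmaps across convolutional layers. Assumes a
# small CNN exposing conv1, conv2 and conv3, and reuses model, input_tensor
# and pred_class from the earlier Grad-CAM sketch.
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

heatmaps = {}
for name in ["conv1", "conv2", "conv3"]:
    cam = GradCAM(model=model, target_layers=[getattr(model, name)])
    heatmaps[name] = cam(input_tensor=input_tensor,
                         targets=[ClassifierOutputTarget(pred_class)])[0]

# If all three heatmaps highlight roughly the same regions, as in Figure 8,
# the extra layers may add little and a smaller network is worth testing.
```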

These kinds of insights can be used in future model builds to improve model efficiency. By only including data needed for accurate predictions, we can significantly cut down data processing time and storage space. Simpler input features and architectures will also provide faster predictions. The added benefit is that these simpler systems will be easier to interpret and explain.

The benefits we’ve mentioned up until now can all be summarised as debugging or improving a model. We want to identify any mistakes or inefficiencies. This is to improve the performance, reliability, fairness and safety of the resulting AI systems. The next benefits are different. They are more about the humans who interact with the models.

Enhancing user interaction with AI systems

For many applications, XAI is necessary to provide more actionable information. For example, a medical image model can make a classification (i.e. a diagnosis). Yet, on its own this has limited benefits. A patient would likely demand reasons for a diagnosis. It also does not provide any information on how severe the condition is or how it could best be treated.

Looking at Figure 9, you can see how the pixels that contributed to the positive diagnosis have been highlighted. Along with the classification, this provides more actionable information. That is, the doctor can confirm the classification, explain where the tumor is and how severe it is, and begin to understand the best course of action.

Figure 9: Grad-CAM visualization showing the areas of the brain that have contributed to a positive tumor diagnosis (source:[4])

In general, XAI can help us make use of machine learning in industries where it is too risky to fully automate a task. Using the output of XAI, a professional can apply their domain knowledge to ensure a prediction is made in a logical way. The output can also tell us how to best go about addressing a prediction. Ultimately, through this type of interaction, we enable better human-machine collaboration.

Building trust in machine learning models

Even for applications that could be fully automated, XAI can help increase adoption. Machine learning has the potential to replace processes in finance, law or even farming. Yet, you would not expect your average farmer or lawyer to have an understanding of neural networks. The black-box nature of these models could make it difficult for them to accept predictions. Even in more technical fields, there can be mistrust of deep learning methods.

“Many scientists in hydrology remote sensing, atmospheric remote sensing, and ocean remote sensing etc. even do not believe the prediction results from deep learning, since these communities are more inclined to believe models with a clear physical meaning.”— Prof. Dr. Lizhe Wang

XAI can be a bridge between computer science and other industries or scientific fields. It allows you to relate the trends captured by a model to the domain knowledge of professionals. We can go from explaining how correct the model is to why it is correct. If these reasons are consistent with their experience, a professional will be more likely to accept the model’s results.

Ultimately, when stakeholders understand the reasoning behind model predictions, it is easier to secure buy-in for AI solutions. This is particularly true in sensitive environments. In fact, as the world becomes more critical of AI and introduces more regulations, XAI will become necessary to meet legal and ethical requirements.

So hopefully you are convinced. XAI can help you identify and explain systematic biases and edge cases in your models. Its insights can improve not only performance but also the efficiency, fairness and safety of your AI systems. Through building trust and improving interactions, it can even increase the adoption of machine learning.


I hope you enjoyed this article! See the course page for more XAI courses. You can also find me on Bluesky | Threads | YouTube | Medium


Additional Resources

Datasets

Conor O’Sullivan, & Soumyabrata Dev. (2024). The Landsat Irish Coastal Segmentation (LICS) Dataset. (CC BY 4.0) https://doi.org/10.5281/zenodo.13742222

Conor O’Sullivan (2024). Pot Plant Dataset. (CC BY 4.0) https://www.kaggle.com/datasets/conorsully1/pot-plants

References

[1] Armin Kirchknopf, Djordje Slijepcevic, Ilkay Wunderlich, Michael Breiter, Johannes Traxler, and Matthias Zeppelzauer. Explaining YOLO: Leveraging Grad-CAM to explain object detections. arXiv preprint arXiv:2211.12108, 2022.
[2] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144, 2016.
[3] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618–626, 2017.
[4] Mahesh T. R., Vinoth Kumar V., and Suresh Guluwadi. Enhancing brain tumor detection in MRI images through explainable AI using Grad-CAM with ResNet 50. BMC Medical Imaging, 24(1):107, 2024.


Get the paid version of the course. This will give you access to the course eBook, certificate and all the videos ad-free.