Promoting Trust: From the Closed-Box towards Explainable AI

Posted by Peter Rudin on 19. April 2024 in Essay

Artificial Intelligence (AI) is experiencing an explosion of media coverage, research and public attention. It is also gaining massive popularity in organizations and enterprises, which are implementing Large Language Models (LLMs), Stable Diffusion or the next trendy AI product. Alongside this trend, there is a second boom focused on Explainable AI (XAI), which aims to help humans, who cannot match a machine’s computational speed, to understand how AI ‘thinks.’ Covering the same technologies as present-day AI, XAI is designed to promote trust and transparency and to improve the user experience in data science and AI. In an effort to break the limitations of ‘closed-box’ (also called ‘black-box’) thinking, XAI promotes new ideas and methods, bringing AI to the next level of utility.

Current Research Efforts to Build XAI

Most research on explainability reveals that a precise definition is lacking. However, there seems to be agreement on how XAI can best be approached to gain new insights. The following provides a framework for discussion:

  • Understandability by simulating tasks with a computational model.
  • Algorithmic transparency whereby a user can explain how an input results in an output.
  • Interpretability explaining the meaning of a model in a human’s decision-making process.

Ultimately these definitions and the underlying framework complement each other. The main challenge with this framework is that it depends on the researcher’s capacity to understand and interpret the various approaches. Moreover, different ways of saying ‘I understand what my model is doing’ can lead to erroneous results. For example, research focused on a decision tree model, which is easily explained by its design, differs significantly from research on an Artificial Neural Network (ANN), with its complexity and its goal of modelling human behaviour. Could one approach be transparent but not understandable, or explainable but not interpretable? The fact is that interpretable models have the advantage that they can be tailored very well to a specific domain. As a result, transparent models can be a useful co-pilot supporting human decision-making.
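To make the remark about decision trees concrete, here is a minimal sketch of a model that is explained by its own design. The loan-screening rules, the thresholds and the function name decide are invented for illustration and do not come from any real system. The point is only that every decision can be returned together with the exact rules that produced it.

```python
# Hypothetical loan-screening rules, invented purely for illustration.
def decide(applicant):
    """Return (decision, path); path lists every rule that was evaluated."""
    path = []
    if applicant["income"] >= 50000:
        path.append("income >= 50000")
        if applicant["debt_ratio"] < 0.4:
            path.append("debt_ratio < 0.4")
            return "approve", path
        path.append("debt_ratio >= 0.4")
        return "review", path
    path.append("income < 50000")
    return "decline", path


decision, path = decide({"income": 62000, "debt_ratio": 0.25})
# The explanation is simply the branch that was followed through the tree.
print(decision, "because", " and ".join(path))
```

A neural network offers no comparable trace: its output is the result of many weighted sums, none of which maps onto a rule a human could read.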

History of AI’s Usage as a Tool

To most users, machine learning is just a useful tool, and humans possess the unique ability to use tools to achieve a specific goal. Benjamin Franklin, one of the founding fathers of the United States and a remarkable scientist and inventor, is well known for his observations on human ingenuity and tool usage. Applying his observations to cognitive science and psychology, one can conclude that humans tend to use tools differently and apply them in novel ways. This ability distinguishes humans from other species and has been a critical factor in human development. Anthropologists have highlighted the role of tool-making and tool usage by emphasizing the ability of our ancestors to invent and repurpose tools. The English philosopher Thomas Hobbes (born April 5, 1588, graduate of the University of Oxford) suggested that animals start with a desired purpose and discover a useful instrument, whereas humans view everything as a potential tool and imagine all of its potential applications. AI offers only ambiguous prescriptions about what it should solve and requires human inquiry to determine what is good and what needs solving. Hence, AI is just a contemporary extension of the human tradition of using tools to extend our capabilities. It amplifies our potential for processing information, solving complex problems and generating new ideas. Integrating AI into various domains is not about replacing human effort but rather about enhancing it, allowing for more efficient workflows, deeper insights and the exploration of previously inconceivable solutions.

The belief that AI shares essential human qualities is so widespread nowadays that it is no longer viewed as an assumption that calls for scepticism. We use terms such as sentience and consciousness to describe software while creating metaphorical contrivances like ‘the brain is like a computer’ or ‘the mind as the software of the brain’. However, by following this trend, humans put themselves into a box without realizing its limitations. Humans are not computers, and computers are not humans.

The Issue of Building Trust

Since algorithms and their associated data can be used to forecast our behaviour, it is imperative that one can trust the results, especially in decision-making situations. In the past, AI systems were built with relatively simple algorithms. These rudimentary pieces of software were able to make approximate estimates, learning from the data they were fed to forecast behavioural patterns. These predecessors of more advanced AI are often still used in applications such as sales forecasting or risk scoring. In contrast, state-of-the-art generative AI, such as OpenAI’s ChatGPT, has become extremely powerful, able to generate text, images and videos that often rival the work of expert writers, designers and video producers. These developments come at a cost, however. While older models are less accurate, they are much more interpretable and for this reason more trustworthy than the so-called ‘closed-box’ models that power most of today’s AI systems. Developers of these models, such as OpenAI, are hesitant to disclose the inner workings of their systems. Yet, while they are promoted as highly effective in low-risk scenarios, the models are almost impossible to trust in high-risk domains such as healthcare, criminal justice and finance. Hence, we need to rebuild trust not only in the developers of AI systems but also in the applications and the technology used. Without trust the potential value of these systems remains marginal.
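The interpretability of the older, simpler models can be illustrated with a short sketch of an additive risk score. The feature names, the weights and the baseline below are hypothetical and chosen only to keep the arithmetic easy to follow. Because the score is a plain weighted sum, each feature's contribution to the result can be read off directly.

```python
# Hypothetical weights for a simple additive risk score (not from a real system).
WEIGHTS = {"missed_payments": 15, "years_as_customer": -2, "open_loans": 5}
BASELINE = 20


def risk_score(features):
    """Return the total score and the contribution of each feature to it."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    return BASELINE + sum(contributions.values()), contributions


score, contributions = risk_score(
    {"missed_payments": 2, "years_as_customer": 5, "open_loans": 1}
)
print("score:", score)
for name, value in contributions.items():
    # Each line states how much a single feature raised or lowered the score.
    print(f"  {name}: {value:+d}")
```

A reviewer can challenge any single weight in such a model, which is exactly what a closed-box system does not allow.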

Explainability: First Steps Moving beyond Closed-Box Algorithms

One of the first steps to correct the closed-box problem is to develop AI systems that are ‘interpretable’. With the current architectures that power Large Language Models (LLMs) such as ChatGPT, we are not able to peek under the hood of these algorithms. We are not able to understand the decision-making process that led the AI system to come up with a specific answer or recommendation. This poses significant issues in high-risk scenarios where sound decision-making is key. In the healthcare sector, for example, we need to understand why the model diagnosed a patient with a specific disease and we need to understand the considerations and the algorithms behind such a diagnosis. The doctor reviewing the algorithm’s output must be able to verify that the model has reached a conclusion that is consistent with their years of experience practicing medicine. Generally speaking, we need to be able to ask the AI system ‘why’ it came up with such conclusions and understand the logical process it went through, especially in these sensitive applications. By inspecting the flow of algorithmic procedures, we might discover flaws in the reasoning which need to be corrected.

Model explainability is just a part of the quest for AI we can trust. Since AI systems are only as good as the data they use, it is key that we ensure that the data used to train such models is of the highest quality. This implies that we have access to high-quality data sources that can provide accurate and factual information. However, it might not be easy to get such information in real-life situations. Most data and information the algorithms digest are subject to various limitations, such as copyright.

In addition, AI systems tend to show bias not because of inherently malicious behaviour but rather because they have been trained on historically biased information, which might discriminate against an individual based on ethnicity, gender or other protected attributes such as cultural or educational background. There are several techniques that can be used to mitigate and ideally remove bias. Analysing, step by step, how a user’s prompt is processed when ChatGPT answers a query, for example, might provide the necessary safeguards to avoid discrimination. This, however, is the responsibility of the provider of the tool.
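Before bias can be mitigated it has to be measured. The following is a minimal sketch of one common check, which compares the rate of favourable outcomes across groups (often called demographic parity). The group labels, the sample decisions and the function names are hypothetical, and a real audit would use far more data and more than one metric.

```python
def approval_rate(decisions):
    """Share of favourable outcomes, where 1 = approved and 0 = declined."""
    return sum(decisions) / len(decisions)


def demographic_parity_gap(decisions_by_group):
    """Return the largest difference in approval rates, plus the rates themselves."""
    rates = {group: approval_rate(d) for group, d in decisions_by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Hypothetical model decisions, grouped by a protected attribute.
decisions = {
    "group_a": [1, 1, 1, 0],
    "group_b": [1, 0, 0, 0],
}
gap, rates = demographic_parity_gap(decisions)
print(rates, "gap:", gap)
```

A large gap does not prove discrimination on its own, but it shows where the provider of a tool should look more closely.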


David Hume, the Scottish philosopher and historian, famously wrote, “Reason is, and ought only to be the slave of the passions.” The point is that we are not merely reasoning with our minds. We have goals, passions, and bodies. AI will never supplant the emotional behaviour unique to humans that motivates us to reason, solve problems, seek justice, acquire knowledge and communicate with others via the internet. One of the most important problems we face today is that an imagined threat, once framed as real, can be used to justify any action. Hence, one of the most significant hazards humanity is facing is not AI and its related technologies but the power we give to someone else to answer the question of who we are.
