From Data to Predictions to Decisions

Posted by Peter Rudin on 31. August 2018 in Essay

Picture Credit: Book cover ‘Prediction Machines’ by Ajay Agrawal et al.


Recent developments in artificial intelligence (AI) and machine-learning, in combination with large data-libraries, significantly improve the quality and reduce the cost of generating predictions. In some cases, prediction has enabled the full automation of tasks – for example, self-driving vehicles in which data collection, prediction of behaviour and surroundings, and the necessary reactions are all conducted without human intervention. In other cases, prediction is a stand-alone tool – such as image recognition or fraud detection – that may or may not lead to further substitution of human users by machines. Thus far, substitution between humans and machines has focused mainly on cost considerations: are machines cheaper, more reliable, and more scalable than humans? Doubts about the accuracy and quality of the data used by machine-learning have so far kept humans in the loop of the decision-making process. As the quality of sensor-information improves and data is collected at its source without human intervention, one can expect further improvements of predictions to support decision-making.

How predictions are made

Prediction is one of the possible objectives of mathematical modelling in fields such as environmental management, healthcare, economics and finance. A model can be viewed, at its simplest, as a concise summary of knowledge about a system and its processes. Model-based prediction is commonly used in at least two ways:

  1. In scientific hypothesis testing, predictions are compared with observation. The prediction model is usually a law or principle, fixed in form to leave little room for interpreting the results.
  2. In “what if?” experiments, elaborate simulation models are used, for instance, to provide weather forecasts. In this context, prediction has the aim of aiding choice, either between human decisions by assessing their consequences or between models by comparing their prediction performance.

While these prediction techniques are not new, recent advances in machine-learning support predictions by programming computers to “learn” from data. Where the decision rules cannot be specified in advance, a data-driven prediction approach can model many mental tasks. For example, humans are good at recognizing familiar faces, but they would struggle to explain and codify this skill. By connecting data of names with image data on corresponding faces, machine learning solves this problem by predicting which image data patterns are associated with which names. The more training data is available, the better the accuracy of the prediction. Better predictions will potentially improve decision making, but as experience shows, decisions by humans are not necessarily logical.
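As a rough illustration of learning names from labelled examples rather than from hand-written rules, the sketch below uses a nearest-neighbour lookup. This is an assumption chosen for brevity, not the method used by any particular face-recognition system. The feature vectors, the names and the function predict_name are hypothetical; in practice the features would be derived from the pixels of each image.

```python
import math

# Hypothetical training data: (feature vector, name). The numbers are made up
# for illustration; real features would be computed from face images.
TRAINING = [
    ((0.9, 0.1, 0.3), "Alice"),
    ((0.8, 0.2, 0.4), "Alice"),
    ((0.1, 0.9, 0.7), "Bob"),
    ((0.2, 0.8, 0.6), "Bob"),
]


def predict_name(features, training=TRAINING):
    """Return the name attached to the closest training example."""
    closest = min(training, key=lambda example: math.dist(features, example[0]))
    return closest[1]


# A new, unlabelled image whose features lie close to the "Alice" examples.
print(predict_name((0.85, 0.15, 0.35)))
```

No rule describing a face is written anywhere in the code; the prediction comes entirely from the labelled examples, so adding more of them changes what the program can recognize.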

The process of decision-making

Decision-making is the process of identifying and choosing alternatives based on the values, preferences and beliefs of the decision-maker. Decision-making can be regarded as a problem-solving activity terminated by a solution deemed to be optimal, or at least satisfactory. Decision theory (or the theory of choice) is the study of the reasoning underlying the choices we have. It can be broken into two branches: normative decision theory, which gives advice on how to make the best decisions, given a set of uncertain beliefs and a set of values; and descriptive decision theory, which analyses how existing, possibly irrational arguments lead to a decision. 

Normative decision theory is concerned with identifying the best decision to make, modelling an ideal decision maker who can compute with perfect accuracy and is fully rational. The practical application of this prescriptive approach (how people ought to make decisions) is called decision analysis, and is aimed at finding tools, methodologies and software to help people make better decisions. With the continuous improvements in machine-learning and deep-learning technology, better predictions strengthen the decision-making process that normative decision theory describes.
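One common way to formalize the ideal decision maker is expected-utility maximization: weigh the value of each outcome by its predicted probability and choose the action with the highest total. The sketch below assumes this formalization. The umbrella scenario, the utility numbers and the function names are hypothetical and serve only to show how a changed prediction can change the best decision.

```python
def expected_utility(outcomes):
    """Sum of probability * utility over an action's possible outcomes."""
    return sum(probability * utility for probability, utility in outcomes)


def best_action(actions):
    """Return the name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


def umbrella_actions(p_rain):
    """Hypothetical utilities for carrying or leaving an umbrella."""
    return {
        "take umbrella": [(p_rain, 8), (1 - p_rain, 6)],
        "leave umbrella": [(p_rain, 0), (1 - p_rain, 10)],
    }


# The prediction (probability of rain) is the only thing that changes.
for p_rain in (0.3, 0.6):
    print(p_rain, best_action(umbrella_actions(p_rain)))
```

The utilities stay fixed while the predicted probability varies, which separates the role of prediction from the role of the decision-maker's values.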

Descriptive decision theory is concerned with describing observed behaviours under the assumption that the decision-making individuals are behaving according to some consistent rules. In recent decades, there has been increasing interest in what is sometimes called “behavioural decision theory”, and this has contributed to a re-evaluation of what rational decision-making requires. The prospect theory of Daniel Kahneman (author of the bestseller Thinking, Fast and Slow) and Amos Tversky renewed the empirical study of economic behaviour with less emphasis on rationality. Kahneman and Tversky found three regularities in actual human decision-making: (a) “losses loom larger than gains”; (b) people focus more on changes relative to their present situation than on the absolute level they end up at; and (c) the estimation of subjective probabilities is severely biased by ‘fixed, subjective ideas’.
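Regularities (a) and (b) can be made concrete with a value function of the kind used in prospect theory, where outcomes are measured as gains or losses relative to a reference point and losses are weighted more heavily. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the default parameters (exponents of 0.88 and a loss-aversion factor of 2.25) are estimates reported by Tversky and Kahneman in 1992, not figures taken from this essay.

```python
def prospect_value(change, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Subjective value of a gain or loss measured from the reference point.

    Default parameters are assumed from Tversky and Kahneman's 1992 estimates.
    """
    if change >= 0:
        return change ** alpha
    return -loss_aversion * ((-change) ** beta)


gain = prospect_value(100)
loss = prospect_value(-100)
# With these parameters, a loss weighs more than an equally sized gain.
print(gain, loss, abs(loss) > gain)
```

The input is a change from the present situation rather than a final level of wealth, which reflects (b), and the steeper curve for losses reflects (a).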

A simple tool to apply AI in decision-making

Better predictions matter when one makes decisions in the face of uncertainty. In teaching this subject to MBA graduates at the University of Toronto’s Rotman School of Management, Ajay Agrawal et al. have introduced a simple AI assessment tool they call ‘the AI Canvas’. Each space on the canvas contains a requirement for machine-assisted decision-making.

The top part of the canvas describes the critical aspects of a decision:

Prediction: What do you need to know to make a prediction?
Judgement: How do you value different outcomes of predictions?
Action: What are you trying to do, following the prediction?
Outcome: What are your metrics for task success?

The bottom part of the canvas describes the data-related considerations:

Input: What data do you need to run a predictive algorithm?
Training: What data do you need to train the predictive algorithm?
Feedback: How can you use the feedback to improve the algorithm?

To get started with AI, the challenge is to identify the key decisions where the outcome is tied to uncertainty. Filling out ‘the AI Canvas’ will help to clarify what AI can contribute either by reducing the cost of decision-making or improving its performance.
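The canvas can also be kept as a simple record, so that unanswered questions are easy to spot. The sketch below is one possible representation, not something the authors of the canvas provide. The class AICanvas, its field names and the fraud-detection answers are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class AICanvas:
    # Top part: critical aspects of the decision
    prediction: str = ""
    judgement: str = ""
    action: str = ""
    outcome: str = ""
    # Bottom part: data-related considerations
    input_data: str = ""
    training: str = ""
    feedback: str = ""

    def missing(self):
        """Names of the canvas spaces that have not been filled in yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Hypothetical example: a partly completed canvas for fraud detection.
canvas = AICanvas(
    prediction="Is this card transaction fraudulent?",
    judgement="Cost of blocking a genuine payment versus letting fraud through",
    action="Approve or decline the transaction",
    outcome="Fraud losses and number of wrongly declined payments",
    input_data="Details of the transaction being assessed",
    training="Past transactions labelled as fraudulent or genuine",
)
print(canvas.missing())  # the feedback space was left empty
```

Listing the empty spaces shows where a proposed use of AI is still underspecified before any model is built.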

Issues in Healthcare

While experts mostly agree on the benefits AI will provide to medical practitioners, such as diagnosing illnesses very early on and speeding up the overall healthcare experience, some doctors and academics are wary that we could be moving towards data-driven medical practice too fast. A recent report by a health-focused publication cited internal IBM documents showing that the tech giant’s Watson supercomputer had made multiple “unsafe and incorrect” cancer treatment recommendations. According to the article, the software was trained on only a small number of cases and hypothetical scenarios rather than actual patient data. “We created Watson Health three years ago to bring AI to some of the biggest challenges in healthcare, and we are pleased with the progress we’re making,” an IBM spokesperson told CNBC. “Our oncology and genomics offerings are used by 230 hospitals around the world and have supported care for more than 84,000 patients, which is almost double the number of patients as of the end of 2016.” This learning curve is further supported by the ongoing improvement in sensor-technology.

As one example, researchers at Google are examining images of an individual’s retina to predict the likelihood that this individual will suffer a heart attack or stroke within the next five years. Google, which presented its findings in February 2018 in Nature Biomedical Engineering, says that the method predicts cardiovascular disease as accurately as more invasive methods that involve sticking a needle in a patient’s arm. Using the retinal image, Google says it was able to quantify the association between retinal features and cardiovascular risk and to predict correctly, 70% of the time, which patient would experience a heart attack or other major cardiovascular event within five years and which patient would not. Those results were in line with testing methods that require blood to be drawn to measure a patient’s cholesterol. Google used models based on data from 284,335 patients and validated on two independent data sets of 12,026 and 999 patients. Google’s research team is convinced that further tests will improve prediction accuracy.

“Pattern recognition and making use of images is one of the best areas for AI based predictions,” says Harlan M. Krumholz, a professor of medicine (cardiology) and director of Yale’s Center for Outcomes Research and Evaluation. It will “help us understand these processes and diagnoses in ways that we haven’t been able to do before. Sensors and a whole new range of devices will help us to improve the physical examination and more precisely improve our understanding of diseases and pair it with treatments.”


Data acquisition, now a multi-billion-dollar industry covering everything from trading to hospitality, means that the amount of personal information that can be collected by machines has ballooned to an unfathomable size. In healthcare the phenomenon is regarded as a breakthrough for mapping out various diseases, predicting the likelihood of someone becoming seriously ill and examining treatment in advance. But concerns over how much data is stored and where it is being shared are proving problematic. Take DeepMind, for example. The Google-owned AI firm signed a deal with the U.K.’s National Health Service in 2015, giving it access to the health data of 1.6 million British patients. According to the agreement, patients’ health data was handed over to DeepMind to improve its programs’ ability to detect illnesses. It led to the creation of an app called Streams, aimed at monitoring patients with kidney diseases and alerting clinicians when a patient’s condition deteriorates. But last year, the Information Commissioner’s Office ruled that the contract between the NHS and DeepMind “failed to comply with data protection law”, and the project had to be stopped. The ruling reflects the sensitivity of personal data protection, especially when health issues are at stake. A trusted, independent party that acts as mediator between the patient and the healthcare service provider to securely validate and anonymize a patient’s data is a must to advance the potential of AI-supported healthcare.
