Across the world, businesses and governments are turning to artificial intelligence for answers. Unfortunately, AIs can’t yet explain how they reached those answers.
In machine learning, the results often come out of a black box. The systems teach themselves by analyzing vast quantities of data, testing and retesting results by trial and error until they reach a conclusion: to approve a mortgage loan, identify fraud, or make a medical diagnosis. The problem is that the machine’s human minders can’t always tell how it came up with its conclusion.
Consumers, digital‑rights advocates, and companies of all stripes are calling for AI that, in effect, shows its work. A cottage industry is cropping up to provide the necessary tools. Businesses that invest in AI need to maintain trust with their customers, employees, and shareholders and avoid legal tangles over AI‑based decisions. Everyone wants to know whether the results are valid or whether they stem from hidden biases in the data or other errors. They’re looking for AI that can explain itself.
The quest for explainable AI is increasingly a legal issue. Article 22 of Europe’s General Data Protection Regulation (GDPR) gives individuals the right not to be subject to decisions based solely on automated processing when those decisions have legal or similarly significant effects, such as the denial of a loan application by an AI‑based system.
While there are different interpretations of what the regulation requires, many legal experts say it also gives consumers the right to an explanation when AI delivers a decision with legal or other significant impacts.
“Customers have a right to ask ‘how did you arrive at this outcome?’” says Shefaly Yogendra, chief operating officer of UK‑based Ditto, one of several new companies offering tools that make AI processes and decisions more transparent. “If you rejected my loan application, why did you do it?”
Beyond algorithmic auditing
Machine learning can produce powerful results. In 2016, AlphaGo, a program built by Alphabet’s DeepMind, became the first AI system to defeat a world champion in the strategy game Go. It mastered the game by studying and playing vast numbers of matches. A later version, AlphaGo Zero, became even more proficient by playing only against itself.
However, the old principle “garbage in, garbage out” applies to machine‑learning algorithms, which are only as good as the data you feed them. In a well‑known example, an MIT study found that facial‑analysis software from IBM misidentified the gender of darker‑skinned women roughly 35% of the time, compared with an error rate of just 0.3% for lighter‑skinned men. The problem was the limited set of data used to train the software. (IBM has since updated its training data to be more diverse.)
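Disparities like this are typically surfaced by measuring how often a system gets it wrong for each group separately, rather than looking at a single overall accuracy figure. A minimal sketch of that arithmetic in Python follows. The records and group names are invented for illustration and are not taken from the MIT study, and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical evaluation records: (group label, was the prediction correct?).
# These values are made up for illustration only.
records = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]


def error_rate_by_group(records):
    """Return {group: fraction of incorrect predictions} for each group."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, correct in records:
        totals[group] += 1
        if not correct:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


rates = error_rate_by_group(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} error rate")
```

A large gap between groups is a signal that the training data may under‑represent one of them. It does not, on its own, say why the system erred.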
Algorithmic auditing uses a variety of techniques to test whether an AI program has blind spots or other biases by looking for questionable patterns in the decisions the software produces. While such audits can help identify bias and other flaws in the data used to teach machine‑learning systems, they can’t explain how individual decisions are reached. Other approaches are needed.
Ditto, which sells explainable‑AI systems to companies in healthcare, waste management, and other industries, bases its tool on a technology known as symbolic AI.
Symbolic AI dates back to the 1950s. It uses natural‑language concepts to build large‑scale knowledge bases that map how different terms relate to each other. In finance, for instance, symbolic AI would recognize that “principal,” “interest,” “income,” and “default” are all factors in making a loan decision.
This ability can be tapped to explain the reasoning behind an AI‑based decision. Analyzing a loan application, the system could decide to reject the applicant and also tell the bank that it was doing so because one concept (income) couldn’t support another concept (interest payments).
“I am not giving this person a loan because they do not have sufficient income to cover the cash payments,” says Ryan Welsh, CEO of San Mateo, Calif.–based Kyndi, another explainable‑AI firm. “That’s an explanation.”
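The kind of explanation Welsh describes can be illustrated with a decision that rests on one explicit, human‑readable rule. The Python sketch below is only an illustration of that idea, not Kyndi’s or Ditto’s actual method. The `LoanApplication` class, the `decide` function, and the 35% payment limit are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class LoanApplication:
    monthly_income: float
    monthly_payment: float  # principal plus interest due each month


# Hypothetical rule: payments may not exceed this share of income.
MAX_PAYMENT_SHARE = 0.35


def decide(app: LoanApplication) -> tuple[bool, str]:
    """Return (approved, explanation) based on one explicit rule."""
    limit = app.monthly_income * MAX_PAYMENT_SHARE
    if app.monthly_payment > limit:
        return False, (
            f"Rejected: monthly payment of {app.monthly_payment:.2f} exceeds "
            f"{MAX_PAYMENT_SHARE:.0%} of monthly income ({limit:.2f})."
        )
    return True, (
        f"Approved: monthly payment of {app.monthly_payment:.2f} is within "
        f"{MAX_PAYMENT_SHARE:.0%} of monthly income ({limit:.2f})."
    )


approved, reason = decide(LoanApplication(monthly_income=3000, monthly_payment=1500))
print(approved, reason)
```

Because the rule is stated in terms of named concepts (income and payments), the reason for the decision can be returned alongside the decision itself.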
Kyndi’s machine‑learning tool mines thousands of documents and automatically extracts their key concepts. It then answers questions about its decisions in simple terms that people can understand.
Ensuring AI compliance
British pharmaceutical giant GlaxoSmithKline turned to Kyndi to help it support data‑integrity efforts in its research work. The company’s research teams follow strict procedures to ensure that their work and results stand up to scrutiny, says Vern De Biasi, the company’s head of digital, data, and analytics in medicinal sciences and technology R&D.
De Biasi’s team ran an experiment to see whether Kyndi’s explainable‑AI system could speed up its quality‑assurance process. The team used the tool to read and analyze reams of documents adapted from real‑world R&D work and build a knowledge model. It then used the model to search the documents and check them for compliance with standard operating procedures.
Occasionally, the system highlighted inconsistencies in the documents, such as typos and other errors. In one section of a document, a scientist might note that her experiment included a certain number of steps. In another part, the number might be different. This discrepancy would raise a flag with the quality team to discuss it further with the scientist.
The Kyndi software created a model that could be used to analyze new documents, flagging those that required a closer look. The tool pointed the quality team to the specific inconsistencies it suspected in each flagged document, classifying each as likely, possible, or not likely.
“This would be a valuable tool to apply in a real setting because it found the inconsistencies that we inserted under experimental conditions. Not only that, but it helped to explain those inconsistencies, and sped up the process to find them,” De Biasi says.
By contrast, a non‑explainable system might give a document a quality score without explaining its basis, which would not be very helpful.
Another explainable‑AI company, simMachines, uses a technique called similarity learning to produce predictions and describe how they were made. Similarity learning can be used to analyze structured data, like sales transactions, by quantifying the difference or similarity between two items. This makes it possible to identify the factors that contribute to algorithmic decisions or predictions.
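The general idea of predicting by similarity and then reporting which factors drove the match can be sketched with a simple nearest‑neighbor comparison. The Python example below is a generic illustration and should not be read as simMachines’ actual technique. The feature names, the sample transactions, and the function names are all hypothetical.

```python
# Hypothetical labeled transactions, with feature values scaled to 0..1.
FEATURES = ["amount", "hour_of_day", "distance_from_home"]
known = [
    ({"amount": 0.9, "hour_of_day": 0.1, "distance_from_home": 0.8}, "fraud"),
    ({"amount": 0.2, "hour_of_day": 0.5, "distance_from_home": 0.1}, "legitimate"),
    ({"amount": 0.3, "hour_of_day": 0.6, "distance_from_home": 0.2}, "legitimate"),
]


def feature_gaps(a, b):
    """Absolute difference between two items, feature by feature."""
    return {name: abs(a[name] - b[name]) for name in FEATURES}


def predict_with_explanation(item):
    """Return the nearest neighbor's label and the features ranked by closeness."""
    # The nearest neighbor is the known item with the smallest total gap.
    neighbor, label = min(
        known, key=lambda pair: sum(feature_gaps(item, pair[0]).values())
    )
    gaps = feature_gaps(item, neighbor)
    # Features with the smallest gaps contribute most to the similarity.
    ranked = sorted(FEATURES, key=lambda name: gaps[name])
    return label, ranked


new_item = {"amount": 0.85, "hour_of_day": 0.3, "distance_from_home": 0.7}
label, ranked = predict_with_explanation(new_item)
print(label, ranked)
```

The ranked feature list is what makes the prediction explainable: it says not just that the new item resembles a known case, but in which respects.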
American Express uses simMachines to look more closely when its AI system identifies a potentially fraudulent transaction. When looking for fraud, an AI system typically assigns probabilities. For instance, a transaction might be flagged if the system were 90% certain it was fraudulent. But not every case is that obvious. How should an AI system treat a transaction when it is, say, 60% sure? Or 40%?
In such cases, the system hands off decision‑making to a human reviewer. It also identifies the reasons why the transaction was flagged, noting similarities with other fraudulent payments.
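This hand‑off can be pictured as a simple routing rule on the fraud probability. The Python sketch below is a hedged illustration, not a description of the American Express system. The 90% figure echoes the example above, while the 40% lower bound, the `route` function, and the reason strings are assumptions made for the example.

```python
AUTO_FLAG_THRESHOLD = 0.90  # echoes the 90% example in the text
REVIEW_THRESHOLD = 0.40     # hypothetical lower bound for human review


def route(fraud_probability, reasons):
    """Return (action, reasons to show) for a scored transaction."""
    if fraud_probability >= AUTO_FLAG_THRESHOLD:
        return "flag", reasons
    if fraud_probability >= REVIEW_THRESHOLD:
        # Uncertain cases go to a person, along with the reasons for suspicion.
        return "human_review", reasons
    return "approve", []


action, shown = route(0.60, ["resembles earlier fraudulent payments"])
print(action, shown)
```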
If these applications are to succeed long‑term and survive legal and regulatory scrutiny, as well as customer skepticism, the AI will need to make the strongest possible case for its own decisions. Eventually, it will need to do so without human help.