How to hold your AI accountable

Key steps to ensure machine learning models perform to expectations

Artificial intelligence has made exciting leaps in recent years. It can read, write, see and speak. It can solve many problems that humans can’t. AI is a powerful technology, but not so powerful as to be beyond reproach.

As AI adoption expands in the enterprise—50% of companies surveyed by McKinsey in 2020 said they had adopted AI in at least one business function—leaders must define standards and expectations for AI applications and ensure they are met, just as they do for human employees.

Machine learning algorithms, after all, are making more and more decisions that impact customers, employees, and business results. Executives should scrutinize how machine learning models frame and make those decisions, identify any machine biases that might skew their analysis, and in general, ensure they do what they are hired to do.

In my experience as an analytics leader in large organizations, making AI accountable in the enterprise starts with a handful of core questions for business and technology leaders.

1. Is AI the right fit for the job?

Just as companies must evaluate whether a particular job candidate is the right fit, they need to determine whether AI is the right approach for a particular use case. That depends on four factors:

  • How important is the problem? Developing the right AI models can require significant time and resources. Is this a problem that can significantly reduce costs or increase revenue?
  • Do you have the right data? AI models tend to perform best when they can draw upon large volumes of carefully curated data.
  • Does this problem scale? The bigger the scope of the problem, the greater potential return you are likely to see on your AI investment.
  • Can you put the results to work? Organizations need to develop a playbook that maps the predictions AI will make to the actions they will take.
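The last point, a playbook that turns predictions into actions, can be sketched as a simple mapping. The thresholds and action names below are hypothetical illustrations, not recommendations:

```python
# A minimal sketch of a "playbook": translate a model's prediction
# (here, a churn probability) into a concrete business action.
# Thresholds and action names are made up for illustration.

def next_action(churn_probability: float) -> str:
    """Pick a follow-up action from a predicted churn probability."""
    if churn_probability >= 0.7:
        return "assign account manager for outreach"
    if churn_probability >= 0.4:
        return "send targeted retention offer"
    return "no action"

print(next_action(0.85))  # high-risk customer
print(next_action(0.10))  # low-risk customer
```

The point is less the code than the discipline: every prediction the model can emit should map to a decision someone has agreed to make.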

2. Can the AI model explain itself?

It’s not enough to analyze the data and make a prediction. Like any business leader, the AI model needs to be able to explain how it arrived at a particular prediction. Or, if it can’t, a surrogate model, which takes the AI model’s predictions and reverse-engineers them to determine how the model arrived at them, needs to do the same.

The ability to explain how a model works helps to establish trust in decisions. For example, AI developers might create a model that predicts the interval between a customer’s initial investment in a product and the point at which they begin to realize its value. If the gap is too great, customer satisfaction will suffer. By analyzing usage, engagement, and training data, AI models can generate a time-to-value score, identify any issues that may be impacting adoption, and provide recommendations for mitigating them.
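One common surrogate technique is to probe the black-box model across a range of inputs and fit a simple, interpretable approximation to its outputs. The sketch below uses a made-up time-to-value model and a one-variable linear surrogate; both are illustrative assumptions, not a real deployed system:

```python
# A hedged sketch of a surrogate model: fit an interpretable linear
# approximation to a black-box model's predictions, so the relationship
# between an input (product usage) and the prediction (time to value)
# can be summarized in one coefficient. The black box here is a stand-in.

def black_box_model(usage_hours: float) -> float:
    """Opaque model: predicts time-to-value in days from weekly usage."""
    return max(5.0, 60.0 - 4.0 * usage_hours + 0.02 * usage_hours ** 2)

# Probe the black box across a range of plausible inputs.
xs = [float(x) for x in range(0, 21)]
ys = [black_box_model(x) for x in xs]

# Ordinary least squares for a one-variable surrogate: y ≈ a*x + b.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

print(f"surrogate: time_to_value ≈ {a:.2f} * usage_hours + {b:.2f}")
# A negative slope is the explanation: more usage, shorter time to value.
```

The surrogate loses fidelity at the edges, but it gives a stakeholder a one-line explanation of what drives the prediction.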

3. How is the model performing?

Every employee participates in periodic performance reviews and, when necessary, improvement plans. Likewise, AI models need to be constantly evaluated and refreshed to counteract drift, which can happen when the data changes over time and the model doesn’t adapt to those changes.

For example, companies generally can’t evaluate the accuracy of an AI prediction until the predicted result has (or hasn’t) come to pass. But AI models can be compared to historical data to get a rough idea of their accuracy. If they’re off by a significant margin, an intervention can be staged. Because models are likely to drift as the data changes, their accuracy must be periodically monitored and managed.
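The monitoring loop described above can be sketched in a few lines: track whether recent predictions matched realized outcomes, and flag the model for review when rolling accuracy drops below a threshold. The window size and threshold below are illustrative assumptions:

```python
# A minimal sketch of drift monitoring: compare predictions to realized
# outcomes over a rolling window and flag the model when accuracy falls
# below a threshold. Window and threshold values are illustrative only.

from collections import deque

class AccuracyMonitor:
    def __init__(self, window: int = 100, threshold: float = 0.8):
        self.results = deque(maxlen=window)  # keeps only the last N outcomes
        self.threshold = threshold

    def record(self, predicted: bool, actual: bool) -> None:
        """Log whether a prediction matched the eventual outcome."""
        self.results.append(predicted == actual)

    def needs_review(self) -> bool:
        """True when rolling accuracy has drifted below the threshold."""
        if not self.results:
            return False
        accuracy = sum(self.results) / len(self.results)
        return accuracy < self.threshold

monitor = AccuracyMonitor(window=10, threshold=0.8)
for predicted, actual in [(True, True)] * 7 + [(True, False)] * 3:
    monitor.record(predicted, actual)
print(monitor.needs_review())  # 7/10 correct, below 0.8 -> flag for review
```

In production this check would sit behind whatever outcome data eventually arrives, which is exactly why accuracy can only be judged after predictions come to pass.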

A classic use case for machine learning is to identify customers at risk for churn. Because certain market factors are unpredictable—companies may go bankrupt, merge with competitors, or be impacted by larger economic factors such as a global pandemic—these models must be revisited every quarter and adjusted as needed.

4. Can you control for bias?

Just as human decision-making is never entirely free of bias, AI models can develop biases from incomplete training data sets or a less-than-rigorous selection of variables. In the human world, bias is counteracted via learning, awareness, HR policies, and performance reviews. In the AI world, it’s done through algorithmic auditing or statistical analysis.

If you are building a model that predicts when a sales deal is likely to close, you need to take special care to avoid introducing variables—such as, say, the experience or history of a particular sales rep—that could inadvertently bias the prediction.
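A basic audit for this kind of bias is a disparity check: group the model's predictions by the suspect variable and measure the gap. The data and threshold below are made up for illustration:

```python
# A hedged sketch of a disparity check: compare average predicted close
# probability across sales reps. A large gap suggests the model may be
# learning rep identity rather than deal quality. All data is fabricated.

from collections import defaultdict
from statistics import mean

predictions = [
    ("rep_a", 0.82), ("rep_a", 0.78), ("rep_a", 0.80),
    ("rep_b", 0.41), ("rep_b", 0.39), ("rep_b", 0.44),
]

by_rep = defaultdict(list)
for rep, score in predictions:
    by_rep[rep].append(score)

averages = {rep: mean(scores) for rep, scores in by_rep.items()}
gap = max(averages.values()) - min(averages.values())

print(averages)
if gap > 0.2:  # illustrative threshold, not a standard
    print(f"warning: {gap:.2f} gap across reps -- audit model inputs")
```

A gap on its own doesn't prove bias (one rep may genuinely work better-qualified deals), but it tells you where to dig.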

5. Are you encouraging diversity?

Creating an “ensemble” of independent base models can reduce the overall error rates of your AI deployments in the same way hiring people from diverse backgrounds can bring new ideas into the organization.

Let’s say you wanted to use machine learning to identify likely sales prospects. You could build a model based on how often a person visited your company’s websites and made in-store purchases. But you’d probably get more accurate results if you combined that with another model that analyzed customer demographics and product trends, a third model that looked at all the above signals plus their exposure to marketing messages, and so on. Combining different models into an ensemble compensates for weakness in any individual model and can produce more accurate results.

The key is diversity: Using a separate set of training data for each model, changing the attribute set of each model, or creating new algorithms and parameters will all make the ensemble approach more effective.
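The prospect-scoring ensemble described above can be sketched as a handful of base models, each trained on a different signal, averaged into one score. The models, signals, and thresholds below are all hypothetical:

```python
# A minimal sketch of an ensemble: three independent base models, each
# built on a different signal (web activity, demographics, marketing
# exposure -- all hypothetical), averaged into one prospect score.
# Weak spots in any single model are smoothed out by the others.

def web_activity_model(customer: dict) -> float:
    """Score from website visits, capped at 1.0."""
    return min(1.0, customer["site_visits"] / 20)

def demographics_model(customer: dict) -> float:
    """Score from a coarse customer segment."""
    return 0.7 if customer["segment"] == "enterprise" else 0.3

def marketing_model(customer: dict) -> float:
    """Score from engagement with marketing messages."""
    return min(1.0, customer["emails_opened"] / 10)

def ensemble_score(customer: dict) -> float:
    """Average the base models' scores into one prospect score."""
    models = [web_activity_model, demographics_model, marketing_model]
    return sum(m(customer) for m in models) / len(models)

customer = {"site_visits": 15, "segment": "enterprise", "emails_opened": 4}
print(f"prospect score: {ensemble_score(customer):.2f}")
```

Simple averaging is the easiest way to combine base models; weighting each model by its validated accuracy is a common next step.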

Responsibility for the success or failure of AI deployments ultimately comes down to people. Business and technology leaders must take responsibility for the AI models that influence decisions and must be accountable for the results. Every model designer must be able to explain how a model works and adjust it when results fall short of expectations, following a rigorous process and clear policies on accuracy, bias, ethical use of data, and appropriate use cases.

As with hiring new employees, deploying AI at scale is only the beginning of the journey. To maximize AI’s potential, companies must continually monitor and improve its performance to generate the greatest value for the enterprise.