Supporting AI in the Enterprise

What it will take to do it right

2001: A Space Odyssey, inside “HAL’s brain”

Looks the same but isn’t

AI / ML doesn’t use significantly different (or more) resources than other current technologies. Even though these systems can have specialized needs, they primarily leverage the enterprise’s existing services, interfaces, and data stores.

From an infrastructure perspective, integrating AI / ML in the enterprise looks pretty much exactly like almost everything we’ve integrated previously. That’s dangerous, because while the infrastructure and the actual integration are similar, the resulting outcomes are unlike anything else we’ve ever done.

We got a taste of this with DevOps. The joke was companies wanting to “buy the DevOps” (like, in a box) when, of course, implementing DevOps is often more cultural than technical. AI has technical elements, of course, but its impacts reach so far beyond traditional technical solutions that assuming it’s anywhere near similar is a trap.

People believe the computer

It was amazing. If I worked up a quote with paper and pencil, people would negotiate every little detail. They would argue about the price of each and every option. But when the computer printed out the quote, with the same numbers and the same details, nobody said a word. They always just accepted the quote, no questions asked.

The enterprise has so much going on that it’s already hard to tell which decisions are good and which are bad. People believe technology, partly because technology up to this point has typically been very deterministic and relatively easy to verify. That makes it easy to trust.

Machine learning and AI, not so much.

The Key: Ensuring Model Quality

Who’s going to question the enterprise’s super-smart AI?

Even if (and especially when!) the outputs don’t make sense, it will be easy to justify them as the model “seeing more” than the human analyst sees. This is, of course, true; machine learning has an incredible capacity for absorbing data. But if it’s not the right data, if it’s not the whole picture, the recommendations the model produces will likely be wrong to some degree.

With deterministic programming, a wrong algorithm is often obviously wrong. With these models, it could be just wrong enough that things begin to veer imperceptibly off course and go unnoticed for a very long time.

Which is more harmful: A corporate leader who makes bad decisions, or a never-exhausted, always-available machine learning tool that will happily make bad (or even slightly off) recommendations to anyone (or any service) that asks it for an opinion?

To make matters worse, getting the model wrong is really easy to do, and small pieces of data added to or missing from the model’s training data can have huge impacts.

My favorite example from science fiction is 2001: A Space Odyssey, where an advanced AI installed on a spacecraft (HAL 9000) decides to murder the human crew, while an identical machine on Earth (SAL 9000) retains the more desirable non-murderous behavior.

How did that happen?

HAL was given a few pieces of information that it was required to keep secret, while SAL was not given that information or direction. That’s it. Everything else about the machines and the models they ran was identical.

My favorite example from real life is an effort that attempted to build a machine learning model to identify skin cancer but accidentally built a ruler detector instead, because many of the photos in the training data set included rulers when the pictured area was known to be cancerous.

Of course there are plenty more stories about training data impacting a model in unexpected ways, both good and bad. There’s also the implementation of the machine learning itself, which impacts the model in big ways too. There’s a lot to worry about when building an AI for the enterprise, and a lot of ways to get it wrong.

Ensuring model and algorithm accuracy / confidence is the challenge when it comes to machine learning in the enterprise.

How do we ensure AI / ML quality in the enterprise?

There must be focused, continuous model development.

The data available to all models will evolve as time goes on: trends will emerge, internal efforts will spin up and spin down, and the nature of the customer will change. Each model needs to adapt to these new situations to stay useful and continue to provide sound recommendations; a stale model without continuing evaluation of its accuracy is not only useless, it’s potentially harmful.
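One way that continuing evaluation can look in practice is a drift check: compare the distribution of an input feature in recent production data against the distribution the model was trained on, and alert when they diverge. The sketch below uses the Population Stability Index; the bucket edges and the 0.2 cutoff are illustrative assumptions, not prescriptions.

```python
# Minimal drift-check sketch: a large Population Stability Index (PSI)
# between training-time and production-time feature distributions suggests
# the model is going stale. Bucket edges and threshold are illustrative.
import math

def psi(expected, actual, edges):
    """Population Stability Index between two samples, given bucket edges."""
    def shares(sample):
        counts = [0] * (len(edges) + 1)
        for x in sample:
            i = sum(1 for e in edges if x > e)  # bucket index for x
            counts[i] += 1
        total = len(sample)
        # Small floor avoids log(0) for empty buckets.
        return [max(c / total, 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Training-time sample vs. a shifted production sample.
training = [0.1 * i for i in range(100)]          # roughly uniform on [0, 10)
production = [0.1 * i + 4.0 for i in range(100)]  # same shape, shifted right
edges = [2.5, 5.0, 7.5]

drifted = psi(training, production, edges) > 0.2  # common rule-of-thumb cutoff
```

A check like this doesn’t say the model is wrong, only that the world has moved since training, which is exactly the signal that triggers re-evaluation and retraining.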

Additionally, the enterprise that does not continually adapt and improve its models and expand its cognitive services is going to quickly find itself lagging behind the rest of its industry. Since machine learning is additive (it improves significantly as more data is considered), lagging behind the industry is an incredibly difficult situation to recover from.

With other technologies it was possible to catch up by spending more money; with AI / ML that’s less possible, because it’s less about infrastructure and more about data. Lost time is lost data, and that’s hard to replace.

Machine learning is not a path to a door, but a road leading forever towards the horizon. Continuous improvement of the AI / ML solutions in the enterprise isn’t just a good idea, it’s an imperative.

Change is the thing that gives AI / ML its present, continuing, and improving value.

What this means

Great internal model development efforts and deployment practices.

Great development and versioning of algorithms and ensemble approaches.

Great versioning of training data.

Great versioning of the resulting trained models.

Great tools to test, record, and compare built models against each other.
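The versioning pieces of that list can be sketched simply: identify each artifact (training data, trained model) by a hash of its content, and record enough metadata alongside it to reproduce and compare builds later. The registry, field names, and algorithm label below are illustrative assumptions; a real registry would live in durable storage.

```python
# Minimal artifact-versioning sketch: content-addressed ids for training
# data and trained models, plus a metadata record per build.
import hashlib

def content_version(payload: bytes) -> str:
    """Stable version id derived from the artifact's bytes."""
    return hashlib.sha256(payload).hexdigest()[:12]

def register_model(registry, model_bytes, training_data_bytes, algorithm, metrics):
    record = {
        "model_version": content_version(model_bytes),
        "training_data_version": content_version(training_data_bytes),
        "algorithm": algorithm,
        "metrics": metrics,  # e.g. held-out accuracy, recorded at build time
    }
    registry[record["model_version"]] = record
    return record

registry = {}
rec = register_model(
    registry,
    model_bytes=b"serialized-model-weights",       # stand-in payloads
    training_data_bytes=b"csv-of-training-rows",
    algorithm="gradient-boosted-trees",            # illustrative name
    metrics={"accuracy": 0.91},
)
# Retraining on changed data or code yields a new, distinct version id.
rec2 = register_model(registry, b"new-weights", b"csv-with-more-rows",
                      "gradient-boosted-trees", {"accuracy": 0.93})
```

Content addressing means two builds from identical inputs get identical ids, which is what makes “compare built models against each other” tractable later.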

Some enterprises have already tackled the challenge of building good development efforts, but this is far more difficult. The ongoing development effort to support AI / ML in the enterprise has to be much better and go further than traditional development efforts. It’s got to be outstanding.

Unlike traditional development efforts, AI / ML is not obviously deterministic. There will be outcomes nobody expected, and being able to test for, find, isolate, and investigate those outcomes will require very strong artifact management and testing tools / processes. It will require tools and processes we’ve only begun to create.

Then, after we’ve isolated the unexpected outcomes, we’ll have to compare them against other models to decide whether the recommendations the model under test makes are good ones or bad ones. AI / ML calls the very concept of traditional pass / fail testing into question: now we have to look at confidence intervals and accuracy in addition to all the traditional aspects of quality.

And since AI / ML is driven by the enterprise’s data, the enterprise will need to do this work internally or risk sharing critical (and immensely valuable) data with an outside entity.

The Challenge

Many model development efforts in use today are experiments and hand-tooling. These approaches work, but they are not repeatable, enterprise-ready, production-class tools and methods. Discovery and refinement of methods and tools covering every aspect of model development and its life cycle still need to happen, which makes all of the above even more difficult to implement, especially for the enterprise.

It has to happen anyway.

That’s a huge ask, but those willing and able to step up to it will reach escape velocity faster and quickly become difficult-to-catch industry leaders.

Co-Founder, Liquid Genius