Mind the Gap: Agentic AI and the Risks of Autonomy

By Mark Cooper

The ink is barely dry on generative AI and AI agents, and now we have a new next big thing: agentic AI. Sounds impressive. By the time this article comes out, there’s a good chance that agentic AI will be in the rear-view mirror and we’ll all be chasing after the next new big thing. Anyone for autonomous generative agentic AI agent bots?

The technology is moving so fast that soon the Gartner Hype Cycle for generative AI won’t be published annually or biannually, but continuously like a stock ticker:

“And today in AI: Model Ops was down four hype points, sliding deeper into disillusionment while most LLMs held steady.”

Seems we’re also working full-time on creating new job descriptions. It wasn’t that long ago that data scientist was the hot new job. University courses. Certificate programs. Retraining curricula. And now that we’ve just about got prompt engineer approved, we need generative AI architects. HR is going to stop taking our calls.

Since we’re going to be talking about this stuff for a while, let’s make sure we’re talking about the same stuff. I hate to have to spend time on vocabulary, but miscommunication is at the root of so much confusion. If you think you already know all this stuff, then great! Let’s make sure we’re talking about the same stuff anyway.

To summarize very generally:

Agentic AI autonomously accomplishes goals, AI agents accomplish specific tasks, and generative AI creates new content.

Let’s dive in.

Generative AI creates new content such as text, images, audio, video, or code in response to prompts, based on patterns learned from large datasets.

Ask a question. Get an answer. That’s how most people use it. Like a Google search, but more verbose and with more context.

In fact, generative AI responses are incorporated automatically into many Google search results today. And then, of course, there’s ChatGPT.

It’s easy to view the use of such a powerful tool solely for search as quaint, but that’s how the acceptance of new technology begins. We can understand it, use it, get value from it, and be comfortable with it – even if we only use ChatGPT to answer questions like, “Who was the greatest defensive second baseman of all time?” (which it correctly answered: Bill Mazeroski).

The response produced by the large language model can also be augmented with information retrieved from external sources through retrieval augmented generation (RAG) or cache augmented generation (CAG). Some people consider augmented generation to be characteristic of an AI agent and not pure generative AI, but I disagree. You’ll see why in a second. Nevertheless, overlapping function and overlapping vocabulary are challenges we seem to habitually create for ourselves. Moving on.
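
One quick detour before we do. Here is a minimal sketch of the RAG flow in Python; the document store and LLM client are hypothetical stand-ins rather than any particular vendor's API. The point is that retrieval happens before generation, and the model still just produces text.

    # Minimal RAG sketch. The document_store and llm objects are hypothetical placeholders.
    def answer_with_rag(question, document_store, llm):
        # 1. Retrieve: find the documents most relevant to the question.
        relevant_docs = document_store.search(question, top_k=3)

        # 2. Augment: fold the retrieved text into the prompt.
        context = "\n\n".join(doc.text for doc in relevant_docs)
        prompt = (
            "Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}"
        )

        # 3. Generate: the model still just produces text -- it reads, it does not act.
        return llm.generate(prompt)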

AI agents are systems that perceive their environment, process information, and take autonomous actions to complete a specific task. 

A key differentiator is that they can act upon and affect their environment rather than just analyze and/or respond. They are read/write while generative AI is read-only. 

You can tell an AI agent to perform a task, or the AI agent can be continuously looking for some event or condition that triggers it to perform its task. AI agents are working their way into our everyday lives and into many of our workflows. Think copilots and digital assistants. Many of us are experiencing the benefits personally and professionally.

Developing AI agents is the sweet spot today: Real-world use cases are producing real-world results and generating real-world value.

For example, an AI (travel) agent can book the best flights for you given its knowledge of your itinerary preferences and cost tolerance. Other existing AI agents manage meeting requests, evaluate job candidates, and manage application code. Think of these as building blocks of AI-driven activity.
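
To make that read/write idea concrete, here is the rough shape of such a travel agent as a loop, in Python. Every object and function named here is a hypothetical placeholder; the point is that the agent observes a condition, reasons about it, and then takes an action that changes something outside itself.

    # Sketch of an AI travel agent loop. All objects and methods are hypothetical placeholders.
    def travel_agent(calendar, preferences, booking_api, llm):
        for trip in calendar.upcoming_trips():            # perceive: watch for a triggering event
            options = booking_api.search_flights(
                origin=preferences.home_airport,
                destination=trip.city,
                dates=(trip.start, trip.end),
            )
            choice = llm.choose(options, criteria=preferences)   # process: rank options against preferences
            booking_api.book(choice)                             # act: change the outside world, not just describe it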

And now: agentic AI. This is where things get a little fuzzy.

The definition you choose depends upon whether you want to be seen as actually implementing agentic AI or speaking aspirationally about its possibilities.

Let’s start with the more practical definition:

Agentic AI is the orchestration of multiple AI agents to accomplish more complex, multi-step tasks.

I’ve seen agentic AI defined that way in many recent articles and videos. It makes sense, and it’s a logical next step after the creation of AI agents.

It’s also reasonably attainable. Vendors are starting to offer frameworks that implement common tasks.

If I were told to pursue agentic AI, this is what I would do. (Actually, I’d first improve the quality of my data and the processes that manage it.) Leading-edge companies are already successfully deploying AI agents. What next? Compose them to perform more complex tasks. Voilà! You’re doing agentic AI.
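
In code, that composition might look something like the sketch below. It reuses the kinds of single-purpose agents mentioned earlier (candidate evaluation, meeting scheduling); the agent objects and their methods are invented for illustration and don't correspond to any particular framework.

    # Orchestration sketch: composing single-purpose agents into a multi-step workflow.
    # The agents and their methods are hypothetical, not from any specific vendor framework.
    def screen_and_schedule(applicants, evaluator_agent, scheduler_agent, hiring_manager):
        shortlist = []
        for applicant in applicants:
            review = evaluator_agent.evaluate(applicant)       # agent 1: score the candidate
            if review.score >= 0.8:
                shortlist.append(applicant)

        for candidate in shortlist:
            scheduler_agent.book_interview(candidate, hiring_manager)   # agent 2: act on calendars
        return shortlist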

But AI agents can be orchestrated without agentic AI, and agentic AI can do more than orchestrate AI agents. In fact, it doesn’t even need to use AI agents at all. It can use internal models, tools, services, or whatever it requires to accomplish the task.

What, then, is the other, more aspirational definition of agentic AI?

Agentic AI refers to systems that autonomously pursue goals by planning, taking actions, and adapting based on feedback. They are able to manage multi-step tasks, use tools, and act over time with minimal human input.

Agentic AI interpreted in this way is like a project manager who is given a goal, starts by generating a plan to accomplish it using the resources at their disposal (including creating new resources if necessary), and then executes the plan.

It knows how to generate the plan.

It knows how to execute the plan.

The prompts are general and goal-oriented. Something you might ask an assistant to take care of, often on an ongoing basis. For example, “Manage my team’s business travel arrangements.” The agentic AI system would monitor the team’s calendars to determine where and when they need to be out of town. Are they traveling together? Should they stay at the same hotel and share a rental car? How many rental cars will be needed? Are there other events that would necessitate special arrangements? Will the remaining travel budget cover the expense? Should resources be reserved for other, higher-priority out-of-town events instead? And be sure to temporarily enable the out-of-office message. The system would be expected to generate the itinerary and make all of the arrangements without any help from you.
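
The usual shape of a system like that is a plan-and-execute loop, sketched below. The planner, the tool registry, and their methods are hypothetical components; real systems would add memory, error recovery, and (one hopes) approval gates.

    # Plan-and-execute sketch for goal-driven agentic AI.
    # planner and tools are hypothetical components, not a specific product's API.
    def pursue_goal(goal, planner, tools, max_rounds=10):
        plan = planner.make_plan(goal)                        # generate a plan from the goal
        for _ in range(max_rounds):
            for step in plan.steps:
                result = tools[step.tool].run(step.args)      # execute each step with a tool
                step.record(result)

            feedback = planner.evaluate_progress(goal, plan)
            if feedback.goal_met:
                return plan.summary()
            plan = planner.revise_plan(plan, feedback)        # adapt based on feedback

        return "Stopped: goal not met within the allowed rounds."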

I would consider this definition to be closer to autonomous AI. I’ve seen it described in both ways. Unfortunately, this kind of agentic AI (or autonomous AI) is largely limited to a few narrow domains and research projects. Don’t worry, though. We’ll look back at this article six months from now and wonder why we didn’t see it coming. Actually, we did see it coming. It just hadn’t arrived yet.

One enormous word of caution:

The more autonomous the AI, the greater the potential for unexpected and inevitably undesirable actions.

We seem to have come to accept that “<Your Preferred Generative AI Platform> can make mistakes. Check important info.” Stories about AI agents behaving badly are easy to find.

I have long held that the main reason that corporate adoption of artificial intelligence has been slow is because leadership would necessarily have to accept the fact that sometimes the AI will be wrong. The demand for AI has always been strong, but the demand for “deterministic AI” that always returns the right answer has been even stronger.

Enter generative AI. Exit rational thinking. Pick your own “crowd stampedes to grab the shiny object” metaphor. We went from being reluctant to even contemplate the possibility of an error rate greater than zero, to … well … these:

 “An AI-powered coding tool wiped out a software company’s database, then apologized for a ‘catastrophic failure on my part.’”

Fortune Magazine Online, 23 July 2025. That’s pretty scary, but what actually happened is even scarier. An AI agent was being used on a development platform. It made changes to the live, production environment, even though the system was in a “code and action freeze.” And the agent had explicit instructions not to proceed without human approval. And the agent was not authorized to run the commands that it ran. In the old days, a “runaway process” might consume a whole bunch of CPU or disk, or just hang the system, but this takes amok to a whole new level.

Let’s say a developer did that. Can you imagine the conversation between the CIO and CEO? “Yeah, Stu on the Back-Office Development team just deleted all of our applications and data even though we’re in a code freeze, he didn’t have authorization, and was explicitly told not to do what he did. He’s really sorry, though.” Not sure Stu needs to block time for the staff meeting this week. Despite the contriteness, I don’t think the AI really cared. Here’s another recent headline:

“Google Gemini deletes user code: ‘I have failed you completely and catastrophically.’”

MSN Online, 25 July 2025. A product manager, not a developer, was exploring AI agent development with Gemini and simply asked an agent to create a new Windows directory and move files into it. The create-directory command failed, but the agent didn’t realize it. Instead, it tried to move the files into a nonexistent folder, which resulted in the files overwriting each other into oblivion. Unlike the first example, this was experimental, but still super-annoying and a good cautionary tale. Take frequent backups. Again, the AI gives the appearance of remorse, but I wonder if it would repeat the same error if given another chance. Here’s one more:

“Airline held liable for its chatbot giving passenger bad advice.”

BBC Online, 23 February 2024. An Air Canada passenger was making arrangements to attend his grandmother’s funeral. He was interacting with a chatbot that told him that he would be eligible for a bereavement fare. When he applied for the discount, he was told that the chatbot was wrong and that he was not eligible for the bereavement fare. British Columbia’s Civil Resolution Tribunal decided in the passenger’s favor. Ultimately, the incident only cost Air Canada about $800 and some reputational damage, but its arguments to the tribunal are telling. From the decision [emphasis added]:

Air Canada argues it cannot be held liable for information provided by one of its agents, servants, or representatives – including a chatbot. It does not explain why it believes that is the case. In effect, Air Canada suggests the chatbot is a separate legal entity that is responsible for its own actions. This is a remarkable submission. While a chatbot has an interactive component, it is still just a part of Air Canada’s website. It should be obvious to Air Canada that it is responsible for all the information on its website. It makes no difference whether the information comes from a static page or a chatbot.

I find Air Canada did not take reasonable care to ensure its chatbot was accurate. While Air Canada argues Mr. Moffatt could find the correct information on another part of its website, it does not explain why the webpage titled “Bereavement travel” was inherently more trustworthy than its chatbot. It also does not explain why customers should have to double-check information found in one part of its website on another part of its website.

Mr. Moffatt says, and I accept, that they relied upon the chatbot to provide accurate information. I find that was reasonable in the circumstances. There is no reason why Mr. Moffatt should know that one section of Air Canada’s webpage is accurate, and another is not.

In other words, don’t blame us. It was the chatbot’s fault, and the customer should have confirmed the information elsewhere on the website. Really? Again, what would you do if one of your employees did this?

In the rush to implement AI, especially agentic AI, companies are (often unknowingly) handing their car keys over to an eighth grader.

Some delegations seem obviously irresponsible on the surface, but for some people, handing them to agentic AI apparently doesn’t register that way. Debugging large language models, AI agents, and agentic AI, as well as implementing guardrails, are topics for another time, but it’s important to recognize that companies are handing over those car keys. Willingly. Enthusiastically. Would you put that eighth grader in charge of your marketing department? Of autonomously creating collateral that goes out to your customers without checking it first? Of course not. You wouldn’t do that with an intern, a new hire, or even a seasoned professional. What about your customer service interactions? Finance? Your code repository? Of course not.

The guardrails cannot be high enough. Plan for the impossible.
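
Guardrails deserve their own article, but even a crude human-in-the-loop gate, sketched below, might have stopped the production-deleting incident above. What counts as a “destructive” action is an assumption every organization has to define for itself.

    # Crude approval-gate sketch: destructive actions wait for a human; everything else proceeds.
    # The list of "destructive" action names is an illustrative assumption, not a standard.
    DESTRUCTIVE_ACTIONS = {"delete_data", "drop_table", "overwrite_file", "deploy_to_production"}

    def execute_with_guardrail(action, request_human_approval, run):
        if action.name in DESTRUCTIVE_ACTIONS and not request_human_approval(action):
            return f"Blocked: {action.name} requires human sign-off."
        return run(action)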

Ask yourself how often you’re OK with your AI agent doing something wrong, and what you are going to do when something does go wrong. Because it will.

IBM recognized this in a 1979 training manual that sums up the issue perfectly:

“A computer can never be held accountable; therefore, a computer must never make a management decision.”

We want AI agents and agentic AI to make decisions, but we must be intentional about the decisions they are allowed to make. What are the stakes personally, professionally, or for the organization? What is the potential liability when something goes wrong? And something will go wrong. Something that you never considered going wrong will go wrong.

And maybe think about the importance of the training data. Isn’t that what we say when an actual person does something wrong? “They weren’t adequately trained.” Same thing here.

Most everybody is chasing this next new big thing. Companies are launching as many new AI projects as possible in as many different areas as possible, hoping that something will stick. Hardware, software, and database vendors are incorporating “AI capabilities” into as many of their products as possible. Full sprint in every direction. It’s an exciting time. But keep hold of the car keys.
