
Definitely, Maybe Agile
AI Agents: Friend or Foe?
When should you let AI agents loose on your processes, and when should you keep them on a tight leash? Peter and Dave explore the messy reality of using agentic AI for process improvement.
They dig into why the processes you can easily map might not be the ones where AI agents add the most value. From recruitment pipelines that need human intuition to DevOps workflows that demand zero variation, not every process is created equal when it comes to AI intervention.
This week's takeaways:
- Categorize your processes first. Look at your processes and start sorting them. Some need to eliminate variation (like DevOps deployment pipelines), while others benefit from exploring the edges and finding creative solutions.
- Not all processes are equal when it comes to AI. There are many ways AI can help improve processes, but you need to think about whether you want to reduce variability or increase intelligent flexibility in each specific case.
- Train AI to know when to hand off. What you want AI to do is recognize when it can't handle something and pass it to the right system - whether that's a math library for calculations or a human for complex decisions.
- Understand the difference between consistency and exploration. DevOps spent years eliminating variation to create stable, repeatable deployments. Other processes might actually want that variation because it gives you something unusual and valuable.
If you're wrestling with where to apply AI in your organization without breaking what already works, this episode offers a practical framework for thinking through the trade-offs.
Resource:
- Ethan Mollick's "The Bitter Lesson versus The Garbage Can": https://substack.com/home/post/p-169199293
Questions or thoughts? Reach us at feedback@definitelymaybeagile.com
Peter [0:04]: Welcome to Definitely Maybe Agile, the podcast where Peter Maddison and David Sharrock discuss the complexities of adopting new ways of working at scale.
Dave [0:13]: Hello! We're back again, Peter. It's good to chat and catch up. I think we've been having a bit of a conversation beforehand - this is going to be an interesting topic.
Peter [0:23]: I don't know where it's going to go, but it's very much of the moment around process improvement and AI agents and how they play nice or don't play nice together.
Dave [0:33]: Yeah, it's an interesting one. I think you and I were voicing some different perspectives, maybe based on what we've read or seen or done or tried in various different cases. We're wondering about what the right way of approaching this is. Why don't you kick us off?
Peter [0:53]: Well, like every organization, we're looking at some of the processes we have internally and trying to figure out where agentic AI can help us reduce time and free us up to move into different areas, because it's taking care of some of these processes automatically.
As you start exploring that, one of the things that struck me is that in order to improve a process, one of the first things we do is map it. We've done value stream optimization and process mapping in many different contexts, and the map you end up with - we've talked about this many times - is not reality. It's a model of reality, and it's missing certain things. Those missing things, I wonder, are actually where we want our AI agents to be operating, but we're not capturing them in the process map.
Dave [1:55]: Yes, well, because all models are wrong and some are useful, right? It's that age-old problem that it's an abstraction of the actual underlying system. The underlying system is far more complex than we can possibly build in a model. No matter how many elements we put on it, we're never going to get an entirely accurate model of what the underlying system is doing - especially if there are humans involved.
Peter [2:17]: I'm just thinking of a recruitment pipeline. There's a step in there that says the subject matter expert, or even the HR specialist, might review a range of different resumes to evaluate which ones go to the next stage. That process we can articulate pretty cleanly as a review process. We might be able to come up with a list of things we're reviewing against, like a job description and a few things we look for in a good fit for our organization.
But I also know that if you and I were recruiting for a particular role and reviewed a bunch of resumes together, you and I would agree on a lot of them. But we'd also have some of these standout resumes that would be picked out for other reasons - for some sort of ambiguous kind of experience or feel for where that's going. You know, what that person is trying to communicate in their resume.
Dave [3:07]: Yeah, if there's a different piece - like we pull out a particular line because it resonates with us about something we remember from our experience. We think, "Oh, that might be interesting to talk to this person."
I feel like there are some processes where you don't want that ambiguity, right? You don't want it because that's the whole bit we're trying to get rid of. And yet there are other scenarios where we definitely want that ambiguity, but it's not captured, so we don't necessarily build it into an AI flow.
Peter [3:35]: Yeah, and some of it comes down to where the variance is - how much variance are we willing to tolerate? If the AIs are introducing that variance, which we might want in some cases... like if we're building something over the top of a system that's changing a lot, having AI there so it can interpret that change and decide what action to take could be valuable.
Dave [3:59]: It feels to me like there are processes that we want to be stable, repetitive, really consistent in how they work. And then there are processes where we actually want that variation, because that variation gives us something unusual, something new. It gives us something on the edges, and that's valuable when the value lives at the edges.
Peter [4:31]: Which is this fascinating piece. I think where the real complexity comes in is that a process may be both, or might change over time. So where do we want to apply these pieces?
One of the other bits: agentic AI is like one AI feeding into another, into another, into another - essentially a context engine that builds up the solution. That connects to a piece I was reading this week which I thought made a good observation. From a DevOps perspective, we've typically automated processes and practices precisely to reduce the cascading errors that happen when you have lots of handoffs between inaccurate systems. The intent of the automation is to prevent that. If the automation switches to being AI agents, which are inherently inaccurate, then they're not necessarily the best solution.
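To put a rough number on that cascading-error point, here is a back-of-the-envelope sketch. The accuracy figures are illustrative assumptions, not measurements; it simply shows how per-step error compounds across a chain of agents where every step has to get its piece right.

```python
# Back-of-the-envelope only: the accuracy numbers are assumptions.
def chain_accuracy(per_step_accuracy: float, steps: int) -> float:
    """End-to-end accuracy of a chain where every step must succeed."""
    return per_step_accuracy ** steps

print(f"1 step:   {chain_accuracy(0.95, 1):.0%}")   # 95%
print(f"5 steps:  {chain_accuracy(0.95, 5):.0%}")   # ~77%
print(f"10 steps: {chain_accuracy(0.95, 10):.0%}")  # ~60%
```

A single agent that is right 95% of the time looks reliable; ten of them handing off to each other do not. That compounding is exactly why DevOps pushed variation out of deployment pipelines in the first place.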
Dave [5:31]: That raises something in my head. When you're describing DevOps, I'm thinking: that's great. Take my invoicing process as an example. We have a contract with some sort of payment milestone in it, and when we deliver, we want that invoice to go out. But to your point, I don't want any hallucination around that invoice. I mean, maybe if it were in a positive direction, I don't know, but you look at something like that...
I find it really interesting that you're absolutely correct: if I'm looking for something that's consistent, I might truly want to automate it in the sense of getting some scripts going and just automating that process with very little variation - in fact, eliminating the variation in that process. This is what DevOps did for the whole process of deployment. If I look at invoicing and various things like that, I want to eliminate that variation too. So maybe those categories of problems aren't suited to AI. Are we going in the right direction there?
Peter [6:27]: Well, what I wonder is... I think there's this whole piece around statistical process optimization and variance and staying within certain bounds. If something exceeds those bounds, do I take a different action than if it remains within those bounds?
So perhaps - just throwing it out there - if we start to think of AI as a technology and a tool alongside something like cloud, to be able to start to solve problems in different ways... If we have a process that, under normal circumstances, behaves the way it's intended to behave and solves problems because it's a static process which will behave within these bounds... if it runs into a problem where it can no longer figure out what to do, rather than waking up Bob at three o'clock in the morning, it calls out to the AI and says, "Hey, can you help me solve this?"
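A minimal sketch of the escalation pattern Peter describes, with invented names and control limits throughout (handle_reading, ask_ai_for_remediation, and the bounds are illustrative stand-ins, not a real API): the deterministic process handles anything within bounds, an AI agent gets the first look at anything outside them, and a human is paged only as a last resort.

```python
# A sketch of "call the AI before waking Bob at 3 a.m."
# Every name and number here is an illustrative stand-in, not a real API.

LOWER_BOUND, UPPER_BOUND = 10.0, 90.0  # assumed control limits

def run_standard_process(value: float) -> str:
    # The stable, automated path: deterministic, no AI involved.
    return f"processed {value} via the standard pipeline"

def ask_ai_for_remediation(value: float) -> str | None:
    # Placeholder for a call out to an AI agent, which returns None
    # when it can't produce a suggestion it's confident in.
    return None  # stubbed: pretend the agent declined

def page_on_call_engineer(value: float) -> str:
    return f"paging a human about out-of-bounds value {value}"

def handle_reading(value: float) -> str:
    if LOWER_BOUND <= value <= UPPER_BOUND:
        return run_standard_process(value)      # normal case: stay deterministic
    suggestion = ask_ai_for_remediation(value)  # out of bounds: try the AI first
    if suggestion is not None:
        return suggestion
    return page_on_call_engineer(value)         # human as the last resort

print(handle_reading(42.0))   # stays on the deterministic path
print(handle_reading(140.0))  # escalates: AI first, human fallback
```

The design point is that the AI sits between the automation and the human: it widens what the system can absorb without adding any variance to the happy path.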
Dave [7:14]: Yeah, I like where you're going there, partially because I don't like the idea that those categories of processes are not suited to looking at AI. I think with AI approaches - and generally we're conflating lots of different ways of using AI - but if I look at the automation that comes in around some of those processes, I think you absolutely do want a little bit of that thought process behind it.
It's about putting the bounds in, understanding the boundaries within which I don't want the agents to be overthinking it versus the boundaries in which I do want them to think around things. That becomes quite interesting.
Peter [7:50]: Yeah, and maybe there's something there where it starts to rethink what the different processes look like. There's nothing to say you couldn't use AI to help you design that first process. But maybe, once it's designed, I don't need to go and reinvent the wheel. I want it to behave like this every single time. I don't want it to suddenly hallucinate and start making paperclips.
Dave [8:14]: I have no idea how paperclips come into your process improvement conversations, but... laughs
One of the other things that jumps to mind: I was reading Ethan Mollick's blog post this morning around "the bitter lesson." We talk about capturing a process and then informing some agents to follow through on it, but the bitter lesson Ethan Mollick describes is that what actually works better is if you just get out of the way and have the agentic AI stack look at the entire process and model it towards a given outcome. It'll probably come out with a better understanding of the process, or a better implementation of it, than anything you or I could have assessed and built a process map for, with a bunch of decision trees in it - which is the natural way we might look at it.
Peter [9:07]: Yeah, and that too is an interesting way. If you can get sufficient data - which is where you started around the missing parts of what some of our assessments did - yes, the AI is going to be able to design a better version of this than us, potentially, because it's going to pick up on a lot of those things. It's going to be able to pull in a lot of information and create the logic chain that's needed, with the right structure and prompting and other pieces that went into that.
Now I feel like we're on the edge of a precipice. I'm not sure how much further the conversation is going to go without us really stretching out over areas that I don't fully understand.
Dave [9:49]: It's interesting - this isn't just a conceptual idea. I mean, you can see people doing this. There's definitely work that occurs that isn't easily capturable into a set of AI processes. There's also the snowflake problem: everybody believes that what they do is unique, but the AI has almost certainly seen something like it elsewhere anyway. So it can build up a model, based on observations elsewhere, of how that might have been done. As long as it doesn't hallucinate some crazy thing or misinterpret an acronym somewhere - because your company's internal taxonomy may be different from the general marketplace's, and all sorts of other fun things like that.
So there's a lot of training and logic involved, but these are all problems you can overcome; they just take time to work through. I think there is that piece - you can't just string a bunch of these agents together, fire them raw at a business problem, and not expect problems. There are lots of things that have to happen to make that work.
Peter [10:57]: One thing that jumps to mind when you describe it that way - and this is a feeling, I don't have a deep data set to look at - is something I've noticed since AI-generated marketing really took off, in the sense that so many of the posts you and I are reading will be AI-derived in many cases. My interpretation is that it's a move towards a common center. I always think of it as a regression to the mean in terms of messaging.
I want that around, say, invoice processes and DevOps. I don't want that variation. But I wonder if we don't actually want some of that ambiguity to come through in many of those processes. If I think of the most powerful marketing messages that we've seen in the years that we've been exposed to them, the ones that stand out and resonate are unusual in some pretty significant way. There's a juxtaposition there that is not a mean interpretation of whatever it is that they're advertising, but is actually something unusual that catches the eye, catches the mind in some way.
Dave [12:31]: Yeah, the advertising campaigns that stick in your head and you remember... You know, Pepsi versus Coke or something like that. Although, I'm just going to say now, if I were to reel off some of the advertising campaigns that are stuck in my mind, if nothing else, you'll find they're mostly beer.
Peter [12:36]: Like the "Whassup?" ads, for example. Well, I remember the advert way more than I liked the beer.
Dave [12:44]: laughs Yes. So how do we pull this together or wrap it up in some sort of succinct, pragmatic learnings from our conversation here?
Peter [12:51]: So I think there are a couple of really interesting pieces we touched on around process improvement and agentic AI. There are a lot of ways in which AI can help you improve the processes in your organization, without a shadow of a doubt. And we've expressed some interest in how best to apply that in a cascading way, where you've got agentic AIs feeding into each other - how that might work effectively in certain types of processes where you want to reduce the amount of variability, not increase it.
You may have a process that has very little variability because you've automated a lot of it. Introducing AI directly into that may cause more problems than it solves. But it will also partly depend on what the underlying systems being managed and monitored are, and where the variance points sit. There are a lot of ifs in that, but I think that's one of the key nuggets we're talking about - not all processes are equal.
Dave [13:52]: I really like that idea. It's an "it depends" response, which is: go look at your processes and start categorizing them. I think DevOps is a great example. Loads of work went into DevOps, and much of it was about eliminating variation in processes - getting to consistent, repeatable, really stable, robust ways of deploying features and so on. That's one category of processes.
There are also other categories of process. We talked a little bit about the marketing ones - some marketing messages at least - where we might actually want to explore the edges rather than settle into this consistent, stable, repeatable center. So understanding the types of processes at least begins to define how we might approach them. And then I point to everything you were talking about - the variability. How do you really explore that? How do you capture it? Again, as with process mapping, we don't necessarily think about the variability.
Peter [14:56]: Well, it's kind of interesting, right? We're drifting away from summaries here and circling back again, because AI is perfectly capable of getting you the right responses - it's just that you can't always rely on it to do so. Just as you wouldn't rely on a human sitting there doing that kind of work by hand; you'd have the computer system do it.
What you want the AI to do is say, "I'm not going to be able to answer this, I'm going to send it to the computer," which is exactly what it would do. If it sees that your intent is to solve a math problem, it will call a math library, a programming library, and solve it that way. You need it to behave in that manner, and it will generally.
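A minimal sketch of that hand-off, with a toy regex standing in for a real model's intent detection (looks_like_math and answer are hypothetical names; a production system would rely on the model's own tool-calling): arithmetic gets routed to deterministic code instead of being generated by the model.

```python
import re

def looks_like_math(query: str) -> bool:
    # Toy intent detection: a real system would use the model's own
    # tool-calling to decide this, not a regex.
    return bool(re.fullmatch(r"[\d\s+\-*/().]+", query.strip()))

def answer(query: str) -> str:
    if looks_like_math(query):
        # Hand off to deterministic code rather than letting a model
        # generate the digits. eval is for this demo only - the regex
        # above restricts input to arithmetic characters.
        return str(eval(query, {"__builtins__": {}}, {}))
    # Anything else would fall through to the language model (stubbed).
    return "(a language model would answer this free-form question)"

print(answer("17 * 23 + 4"))           # -> 395, computed, not guessed
print(answer("Why is the sky blue?"))  # -> routed to the model
```

The handoff keeps the probabilistic component where judgment helps (recognizing intent) and the deterministic component where correctness matters (computing the answer).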
Dave [15:43]: I think that's a reasonable summation. I think we've brought in enough ambiguity into our conversation here to show that this isn't AI-generated. I think that's the takeaway from that.
Peter [15:53]: I hope our listeners find this valuable. It's a fascinating topic and I'm always happy to discuss it. If anybody would like to reach out, they can at feedback@definitelymaybeagile.com. Look forward to next time, Dave.
Dave [16:05]: Until next time. Thanks again!
Peter [16:07]: You've been listening to Definitely Maybe Agile, the podcast where your hosts Peter Maddison and David Sharrock focus on the art and science of digital, agile, and DevOps at scale.