Definitely, Maybe Agile

AI Agent Governance in Production with Logan Kelly

Peter Maddison and Dave Sharrock, Season 3, Episode 212



Most organizations are somewhere between experimenting with AI agents and quietly hoping nothing breaks in production. Logan Kelly, CEO of Waxle AI, has spent a lot of time in that gap, and he thinks governance is the piece most teams are walking past too quickly.

In this episode, Logan joins Peter and Dave to talk about what agentic governance actually looks like in practice, why a single consistent layer beats a pile of point solutions, and how to keep developers moving fast without letting things go sideways when it counts.

This week's takeaways:

  • Let your teams experiment. That's how you learn what agents can actually do. Just don't skip governance on the way to production.
  • Governance doesn't have to be a gate. The best version layers in without friction, and gives everyone in the organization visibility, not just the dev team.
  • If a developer has to do extra work to implement a governance feature, that's a design problem. Good governance should work for the developer, not the other way around.

Welcome And Introductions

Peter 0:04 Welcome to Definitely Maybe Agile, the podcast where Peter Maddison and Dave Sharrock discuss the complexities of adopting new ways of working at scale. Hello and welcome again, everybody. I'm joined today by Dave, as always, and today we have Logan Kelly joining us too. So hey Logan, why don't you tell us a little bit about yourself?

Logan Kelly 0:24 Cool, yeah. Thanks for having me. I'm the CEO of Waxle AI, an agentic governance platform. I'm really passionate about creating agentic infrastructure that can do real work and stay in bounds in enterprise environments.

Why AI Governance Suddenly Matters

Peter 0:44 When you think of the governance space around AI, and obviously AI and agents are all the rage these days, it isn't one we hear as much about. I guess people are somewhat worried about the costs, at least the CFO in most organizations is, because if these things go wild you can end up with a massive charge. Not unlike the early days of cloud, where people would spend far too much money, we run into similar sorts of problems. So when you think of the governance space, how do you approach talking to organizations about this?

One Governance Layer Beats Point Tools

Logan Kelly 1:22 Yeah, it's interesting. I feel like AI governance is gonna be the buzzword of at least the second half of 2026. Everybody's gonna have a governance solution; I already see it on Google, every day there's a new entrant. But governance means a lot of things to a lot of different people. In a compliance-oriented organization, governance is one thing. In these cloud companies, where maybe they're deploying tons and tons of agents, cost becomes the thing. And the way I see governance, none of that is separate, because the failure mode in agentic governance means if you don't catch it, you're potentially looking at customer conversations that went sideways, where you're only reading the transcript after it happened. You're maybe emailing Anthropic to get a break on the $5,000 you spent because an agent went crazy. So the way we really talk about it is as a layer in between your agents and the actual execution. I think that makes the most sense to people, rather than trying to add point solutions like cost control over here and PII over there. It's one consistent layer that you can start to plug the different failure modes into.
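
To make that concrete, here is a minimal sketch of a single governance layer sitting between agents and execution, with failure modes plugged in as policies rather than as separate point solutions. Every name here (GovernanceLayer, Decision, the example policies) is a hypothetical illustration, not Waxle's actual API.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Decision:
    allowed: bool
    reason: str = ""

# A policy is just a function that inspects a proposed action and votes.
Policy = Callable[[dict], Decision]

def cost_policy(action: dict) -> Decision:
    # Block any single action estimated to cost more than $1.
    if action.get("estimated_cost_usd", 0) > 1.00:
        return Decision(False, "estimated cost exceeds per-action budget")
    return Decision(True)

def pii_policy(action: dict) -> Decision:
    # Naive check: refuse prompts that look like they contain a card number.
    if re.search(r"\b\d{13,16}\b", action.get("prompt", "")):
        return Decision(False, "possible card number in prompt")
    return Decision(True)

@dataclass
class GovernanceLayer:
    policies: list[Policy] = field(default_factory=list)

    def execute(self, action: dict, executor: Callable[[dict], str]) -> str:
        # Every action flows through the same layer; new failure modes
        # become new policies instead of scattered one-off controls.
        for policy in self.policies:
            decision = policy(action)
            if not decision.allowed:
                raise PermissionError(f"blocked by governance: {decision.reason}")
        return executor(action)

layer = GovernanceLayer(policies=[cost_policy, pii_policy])
result = layer.execute(
    {"prompt": "summarize Q3 pipeline", "estimated_cost_usd": 0.02},
    executor=lambda a: f"(model output for: {a['prompt']})",
)
```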

Dave 2:58 And can you talk a little about that? As you're describing it, I'm thinking that means the scope of work the agents are doing is well understood and outlined very clearly, so that that layer can police it effectively. I mean, you're talking about controlling what that agent's able to do, I believe.

Logan Kelly 3:18 Yeah. Where this came from is that the other company my team and I have built is actually a fully autonomous AI SDR in the sales engagement space. So the way we approached governance was that the first things we had to look at were cost and quality, right? But all of the other failure modes, and all of the other needs for governance, still exist. And if you ask a company to understand every single thing their agent could or shouldn't do right out of the gate, that's too big. So it's really: let's observe what's going on, and then let's put in place the first couple of policies. And from there you start to look: oh, I need to check the grounding on this RAG-based sub-agent, and those kinds of things. We've built the interfaces to be able to see that, so that over time you have operational people implementing policies as opposed to engineers running around with their hair on fire.
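
As an example of the kind of policy that tends to get added after a period of observation, here is a rough sketch of a grounding check for a RAG-based sub-agent. The word-overlap heuristic is deliberately naive and purely illustrative; real grounding checks are usually model-based.

```python
def grounding_score(answer: str, retrieved_chunks: list[str]) -> float:
    """Fraction of answer words that also appear in the retrieved context.

    A crude proxy for grounding: a low score suggests the agent is
    answering from the model's own knowledge rather than the documents.
    """
    answer_words = set(answer.lower().split())
    context_words = set(" ".join(retrieved_chunks).lower().split())
    if not answer_words:
        return 0.0
    return len(answer_words & context_words) / len(answer_words)

def grounding_policy(action: dict, threshold: float = 0.6) -> bool:
    # Flag responses whose overlap with retrieved context falls below
    # the threshold, so a human or a stricter checker reviews them.
    score = grounding_score(action["answer"], action["retrieved_chunks"])
    return score >= threshold
```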

Observe First Then Add Policies

Peter 4:28 Yeah, with everybody trying different things and experimenting all over the place, having some kind of consistent layer makes sense. Do you think of it as an authentication layer then? Essentially, who's allowed to do what to where?

Logan Kelly 4:40 Yeah, I think agentic identity management is kind of a crazy space. When you think about agentic identity, you have two sides. There's authentication and data access: is this agent allowed to access this data, which can mean, is it acting on behalf of a particular user? And then there's the other side: is the agent trying to impersonate somebody? Who is that agent representing, and what is it representing to an end user if it's a chat-based interface? So identity means a few different things in the agentic space, and I think we're only beginning to understand how to control that.
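
A toy illustration of those two sides of agentic identity, under hypothetical names; the point is only that data access and representation are separate checks:

```python
from dataclasses import dataclass

@dataclass
class AgentIdentity:
    agent_id: str
    acts_on_behalf_of: str | None   # the human whose rights the agent inherits
    presents_as: str                # what the agent tells end users it is

def can_access(identity: AgentIdentity, resource_owner: str) -> bool:
    # Side one: data access. The agent only inherits the rights of the
    # user it is acting for; no user, no access to user-owned data.
    return identity.acts_on_behalf_of == resource_owner

def disclosure_ok(identity: AgentIdentity) -> bool:
    # Side two: representation. The agent must not present itself as a
    # human, or as the user it is merely acting on behalf of.
    return identity.presents_as.startswith("AI assistant")

agent = AgentIdentity("sdr-7", acts_on_behalf_of="alice",
                      presents_as="AI assistant (sdr-7)")
assert can_access(agent, "alice") and not can_access(agent, "bob")
assert disclosure_ok(agent)
```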

Peter 5:35 Yeah, I think we're still trying to figure out quite a lot of things in that space as we think about how these things will behave.

Logan Kelly 5:43 Yeah, it's amazing. And that's really where the whole idea of trying to scope out everything before you implement governance is, I think, impossible, right? Because developers can build functionality so quickly, and then all of the other stuff has to catch up with it.

Agent Identity And Impersonation Risk

Dave 6:07 So are there any principles that you use to keep up with that pace of change? One of the things we've discussed on this podcast before, and I think both of us have also seen, is that the pace of change very quickly outpaces any organization's ability to control it if those controls aren't designed with that view. Anything that includes a human in the loop is automatically going to be a slow gate, but there's also just thinking about how the problems are going to come together, what's going to happen when people are really given somewhat of a free rein as to what they can build.

Auto Instrumentation Over Manual Checkpoints

Logan Kelly 6:47 Yeah. So our fundamental belief is that first you need to observe; you need to have a good idea of what's going on inside each of the agents. The second thing we believe in is auto-instrumentation, as opposed to asking developers to continuously implement checkpoints and send data to a control plane before it can come back. The way we see the world, it's got to be really easy to see what's going on, so if something comes up, there's a single pane of glass where you can see everything. And the second thing is reducing the amount of work a developer has to do. So developers should build agents intent-first, and then operations and analysts and product people and finance people need access to be able to see that. To add something to your philosophy: I think it also means the entire organization needs to have access to governance, not just the development team.
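
A rough sketch of what auto-instrumentation might look like in practice, so that a developer never hand-writes checkpoints. The decorator and event shape are assumptions for illustration, not a real SDK:

```python
import functools
import json
import time

def instrumented(agent_name: str):
    """Wrap an agent entry point so every call is observed automatically.

    The developer writes the agent; the wrapper emits start/end events
    toward a control plane with no per-call instrumentation code.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            started = time.time()
            try:
                result = fn(*args, **kwargs)
                _emit({"agent": agent_name, "status": "ok",
                       "duration_s": round(time.time() - started, 3)})
                return result
            except Exception as exc:
                _emit({"agent": agent_name, "status": "error",
                       "error": repr(exc),
                       "duration_s": round(time.time() - started, 3)})
                raise
        return wrapper
    return decorator

def _emit(event: dict) -> None:
    # Stand-in for shipping the event to a governance control plane.
    print(json.dumps(event))

@instrumented("lead-qualifier")
def qualify_lead(lead: dict) -> str:
    return "qualified" if lead.get("budget", 0) > 10_000 else "nurture"

qualify_lead({"budget": 50_000})
```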

Peter 7:57 That makes sense. The intent piece especially is interesting, because I think a lot of what governance ends up being is this: we've got a set of intents. You're using intent for what the agent's doing, but there's also the intent of what we want the outcome to be, defined well enough up front, and then saying, okay, is the outcome we got the thing we intended to get? Do the two things match?

Logan Kelly 8:22 Yeah, that's exactly it. When we talk about intent, we talk about what a developer wants, right? And then you can get into DAG graphs and context graphs and all that kind of stuff. So we see our governance layer as super important in the development of the agent as well, because you have the visibility to put side by side exactly what you were talking about: what's the intent of the developer, and what's the actual outcome? Those mismatches a lot of the time go into production, and that's terrifying.

Peter 9:00 Yeah, you need some form of fail-closed gate there at the end. Something that says: we've got this intent, we got this outcome from the execution of the agents, now we've got to make a decision. Is this something that has sufficiently passed our criteria that we feel okay with it going forward? Or is there enough here that maybe we should stop and pause a moment?
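
A minimal sketch of that fail-closed gate, assuming hypothetical criteria functions. The key property is that anything that cannot be positively validated is held for review rather than waved through:

```python
from enum import Enum

class Verdict(Enum):
    PASS = "pass"
    HOLD = "hold"   # fail closed: anything not clearly passing is paused

def outcome_gate(intent: str, outcome: str, criteria: list) -> Verdict:
    """Compare the declared intent against the observed outcome.

    Each criterion is a function (intent, outcome) -> bool. If any
    criterion fails, or raises, the gate holds the result for review.
    """
    for check in criteria:
        try:
            if not check(intent, outcome):
                return Verdict.HOLD
        except Exception:
            # A broken check must not become an open gate.
            return Verdict.HOLD
    return Verdict.PASS

# Example criterion: the outcome must mention every entity named in the intent.
mentions_all = lambda intent, outcome: all(
    word in outcome.lower() for word in intent.lower().split() if len(word) > 4
)

print(outcome_gate("list renewal risks",
                   "Renewal risks: churn on two accounts", [mentions_all]))
```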

Intent Versus Outcome For Control

Logan Kelly 9:24 Yeah, and where that can really compound or go off track is when you start to think about multi-agent pipelines: agents that can delegate to multiple agents that run asynchronously, right? One poisoned piece of the context can have a downstream effect that becomes a multiplier. So that's where we look at scoping policies to specific agents, delegation, all that kind of stuff. And once again, those are things that, until you really see what the actual execution looks like on paper, make it impossible to govern the whole system from the start.
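
Here is one way scoped delegation might be expressed, purely as an illustration: a sub-agent's permissions are always the intersection with its parent's, and delegation depth is capped, so an asynchronous fan-out can't multiply a poisoned branch unchecked.

```python
from dataclasses import dataclass

MAX_DELEGATION_DEPTH = 3

@dataclass(frozen=True)
class Scope:
    tools: frozenset[str]
    depth: int = 0

    def delegate(self, tools: set[str]) -> "Scope":
        # A sub-agent's scope is the intersection with its parent's, so
        # a compromised branch can never escalate downstream permissions.
        if self.depth + 1 > MAX_DELEGATION_DEPTH:
            raise RuntimeError("delegation depth limit reached")
        return Scope(self.tools & frozenset(tools), self.depth + 1)

root = Scope(frozenset({"crm.read", "email.send", "calendar.write"}))
researcher = root.delegate({"crm.read"})         # can only read the CRM
drafter = researcher.delegate({"email.send"})    # intersection is empty:
assert drafter.tools == frozenset()              # no inherited email rights
```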

Dave 10:11 Sounds a lot like what we've talked about in the past around product delivery and the importance of validating the outcomes you get when some new change goes live, but also validating adoption, that it's meeting the needs and so on. That's always been something that I'd say only the top 20% of organizations, really the top performers, deliver on, whereas a lot of the others have kind of said it's good enough, or we thought this out properly at the beginning, and now it's gone out of the door. And of course, from the agentic perspective, AI accelerates the pace of change so incredibly, and then there's the complexity of a pipeline of different agents working with one another. So the problem is still the same, that validation of outcomes and adoption, but we don't have the skill set in most organizations. Is that a capability organizations have when you're working with them, or is it something you're actually having to put in place?

Multi Agent Pipelines Multiply Failures

Logan Kelly 11:15 I think we're in a time where, for example, we've built our documentation and help docs to be accessible both by MCP as well as built specifically for things like Cursor and Claude Code. The acceleration of agents has also created an acceleration of coding agents and all that kind of stuff. So I think what we're seeing is that the best organizations are the ones thinking not in terms of, do I have humans who have the skill set to do this thing, but, do I have humans who have the skill set to leverage agentic coding? And I'll tell you, I'm sure you've worked with ChatGPT and lots of AI tools: the better you get at asking the questions, the more you learn, and the faster you learn. That's really where competencies in organizations are getting filled in by people you wouldn't have thought, two years ago, would be able to do that job. So to answer your question: when people start to see the data, when you can see what happened, then I think it gets pretty easy not to think through it yourself but to say, hey, here's my code, here's what happened, what are the things I can do? If organizations have that competency of knowing how to leverage AI tools, that gap disappears very quickly.

Peter 12:46 I think some of that often comes down to the amount of trust that the people playing with this have. And as you say, that builds up with time too: the more they play with it, the more they trust that what they're getting back is actually going to deliver. I think we also see that happening more broadly in the industry as well.

Logan Kelly 13:06 Yeah, I think using AI tools is its own skill. It's funny, I get on calls with really smart people, and they think that AI is just going to get you to the end result. With AI, you're going to have to have it refactor your code or rewrite your docs five or six times, right? The more people realize you're not going to get single-shot perfect output, the better off we are.

Peter 13:36 Yeah, I was playing with a new version of a couple of pages on our website this morning and having a lot of fun with that very problem.

Dave 13:48 Yeah. As you're describing some of these, I'm trying to think of the most common scenario you're seeing when you go and talk to an organization that's thinking about governance. And maybe part of that is: where is the call for getting governance put in place coming from?

Building Skills With AI Coding Tools

Logan Kelly 14:07 That depends on the failure modes the companies are most worried about, right? The earliest thing is cost. But I think cost control is going to become a cliché: if you're not controlling your costs, you're just irresponsible. There's going to be a point where it's not "oh, AI went crazy." It's a tool you're using; if you get a hammer and smash your hand, that's your fault. Then it comes down to a couple of big places. One is code-based agents, whether they're executing code in a coding agent or executing code to debug things in real time with some level of autonomy. There, the question is how we're ensuring those code executions aren't going to directories they shouldn't be in. And it's funny, things like Claude Cowork, for example, execute a lot of code to traverse directories and that kind of thing. So code execution isn't a concern just for developers; it's a concern for agents used by more of the business folks. The other big thing is PII leakage, or IP leakage into prompts. All of a sudden you have memory, and you have observability logging that isn't redacting the credit card number one of the users copied and pasted. So: are we able to audit something? Are we logging it, but then also, what's actually in those logs? And is there stuff in there that nobody should be able to see after the execution?
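
A small sketch of the kind of redaction being described, applied before anything reaches logs or agent memory. The patterns below only catch obvious card-number and email shapes and are illustrative, not a complete PII scrubber:

```python
import re

# Obvious 13-16 digit runs, allowing spaces or dashes between groups.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(text: str) -> str:
    """Scrub likely PII before text is logged or stored in agent memory."""
    text = CARD_PATTERN.sub("[REDACTED-CARD]", text)
    text = EMAIL_PATTERN.sub("[REDACTED-EMAIL]", text)
    return text

raw = "Customer pasted 4111 1111 1111 1111 and jane@example.com into chat."
print(redact(raw))
# Customer pasted [REDACTED-CARD] and [REDACTED-EMAIL] into chat.
```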

Peter 16:00 Yeah, those are all really good points. There are all sorts of wonderful, fun new failure modes that we start to encounter with some of these things. Have you been running into situations where other parts of the system start to fail as a consequence? People have come in and introduced, say, AI into the coding practices, and the common one everyone's talking about is that pull requests go out the window, because why are you looking at the code anyway? How are we then validating from a governance perspective? Which I think may take us back to that intent and outcome piece, but what's your perspective on that?

Common Triggers Cost PII IP Leakage

Logan Kelly 16:41 So I think there are a lot of different ways you can go there. If we look at code quality, or at other parts of the system failing that might be connected to an agent, one of the most dangerous things is when an agent starts to retry things, or you have iterative loops that can't exit. That could be a token-burning thing. And it all comes down to this: if other parts of a system are causing an agent to have errors, those errors can be quickly compounded, whatever they might be. So part of governance is: how many times do you let an agent retry something? What's the timeout? In classic cloud SaaS, if an API endpoint fails, the user might try it a couple more times and then go away. With an agent, though, that could be hammering your back-end server thousands of times before you realize what's going on. So part of the harness of governance is having the ability to tighten it up, to harden the agent against connected sources failing, whether internal or external. That's a huge piece of governance that can save costs, save servers expanding when they don't need to, et cetera.
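
A sketch of that part of the harness: a retry budget with exponential backoff and a hard time limit, so a failing dependency can't be hammered thousands of times. The function name and limits are assumptions for illustration.

```python
import time

class RetryBudgetExceeded(Exception):
    pass

def call_with_budget(fn, max_retries: int = 3, timeout_s: float = 30.0):
    """Run fn(), retrying on failure within a hard retry and time budget.

    Unlike a human who gives up after a couple of attempts, an agent
    will loop forever unless the harness makes it stop.
    """
    deadline = time.monotonic() + timeout_s
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries or time.monotonic() >= deadline:
                raise RetryBudgetExceeded(f"gave up after {attempt + 1} attempts")
            # Exponential backoff: 1s, 2s, 4s, ... capped at 10s.
            time.sleep(min(2 ** attempt, 10))
```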

Dave 18:14 And I think that really speaks to what you were saying earlier about AI agents not being owned in development. They're everywhere as an organization gets comfortable with this. There are going to be agents being built by anybody who's got a desktop and some form of access to something that helps them build one. But also, presumably, agents within an agentic flow being replaced with something that's been cooked up on somebody's desk somewhere, just to get it right for them. These introduce a lot of room for misinterpretation and so on. So I'm trying to imagine the role of a desktop developer somewhere in the organization who's not really a developer, or wasn't until two weeks ago. What hoops do you expect them to be able to go through? Is it centralizing the agentic systems they're using to create the agents themselves? Is it some other mechanism? How are they able to integrate into a system in a way that's not going to harm the organization?

Retry Loops Timeouts And Server Hammering

Logan Kelly 19:18 Yeah, I think that's where you have this cascading idea of governance. In any governance, one of the problems is when people operate outside of the governance plane, right? You see this with access on virtual machines: somebody just logs into their email from their personal computer, and there's nothing the sysadmin can do about that. So where the ideal world goes is, say it's Claude Code and you have an enterprise that's using Claude Code: Claude Code gets instrumented with Waxle, or whatever governance you're using, and that becomes central. Now, when an agent is being developed, part of the quality governance is: is this instrumented with our governance tools? That's where auto-instrumentation comes in. This person has never developed an agent in their life before, but with the three lines of code that auto-instrument the thing, now at least we have some visibility. Maybe it's not as granular as a production agent, but maybe it has enough surface area to manage file access and token costs and all that kind of stuff. So it's really federating the first layer of governance, which is: where are these agents being created? And from there, if that governance has the auto-instrumentation capability, it gets a lot easier downstream to ensure that at least nightmares are not happening without people seeing them.
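
Continuing the instrumentation sketch from earlier, the "three lines of code" for a first-time agent builder might plausibly look like this; the agent_governance module and its calls are hypothetical stand-ins for whatever governance SDK is in use:

```python
# The only governance-specific code a first-time builder has to add:
import agent_governance                             # hypothetical SDK
agent_governance.init(api_key="...", app="my-first-agent")
agent_governance.auto_instrument()                  # wraps tool calls, file access, token spend
```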

Dave 21:08 And it sounds to me like you're not putting too much of a barrier in front of exploration, which is of course one of the other things: anytime I have to jump through lots of hoops to try something, I'm either not going to try, or I'm going to work around it.

Logan Kelly 21:20 Yeah. I think the worst thing you can do in governance, especially in the agentic world, where there's so much cool stuff you could move on to, is make it so a developer has to keep adding stuff, because a developer is not going to, and an end user who's messing around with an agent to make their job easier definitely is not. We always ask ourselves as we're building this: is our SDK working for the developer, or are we making the developer work for the SDK? And if the answer is that the developer has to do stuff, that's probably a place that's going to be a governance gap.

Governance For Non Developer Agent Builders

Peter 21:56 How do you deal with the security side of these pieces? We've got things like vulnerability management and AppSec, where we're going to do scanning and we're not going to allow it past this gate. We've been talking a lot about governance of the agent pieces, but often they have to flow through other systems as part of that. Do you incorporate that, "hey, this was done right," as a part of the governance?

Logan Kelly 22:21 So, like a pre-deployment check of the agent? Is that what you mean? Yeah. So we have a CLI tool, and we do this with all of the agents we build internally too, where part of our actual deployment pipeline is validating things like loops that aren't going to run infinitely, or agents that don't have policies attached, that kind of stuff. We're not like Snyk; we're not going to go in and say all of LangChain has a vulnerability. What we're looking for is: are these instrumented properly so that terrible things won't happen? We've found a lot of patterns, and we're constantly building different agents using different frameworks to pick up on those patterns, so we can at least warn people in those CI/CD pipelines: hey, this is something you should look at, maybe before it goes into production.
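
A toy version of such a pre-deployment check: scan agent source for `while True:` loops with no apparent exit, a classic token-burning pattern. A real CLI would check far more (policy attachment, instrumentation, timeouts), and this heuristic is only illustrative.

```python
import ast
import sys

def find_unbounded_loops(source: str) -> list[int]:
    """Return line numbers of `while True:` loops containing no break.

    A crude CI gate: an agent loop with no exit condition is a classic
    token-burning failure mode unless some budget bounds it.
    """
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.While):
            runs_forever = (isinstance(node.test, ast.Constant)
                            and node.test.value is True)
            has_break = any(isinstance(n, ast.Break) for n in ast.walk(node))
            if runs_forever and not has_break:
                flagged.append(node.lineno)
    return flagged

if __name__ == "__main__":
    lines = find_unbounded_loops(open(sys.argv[1]).read())
    if lines:
        print(f"unbounded loop(s) at line(s) {lines}: add a budget or exit condition")
        sys.exit(1)
```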

Peter 23:21 That makes sense. Any more questions, Dave?

Dave 23:24 I was just gonna ask, Logan: what's the perfect problem that comes to the table? If you get onto a call later today, what's the one where you'd say, this is exactly where we're most excited to help out? Is there anything specific you're looking for? We've talked about a range of different things around agents and using the coding tools, but you're probably at the cutting edge in a couple of key areas. What's the nice, exciting project that you want to get your hands on?

CI CD Checks For Safer Agents

Logan Kelly 23:54 The exciting project for us comes down to two phases. The first is: is the company out of the experimentation phase? And the second is: are there high stakes in their agent going wrong? That could be things like data leakage, that could be quality, but it's these places where people are really intense about "our agent cannot do this." Those are the developers and engineering teams and operations teams where, when we talk to them, it's night and day. This engineer has spent so much time trying to build the guardrails at the agent level, but if you can just put the harness around it in three lines of code, we can blow people's minds. So it's really that high-stakes, "we can't have this happen" situation where we can show our magic the easiest.

Dave 24:54 And could you define getting out of the experimentation phase? That was your first one, right?

Logan Kelly 24:59 So yeah, I think it's when the internal organization has said, yeah, this agent is doing the stuff we needed it to do. You might already have it in production and be starting to see some of this stuff trickle in, or you might be saying, we could launch this next week. It's that kind of phase, or completely in production, where we definitely need to get this going. But it's on the other side of the "can we make it do this thing" question, because that's where you don't really want governance in the way. You want to be able to run and have fun with building agents, which is the coolest thing ever.

Dave 25:46 So it's that innovation phase you don't want to get in the way of. But as soon as they're lining up production, it's like, okay, let's make sure it works.

Peter 25:55 Awesome. Well, I think we're getting to the point of the day where we normally wrap up these conversations. So thank you so much for all the input, Logan. Normally at this point, for our listeners, we like to sum everything up with three points for them to take away. And as our guest, Logan, I'll let you go first. What would you like our listeners to take away with them today from our conversation?

Best Fit Customers And Final Takeaways

Logan Kelly 26:17 Yeah, I'll go back to what we were just talking about. I think agents should be innovated on and had fun with in an organization; you can see they can solve so many different problems. But don't walk past governance before you launch the agent into production.

Dave 26:36 You're gonna leave me with one there as well, Peter. Logan, thanks for the conversation; I really picked a few things up. One of the key takeaways from my side is that part of my head still sits where governance is painful. It's the thing that takes the fun out of the room in some ways, which I recognize is perhaps a very old attitude. And in agentic flows and AI in general, it can get the hackles up of the people who are responsible for it. So what I've really appreciated about this conversation is the ease with which governance layers into the work that has to be done. It's a great reminder that this can be a conversation. It's not a halt or a gate or anything like that; it's just, how on earth are we going to make sure this is safe when it's in production?

Peter 27:30 Well, I think for mine, I liked your discussion around how you look at every new feature you're introducing, which builds on what Dave was just saying: how is this going to impact the developer? If the developer's gonna have to do work to implement this, then it's probably not the right way of implementing it, or maybe we need to go back to the drawing board and think about what it is we're trying to do here. So I do think that's a very good piece around governance for sure.

Logan Kelly 27:57 This was a great conversation, guys. Appreciate the time.

Peter 28:00 Yeah, awesome. Thank you both, as always, and I look forward to next time. Thanks a lot, Logan. You've been listening to Definitely Maybe Agile, the podcast where your hosts, Peter Maddison and Dave Sharrock, focus on the art and science of digital, agile, and DevOps at scale.