LLM codegen fails and how to stop 'em — Danilo Campos, PostHog

Building reliable autonomous coding agents, like the "PostHog wizard" for software integration, requires specific strategies to overcome common challenges inherent in large language models.
Problems such as outdated model knowledge ("model rot"), generating inconsistent or sub-optimal code, and security risks can be mitigated through structured input and controlled tool usage.
Ultimately, guiding agents effectively relies more on providing fresh, well-sequenced "plain text prose" (documentation, examples) and continuous self-interrogation than on complex code scaffolding.

Address Model Rot with Fresh Context: Combat outdated model knowledge by actively feeding agents fresh, up-to-date documentation (e.g., Markdown files) directly into their context window for immediate reference.
Guide Architectural Decisions with "Model Airplanes": Prevent agents from making "weird architectural decisions" by providing lightweight, exemplary project snippets ("model airplanes") that showcase the ideal integration patterns and shapes in a token-efficient manner.
Limit Improvisation by "Breadcrumbing": Guide agents through tasks by providing a sequence of small, incremental instructions, starting with broad questions and gradually narrowing focus, rather than giving a single upfront command, to ensure consistent outcomes.
Implement Inference-Time Interrogation: At the end of each agent run, ask the agent what could have been done better to improve its success. This "robot user research" helps identify issues like contradictory instructions, missing tools, or incorrect context.
Enforce Fine-Grained Tool Security: For sensitive operations, such as handling .env files, restrict direct agent access. Instead, build specialized, minimal tools that only allow specific, safe actions (e.g., checking for a key's presence or writing a new value to a key) to prevent data leaks or destructive actions.
Prioritize Prose Over Code for Agent Directives: Recognize that well-structured plain text prose (documentation, instructions) is a highly valuable asset for agents, as it remains effective and scales well even as new, more capable models emerge. Focus on sequencing and clarity of information rather than over-scaffolding agent behavior with code.
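
The restricted .env tool from the takeaway on fine-grained tool security can be sketched roughly as follows. This is a minimal illustration, not the wizard's actual code: the function names `env_has_key` and `env_set_key` and the simple `KEY=value` line format are assumptions, and so is the choice to append a key that is not yet present. The point it shows is that neither operation ever returns a stored value, so nothing from the file has to travel into the model's context.

```python
from pathlib import Path
from typing import Optional


def _parse_key(line: str) -> Optional[str]:
    """Return the key named on a .env line, or None for blanks and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return None
    return stripped.split("=", 1)[0].strip()


def env_has_key(path: Path, key: str) -> bool:
    """Report whether `key` is defined, without ever returning its value."""
    if not path.exists():
        return False
    return any(_parse_key(line) == key for line in path.read_text().splitlines())


def env_set_key(path: Path, key: str, value: str) -> str:
    """Set `key` to `value`; return only a status word, never file contents."""
    lines = path.read_text().splitlines() if path.exists() else []
    for i, line in enumerate(lines):
        if _parse_key(line) == key:
            lines[i] = f"{key}={value}"  # overwrite in place
            status = "updated"
            break
    else:
        # Hypothetical choice: append when the key is not present yet.
        lines.append(f"{key}={value}")
        status = "added"
    path.write_text("\n".join(lines) + "\n")
    return status
```

An agent harness would register only these two functions as tools and deny generic reads of the file, so the agent can finish the setup without seeing any secret.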

PostHog wizard: An autonomous coding agent specifically designed to automate the integration of PostHog analytics into various software projects.
Model rot: The degradation of an AI model's performance or relevance over time as the real-world data it was trained on becomes outdated or irrelevant.
Context window: The limited amount of information or text that a large language model can process and "remember" during a single interaction or generation.
RAG (Retrieval-Augmented Generation): A technique for LLMs to retrieve relevant information from an external knowledge base to improve the accuracy and recency of their generated responses.
Autonomous coding agents: AI-driven systems capable of independently understanding, planning, and executing coding tasks, such as generating, modifying, or integrating code.
Model airplanes: Lightweight, exemplary code projects or snippets that demonstrate correct architectural patterns or integration methods for an AI agent to learn from.
Breadcrumb the agent: A strategy of guiding an AI agent through a complex task by providing a series of small, sequential instructions or hints, rather than a single comprehensive command.
Inference-time interrogation: The process of prompting an AI agent to provide feedback on its own performance or challenges immediately after completing a task.
Tool usage (in LLM context): The ability of an AI agent to select and interact with external software tools, APIs, or functions to perform specific actions or access information.
Skill files: Curated collections of documentation, example code, and specific instructions that provide an AI agent with the necessary knowledge and capabilities to perform a task.
LLM gateway: An intermediary service that manages requests to large language models, often handling token usage, routing, logging, and security.

Good morning. Who's afraid of robots? Afraid of robots? I'm not afraid of robots, because they have already bloodied my nose so many times there's no more pain that they can give to me. And that is what I want to tell you about on this fine morning. Thanks for coming to hang with me.
So my name is Danilo. I work at PostHog and I make the PostHog wizard. And the very strange thing that the PostHog wizard does is it skips two hours of misery that you will never get back in your life and it hands it back to you as eight minutes of pseudo-entertainment. Now, how do we get away with this? We're talking 15,000 people every single month run this wizard, and in exchange for their trouble they get a PostHog integration that works and that they actually like. How do we do it? I'm going to tell you all about it today. And just to underscore the point that this actually works: from the last six hours, we got two unprompted posts on Bluesky and Twitter where people are actually happy.
Now, this should be terrifying. I've got a robot out there, it's writing code for people. What if it's doing a bad job? Well, we learned all the ways that it could do a bad job. I'm going to tell you the ways that those bad jobs happen, and I'm going to tell you some strategies that you can use so that your autonomous coding agents do the right thing as well.
All right. Let's start with the easy one. We've got model rot. Now, training a model takes a lot of time, but it's not even the time, it's the money. You're not screwing around as Anthropic, training a model on a weekend as a lark. This is a serious capital expense. The trade-off with this is that the models sit there no longer representing reality. They are a snapshot of the world and the web as it was six, eight, twelve, eighteen months ago perhaps.
Now, this is useful for many things, but if you're a fast-moving software project, and there are loads of fast-moving software projects, the trade-off of this is that the model doesn't know what the hell is going on anymore. So you've got to deal with model rot. Now, this is fairly straightforward stuff. You've probably dealt with this sort of thing before. Does anyone here have a conviction about how you deal with model rot? I'm seeing some shaking heads. What's that? RAG is good. Although I'll tell you what: with the context windows being what they are at this point, you can't beat just shoving a bunch of Markdown files into the context and patching the holes. And this is exactly what we do with the PostHog wizard: we have documentation that is fresh, hot off the presses on posthog.com. And we allow the agent to make a selection. We say, hey, what are you doing? What are you integrating right here? What have we detected? And the agent can use tools to go out and pick from a menu of fresh, hot Markdown that it can then just slide right into its context, get the job done, do things correctly.
Now, what happened to spur all of this was that a year ago, people started asking their very primitive agents, like, all right, Cursor, I want you to integrate PostHog for me. And it would do a terrible job. Right? It's making up keys. It is making up patterns. It is inventing APIs that don't exist. And it is not our fault. Like, we didn't do anything, but it was our problem. So figuring out ways that we could serve correct, up-to-date context to the agent so that it would do the correct job is part of how we get people posting happy about what the wizard did for them.
All right. Now, these models, I mean, clearly they've been scraping every kind of project out there. And I don't have to guess that not all of them had great architecture, because some of the decisions that these agents make when they're putting a project together are very strange.
And so what do you do? How do you deal with the fact that an agent's conception of how to put something together may be technically, like, workable, but not exactly ideal? Well, me and my homies on the PostHog wizard team, we maintain a fleet of what we call model airplanes. And these are projects that have PostHog implemented in them. We've got them across a bunch of frameworks, a bunch of languages. But what makes it a model airplane is that we don't have an entire proper production application going in there. What we have is something much thinner, something that is a simulacrum of a real application. So, for example, the auth doesn't work. Or rather, the auth works for anything. You can just put whatever you want in the password field and you're going to be able to log in. But the auth is auth-shaped, which means that we can provide these model airplanes to the agent. And then the agent knows, oh, cool, so when auth shows up, this is a great place to put the particular event tracking, the kind of event tracking that one would want to use when they wanted to track logins and identity in PostHog. And so through the maintenance of a thing that isn't quite as elaborate as the real production application, which means also, of course, that it is more token efficient, what you get is the correct shape of an integration as a pattern that the model and agent are able to complete consistently every time.
So in addition to weird architecture, the agent can find a weird path through the problem space. And with 15,000 integrations per month, it might find 15,000 ways to get a PostHog integration done. And while this would satisfy the requirement of "we've automated integration," it would leave us with a very strange support burden, because we would have too many different ways that PostHog was set up. It's like, what the hell is this? How do I make sense of this? This would be a problem at scale. This would be some Sorcerer's Apprentice stuff.
So to limit improvisation, what we do is breadcrumb the agent. We don't tell the agent up front exactly what we're going to do. You know, maybe you've seen this before, even when you're using Claude Code: if you tell it exactly where you want to go, it might make a Claude Code-shaped hole through the first four tasks and then just get really rock-polishy with the fifth. And this is not what we want for our case. And so one of the things that we do is we start off barely even telling the agent that this is what we're doing. We don't even mention, really, that we're doing a PostHog integration. We start with something like: where are the files with interesting business value in this project? Can you find something that looks like a login or a Stripe interface or something that might indicate someone's about to churn? We go looking for the files that would be responsive to impact in somebody's business. And the funny thing is that business stuff casts a huge shadow in code. And so we can very reliably detect this kind of stuff.
Now from there, we say, okay, here's some cool files. What are the interesting events going on in those files? Don't write any code right now. Let's just think about some cool events that we might want to sprinkle through here. What might those be? So we make a list of these and get the event names. We get the descriptions for those events and we just tuck them into a little file. And this is the start of things. We don't even know where we're going, necessarily.
So the next breadcrumb is like, okay, let's start to actually implement PostHog. We now know a bunch of events. We've really thought carefully about what those events might be. And now we have documentation and everything, which we can load at whim according to the framework and language that we care about here. And so we can reliably go in there and start to make modifications to people's files. And the modifications are, once again, not stupid. And they're not mad for it. Okay.
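
The breadcrumb sequence described above can be sketched as a loop that hands the agent one step at a time. This is only an illustration under assumptions: the prompt wording paraphrases the talk, `agent` is a hypothetical callable standing in for whatever harness actually runs the model, and carrying earlier replies forward in memory is a simplification of tucking the event list into a file.

```python
from typing import Callable, List

# Hypothetical prompts paraphrasing the three stages described in the talk.
BREADCRUMBS: List[str] = [
    "Which files in this project carry interesting business value, "
    "such as login, payments, or signs that a user may churn?",
    "For the files you found, list event names and descriptions worth "
    "tracking. Do not write any code yet.",
    "Using the event list and the attached framework docs, "
    "implement the PostHog integration.",
]


def run_breadcrumbs(agent: Callable[[str], str], steps: List[str]) -> List[str]:
    """Send one step at a time; later steps stay hidden until their turn."""
    replies: List[str] = []
    for step in steps:
        # Carry forward earlier answers as context for the next step.
        notes = "\n".join(f"Earlier finding: {r}" for r in replies)
        prompt = f"{notes}\n\n{step}" if notes else step
        replies.append(agent(prompt))
    return replies
```

Note that the first prompt never mentions the integration at all, which mirrors the idea of starting broad and only narrowing once the business-relevant files and events are known.
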
Now, we can do all of the thoughtful stuff that we can to make the agent successful. But the biggest threat to our agent outcomes is ourselves. We're feeble little beings. We've got a little bit of meat right here locked inside of our heads. And we have a context limit too. We can't really quantify it. And it varies by how long ago we had some coffee and whether we had breakfast that morning. Our context is not just limited but fragmentary. There's stuff that we remember implementing last week and there's stuff that we forgot from last month. And so we're making changes and we're editing code and we're evolving the stuff that our agent is working around. And sometimes we are dropping things that really matter.
And so there was a point where we had an MCP tool instruction that was contradictory to a different tool. And the agent is like, man, I don't know what to do here. You put me into an impossible spot. We had a situation where we were telling it, hey, there's a tool that you definitely need to use to conclude this setup. And the agent's getting there: all right, cool, let's use the tool. Wait, the MCP does not have a tool by this name. And we'd get like hundreds of runs going with this missing tool. And what's going on there? So if we didn't ask, we wouldn't know.
And so one of the things that you can do that is really handy and fairly cheap is a little bit of inference-time interrogation of what just happened with your agent. At the end of every run, right at the stop hook, we ask a very simple question. We're doing a little bit of user research, but the user is in this case a robot. We ask the robot user, hey, what could we have done better to set you up for success in this run? And then it tells us. And that's how we found out. Like, hey, we didn't give you permission to access the tool, and so there was no tool. We'd still have these contradictory directives without this ongoing interrogation. Oh, a good one is that we kept giving it instructions for JavaScript.
And it was a Python project that it was working in. Of course, very frustrating, but we could identify it that way. Big deal: you have to ask to find out.
Now, there's also shenanigans you've got to be concerned about here, because running an agent on someone else's machine demands a huge amount of trust, right? We've got this robot that could do anything, potentially. And we don't want it to do something bad or destructive to the user's project. We don't want to put them in a worse spot. And one of the early versions of our wizard would actually just read .env files, which is necessary to do writes. Right? You can't just write blind to a file. It's just one of the mechanics of how, you know, these agents work. And it's also not ideal to be sending people's .env contents up to Claude. And it's like, all right, cool, that's sitting in someone's damn log that you don't know about. So this is obviously bad news.
But when you're designing these things, you have fine-grained control over tool usage. Right? You can decide: all right, these tools are OK. These kinds of reads are OK. These kinds of reads are not OK. So we really locked down what the agent was allowed to do around anything that was an .env file. And then we were able to build it a tool that could do two things. It could check the presence of a key: does this key exist? And it could write a new value to a key. And that was it. There was nothing going up in terms of inference for this .env file. And so as a result, we were no longer touching this stuff. But again, man, you are setting loose these robots on anybody's computer. You've got to keep an eye on these shenanigans, because even if you're kind of solving the problem you promised you'd solve, doing it in a bad way makes you look like an asshole.
All right. Now, this is the big one. This is the weird one, because our whole careers we have been rewarded for writing the code. Write the code, write the clever code. Oh, I've got a structure in here.
This thing works. This thing is reliable. This thing is elaborate, but the performance is really good. And we've just got to code the shit out of this thing. If we code our way out of this problem, everything's going to be great. All right. That is not the world that we live in anymore. And a very funny thing about code is that if today you have written some code that you think is good, and tomorrow a new model drops, the code that you wrote has the exact same value. If anything, it might be declining a bit. Right? Code has always been a depreciating asset. You write it. You might ship it into the world a little bit rotten. And you might think, this has got some tech debt and you've got to deal with that at some point. But meanwhile, you shipped on time. You did what you had to do.
The wizard that makes everybody so happy is 90% Markdown files, 8% tools for delivering and processing Markdown files, and then the rest is, like, agent harness stuff. Right? Plain text prose is where so much of our value now lives. When you write great prose today, and tomorrow an even better model drops, it's going to be able to take that prose and do even more with it.
And so an agent is an octopus. Right? It can wriggle. It can squeeze into tight corners. It can maneuver itself around problems. You do not want to overconstrain the agent in its ability to get problems done, aside from the shenanigans stuff we talked about. So instead of thinking about, like, man, how can I scaffold the hell out of the behavior of this agent, it's about saying: how do I step back? How do I give it enough information? And how do I sequence the information that I give it so that it does the thing that I want it to do, and it makes people happy in the process?
So this is what I know from the robots bloodying my nose. Seeing my clock here, I've got a couple minutes left. Does anyone have questions about the strange adventure of building this robot that makes people happy? Shoot. Yeah. Oh, sure.
So the way that we drive context for the wizard is we use skill files that are generated from our context service. And so that context service is going to take all of those model airplanes, flatten them into a single Markdown file, and then include them as a reference in the skill file. And so we always have access to the full model airplane, which the model can grep and otherwise churn through.
Oh, sure. So yeah, this is just part of the supplemental content that is included in the skill. And so what we found was there was a range of useful input that we could include as part of the skill file. So we've got documentation, which is plain text prose, but then we also include the model airplanes so that it can see the shape of a successful integration. And it references all of that as part of getting the job done.
Shoot. Oh, sure. So this uses the Claude Agent SDK, which we then wrap inside of a CLI. And so you just run a single command. And then we give you free inference by logging into PostHog. We've got this LLM gateway where we can cover all of the tokens on your behalf, which was a whole zoo, because sometimes Claude would store auth information in a place that we weren't expecting. And then it would just break for people. It's early days for doing this as kind of a service.
Anything else I can tell you? Then I'm going to scoot out of the next speaker's way. Thank you for hanging out. It's great to see you. Have yourself a great rest of your day.
