Lessons on AI agents from Claude Plays Pokemon

The AI world has talked so much about agents in the last year. But for a lot of the world, I think it's pretty hard to really understand what that means. And I think the example of Pokemon, where it's not just a chatbot where I type in a chat and get a response back, but it's doing this on its own, trying things and taking actions, is a way that more people have been able to latch onto what this agents thing we're talking about is. Today we're diving into the story behind Claude Plays Pokemon. I'm Alex. I lead Claude Relations here at Anthropic. And I'm David. I'm on our Applied AI team and I made Claude Plays Pokemon. David, can we get a broad overview of what Claude Plays Pokemon is, for those who aren't aware? Claude Plays Pokemon is a sort of experiment with agents that hooks up Claude, our language model, to the Game Boy game Pokemon Red and lets it actually try to play the game. We started from square one: start a new game and see how Claude does at learning to be a Pokemon Master. Okay, there's a lot there that we're going to have to get into on the technical side. And also, just generally, how is Claude playing Pokemon? How does it know how to play? But how did this actually come about? What was the idea for this, and why Pokemon? Yeah. So the core genesis of why I started this was that I work with customers at Anthropic. I spend a lot of my day working with our customers. And it was pretty obvious to me last year that agents were the most important thing happening for our customers. It's where they were creating value. And I just wanted some sort of test bed for myself to experiment with agents, to get a feel for how Claude works when it needs to take a whole bunch of actions in a row without any conversation with a human. I had actually seen that someone before me at Anthropic, Elliott, had hooked up Claude to Pokemon.
And so that was sitting in the back of my brain. And then I thought, I really want some sort of place to do my own experimentation with agents. I'm a long-term Pokemon fanboy. It was the first game I ever got when I was a kid. That was really the initial motivation, and this was June of 2024, last year, when we had a new model coming out, 3.5 Sonnet. It just seemed like the perfect time to give it a go. Besides that nostalgia component, why specifically choose Pokemon as the test bed? Why not Claude plays Mario or Zelda or any other game? Pokemon is actually a really good setup for this. Language models, for all their graces right now, are slow. They take one generation at a time. They basically can only see a snapshot of the game at any point in time. And in Pokemon, there are essentially no consequences for sitting and waiting for a while. The combat is turn-based, so you just have to press a button and wait to see what happens. Not much happens when you're not moving. So in terms of games for a model to be able to play, it's sort of perfectly set up. And then separately, why games in the first place is also a pretty reasonable question to get at. One of the fun things about games is that they're one of these unique things where you do something for a really long period of time and get pretty clear feedback on whether you're making progress. So whether it's a score counter in Pong that shows you whether you're beating your opponent, or in Pokemon, am I getting gym badges? Am I making my way through the game? You actually get a feedback loop where you can see whether the model is actually being successful. Can it or can't it do this thing?
And so games just happen to be this great environment where you can let a model try to do something for a long time and, by the structure of the games themselves, get a sense of how well it's doing in some measurable way. Right. Okay, so to summarize that: Pokemon's great because it's turn-based. It's not synchronous, necessarily. You're not playing against other people. So we couldn't have Claude play Call of Duty or something; that would just not work right now. It would be really hard. If you could get Claude running at 60 frames a second, maybe you could play some other game, but practically, not right now. Yeah, exactly right. Right. And then games are great because they're kind of their own simulated, contained environment. Yeah. Let's go through the entire process exactly. Yeah. We have Claude playing Pokemon. You've started to work on this, but how does that actually work? How is Claude controlling the character in the game and actually making things happen? In general, you can think of the prompt for Claude to start as basically, "You're playing Pokemon." It's really simple. I just tell the model it's playing Pokemon and that's it. And then you hook up Claude with a set of tools that it needs to be able to actually interact with the game. And it's actually a pretty simple set of tools, the minimal set you get in a Game Boy game: pressing buttons on the Game Boy. So basically you tell Claude it's allowed to press A, press B, hit up, down, left, right, that kind of thing. You describe that tool to it, and behind the scenes I have to implement some code so that when Claude says it wants to press A, that actually goes and presses A on the emulator. I see. So the tools are options for the next action it can take, and then you get the output from Claude and translate it into the action. Exactly. Bingo.
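A minimal sketch of the button tool described above, assuming a JSON-Schema-style tool description and a stand-in emulator object. The names `PRESS_BUTTON_TOOL`, `FakeEmulator`, and `handle_tool_call` are hypothetical and are not taken from the actual Claude Plays Pokemon code; Start and Select are included only because they are also Game Boy buttons.

```python
from dataclasses import dataclass, field

# Buttons on a Game Boy; the conversation names A, B, and the four directions.
VALID_BUTTONS = ("a", "b", "up", "down", "left", "right", "start", "select")

# Hypothetical tool description shown to the model (JSON-Schema style).
PRESS_BUTTON_TOOL = {
    "name": "press_button",
    "description": "Press a single button on the Game Boy.",
    "input_schema": {
        "type": "object",
        "properties": {
            "button": {"type": "string", "enum": list(VALID_BUTTONS)},
        },
        "required": ["button"],
    },
}


@dataclass
class FakeEmulator:
    """Stand-in for a real emulator; it only records which buttons were pressed."""

    pressed: list = field(default_factory=list)

    def press(self, button: str) -> None:
        self.pressed.append(button)


def handle_tool_call(emulator, name: str, arguments: dict) -> str:
    """Translate one tool call from the model into an emulator action."""
    if name != "press_button":
        return f"error: unknown tool {name!r}"
    button = str(arguments.get("button", "")).lower()
    if button not in VALID_BUTTONS:
        # Return the error as text so the model can see it and try again.
        return f"error: unknown button {button!r}"
    emulator.press(button)
    return f"pressed {button}"
```

Returning errors as plain text rather than raising keeps the loop running when the model asks for something the tool does not support.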
So the whole game is essentially Claude taking a series of actions, mostly just pressing buttons one at a time. Now the key feedback loop in how it does this is that every time it presses a button, we send it back a screenshot. So Claude actually sees the screen of the Game Boy. It can see, I'm in this room, or I'm in a Pokemon battle, or whatever it is, and so the next thing I want to do is press A to talk to this person or to select this move or whatever it is. And then it kind of just does that forever. It iterates on that. Now, practically, we've actually given it a handful of other tools that help it do things like manage its memory over a long period of time. Yeah. Claude has a limited amount of context. It can't actually fit a whole Game Boy rollout, a whole playthrough, in one context window. So there's a lot you need to do to manage the details of how to get that right. Right. But at its core it's sort of "while true": press buttons and see if you can play the game, and give it some feedback along the way so it knows what it's looking at. Yeah. And so as you've been constructing that kind of harness around Claude to get it to play, you're running into some of those limitations. One of the things you've encountered is the memory issue. Yeah. Maybe you can speak a little bit about what you had to do specifically there. Yeah. I'm going to zoom out of that question for one second first, which is that there are all sorts of ways people have come up with to build these agents that have to take actions for a long time. There's this general concept of an agent, which is just that the model needs to take some set of actions and we really don't know what they are ahead of time. But the model is going to have to do something, see what happens, do the next thing, learn. And people have come up with all sorts of crazy ways to wire that up to try to make models better in any number of situations.
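The "while true, press buttons, get a screenshot back" loop can be sketched roughly as follows. This is an illustration under stated assumptions, not the real harness: `choose_button` stands in for a call to the model, and `make_toy_game` is a made-up one-dimensional "game" so the example runs without an emulator.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


def run_agent(
    choose_button: Callable[[str, History], str],
    press: Callable[[str], None],
    screenshot: Callable[[], str],
    max_steps: int,
) -> History:
    """Act, observe, repeat: the model picks a button, then sees the new screen."""
    history: History = []
    screen = screenshot()
    for _ in range(max_steps):  # a real run would loop until stopped
        button = choose_button(screen, history)  # stand-in for the model call
        press(button)
        screen = screenshot()  # feedback after every single press
        history.append((button, screen))
    return history


def make_toy_game():
    """Hypothetical stand-in for an emulator: a player on a number line."""
    state = {"x": 0}

    def press(button: str) -> None:
        if button == "right":
            state["x"] += 1
        elif button == "left":
            state["x"] -= 1

    def screenshot() -> str:
        return f"x={state['x']}"

    return press, screenshot
```

For example, `run_agent(lambda screen, history: "right", *make_toy_game(), max_steps=3)` walks the toy player three steps to the right and returns the button and screen seen at each step.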
And so the first thing I did was actually just try other people's ideas, basically. There's this paper called Voyager that was the first thing I tried. It was, I think, a 2023 paper from NVIDIA, which they used to play Minecraft. And that gives the model all sorts of, I won't dive into it, but a lot of fancy tools and stuff. And I've sort of stripped back to my own simple version of it from there. So I guess what you start with is sort of the world's simplest version. All you can do is press buttons and get a screenshot. And the first wall you run into is what happens when you hit 200,000 tokens, basically, which is the limit for Claude. Practically, something like pressing 50 buttons and getting a screenshot back each time fills it up. So you press 50 buttons and you have barely started the game in Pokemon. And so you run into this wall pretty quickly where, if you do nothing, you just run out of space and then you crash and you're done. So one of the first things you need to figure out is what to do when you run out of space. Right. There are two key insights in how I went about this, at least, and I think they're actually pretty well reflected in how the industry thinks about agents these days. The first is some concept of long-term memory. I give Claude the ability to store a memory in what I call a knowledge base, where it basically says, I just did this thing, or I have these Pokemon, or here are my goals and I've checked off six of them. It remembers whatever it is, and you'll see it making incremental updates, and that persists throughout time. It can always look at that and keep track of something. The second key is that once I fill up the context length, I have Claude summarize back down, to collapse the 50 actions it just took into one short summary.
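The two ideas just described, a persistent knowledge base plus summarizing when the context fills up, might look roughly like this. It is a simplified sketch: `AgentMemory` is a hypothetical name, the token count is a crude word count rather than a real tokenizer, and `summarize` stands in for asking the model to write its own summary.

```python
from typing import Callable, Dict, List


class AgentMemory:
    """Long-term notes that persist, plus recent messages that get summarized."""

    def __init__(self, max_tokens: int, summarize: Callable[[List[str]], str]):
        self.max_tokens = max_tokens
        self.summarize = summarize  # stand-in for a model-written summary
        self.knowledge_base: Dict[str, str] = {}  # survives every reset
        self.messages: List[str] = []  # recent actions and observations
        self.summaries = 0

    def remember(self, key: str, value: str) -> None:
        """Incrementally update a long-term note, e.g. goals or current team."""
        self.knowledge_base[key] = value

    def add_message(self, text: str) -> None:
        self.messages.append(text)
        if self._estimated_tokens() > self.max_tokens:
            self._compact()

    def _estimated_tokens(self) -> int:
        # Assumption: one word is roughly one token. A real harness would
        # use the provider's token counts instead.
        return sum(len(m.split()) for m in self.messages)

    def _compact(self) -> None:
        # Replace the whole recent history with one short summary.
        self.messages = [self.summarize(self.messages)]
        self.summaries += 1

    def build_context(self) -> str:
        notes = "\n".join(f"{k}: {v}" for k, v in self.knowledge_base.items())
        recent = "\n".join(self.messages)
        return f"KNOWLEDGE BASE\n{notes}\nRECENT\n{recent}"
```

After a compaction the recent history shrinks to a single summary, while everything written with `remember` is still present in `build_context()`, which is what lets the agent pick up where it left off.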
I see. So you delete a whole bunch of stuff and reset. But it has this set of long-term memories that give it some ability to remember things over the course of, potentially, I mean, if you look at the Twitch stream now, it's been running for three weeks continuously. It's probably summarized itself a few thousand times by now, and so it's really important for it to have some sort of long-term memory to remember the last three weeks of what it's done. So the long-term memory there, it writing out to this external knowledge base, was basically just a plain text file? Yeah. It's kind of like the movie Memento, where they're writing sticky notes on the wall and everything to keep track: all right, I've been here, I've done this. Yeah. That's exactly right. If you were to just naively delete the messages, Claude would have amnesia and be like, how did I get here? Why am I partway through Pokemon? You told me I'm just starting a new game. What's going on? In some sense these are the Post-it notes that Claude leaves itself, so that when it has the reset and you remove a lot of what it has recently seen, it has some way to remember: all right, I know what I've done so far, I have a sense of where I'm at. Maybe I've even learned some lessons. Maybe I've learned that this strategy works really well. I've learned some stuff that I want to keep around, because for the next 10 minutes that I have, it's going to be really helpful. That makes sense. And now I think it's probably pretty important to clarify here, or maybe even explain further: did we train Claude on this? How does it actually know how to navigate around Pokemon? Yeah. I mean, you're not really giving it that many instructions. You're not giving it a how-to guide. No, this is part of the fun. We haven't trained Claude on Pokemon itself at all. Yeah. Claude obviously knows something about Pokemon.
If you go to claude.ai and ask it questions, you'll notice that it can recall some general facts. There's enough in pop culture that of course Claude knows something about Pokemon. Through the pre-training process. Exactly. Yeah. And it even knows some of the broad-strokes information about the game. It knows that the first gym leader is Brock, so it has some sense of the structure of the game. But the details, it really knows nothing about. In fact, sometimes it thinks it knows things and it doesn't. This is one of the hard things you have to fight with. But that means that even without specific training, knowing a little bit about what Pokemon is and some of the motivation of it, it has to figure out everything else in the intermediate space. And some of the funny things you find out when you see how this actually works in practice: you'll see it talk to an NPC that gives it some tidbit of information. At the very beginning, it's told that Professor Oak is next door and you need to go talk to him. That's what Mom tells the Pokemon character. So Claude's mom in Pokemon tells Claude, you need to go next door. And then Claude will very rigorously be like, I need to go next door. I need to go find Professor Oak next door. And it will be on a mission. And occasionally it'll get over it. The funny fact about Pokemon Red is that it actually lies to you. Mom lies to you. Professor Oak is actually not next door. You have to go find him elsewhere. I've actually seen Claude get derailed on this, because it's like, my mom told me. It's so trusting. I mean, I have to believe Mom. That's my mom. She told me this is true. And it will get stuck there, really looking next door for Professor Oak, for a very long period of time, funnily enough.
You really do see, over the course of it playing the game, that by interacting and just seeing the screen, seeing what trainers say, what signs say, just its experience of what happens when it tries stuff, it actually picks up the tidbits that are the core of where it's going, what it should do next, that kind of thing. So this is a pure playing experience. We didn't give it some pathway to follow. No. Or tell it, if you run into this situation, do this sort of thing. It's just really interacting with the game as a human would, in that sense. Yeah. And maybe just on the why-am-I-doing-this level: my goal is not to beat Pokemon Red. I did that when I was six. This is not the gold standard. And I'm pretty sure you can write a program to beat Pokemon Red if that's your only goal. I wanted to understand how Claude works. How is Claude at this? Can Claude handle this situation? And to that end, it would be no fun to just give it the answer guide. I want to find out how Claude figures out the answer. Yeah. And so most of how I've structured everything, and most of what you actually see when you do this, is pretty bare-bones. Claude has to go figure this out on its own, because I'm curious, and we're curious, how Claude even tries doing this thing. How does it try to play Pokemon? Right. We're not trying to get Claude to beat Pokemon. We just want to generally evaluate Claude on agentic tasks and figure out where it's at. Yeah. And I don't think anybody's making their buying decision for a model based on which model plays Pokemon the best. So this is really for our own understanding. Yeah. So you've been doing this now for a good bit of time. Too long. Yeah. What's it been like over that time with the different models as we've continued to iterate on Claude? Can you walk us through what that journey has been like? So I mentioned I started with 3.5 Sonnet.
That was the first model I hooked up, and it wasn't very good at Pokemon. In Pokemon you start on the second floor of your house, and the first bit of progress is finding the stairs in the top right. And I probably spent three days working on this before I got the model to find the stairs in the first room. I remember just being incredibly hyped the first time it got out of the house and then actually got to the cutscene where you can potentially get your first Pokemon. That was the pinnacle of my achievement with 3.5 Sonnet, and I thought I had really done amazing things back then when I had that happening, and that was better than any of the experiments we'd seen in the past. And then I kind of tucked it away: this has been a fun project, I learned some stuff, but that's about it. And then we released the refresh of 3.5 Sonnet in October, and I picked this back up, and you could tell it was a little better, pretty noticeably. It very consistently found the stairs, amazingly. It pretty consistently, within a relatively predictable time frame, would figure out how to get a starter Pokemon. We actually saw it win a battle for the first time and start moving a little bit in the right direction after that. It was really slow. It made a lot of dumb mistakes, but you could squint and see that it had clearly gotten better. A lot of that was just not getting stuck, not thinking the game was bugged, having a sense of different strategies and things to try. We were actually kind of excited about that. Again, wherever Claude is, you're just so excited when it gets one step further that it's very fun to watch and engage with. But still, I remember someone asked me back then, how do random button presses do in comparison? And Claude was about this much better than pressing random buttons.
So it was better, but not a lot better than pressing random buttons. And so again I tucked it away. It had been fun. And then we got to testing 3.7 on it, and it was just way better. And it was pretty obvious. One of the first moments I really realized 3.7 Sonnet was way better, I was going through watching it play, and I realized there was this terrible bug in my code where I wasn't showing Claude all of the information it needed to play the game. I had this thing at the time that was showing it a map to try to give it some extra sense of how to navigate. And 3.7 was already doing way better than 3.6, the new 3.5, did. And I was like, oh, this might be pretty real. And so I pretty quickly became deeply obsessed with, we've got to find out how good this actually is. And I started grinding harder than ever on giving it all of the tools that Claude really needed to be successful, because you could see it had the core material it needed to make progress. Over the handful of weeks that we tested it, it was pretty clear that Claude was quite capable. Not a star yet, we can get into that I'm sure, but it started actually playing the game. It beat a gym leader one day and everybody freaked out. That's the same thing people can see on the stream now, where it's slow and challenging sometimes but it makes meaningful progress in the game. That's fascinating. What do you think that actually shows about the model's improvements themselves? Yeah. Why is it that it's actually getting better at the game? This is one of the fun things: I actually kind of know a little bit about the models because of this. I think there are a few things that have made a difference over time. Surprisingly, vision, which is the hardest thing, since when you watch the game you notice that Claude's not very good at understanding Game Boy screens.
That actually hasn't improved a lot. The model has made all this progress, and its ability to actually stare at a Game Boy screen and see what's going on is about as bad as it always has been. Then it's like, how is it actually making progress if it doesn't have better fundamentals on what's going on? I think the thing that I've noticed the most, and it actually tracks a lot with what we focus on at Anthropic, is that Claude is just getting a lot better at coming up with strategies of things it should try, at questioning the previous strategies it had, and thinking, maybe the mistake isn't that there's a bug in the game. It's that I had a bad strategy. What is the other strategy I could try? If the last thing I tried didn't work, what should I try next? And backtracking to figure out good, different things to try. There's a certain tenacity to trying all of the different ways you could approach a problem and figuring it out. I think that was the jump that got us from 3.5 to the refresh in October. That was a huge jump in that exact skill. With 3.7, it's now just way more willing. Even though it's slow, and I think if you watch the Twitch stream for two hours, you might think there's no way this thing is good at questioning itself, the amazing thing is that over time it does typically step back and think, what's the next thing I should do? That ability to triage and understand different ways to go about solving a problem, getting better at that is one of the biggest things that I think has improved with our models over time. It's one of the biggest things that enables agents for all sorts of things, not just Pokemon, to be really successful. It's the thing that's made it pretty okay at playing Pokemon. Given that it's improved on all these capabilities and it's progressing through the game, how does that actually extrapolate out into other areas where we witness that improvement, with maybe more real-life use cases? Yeah. It's funny.
At face value, Pokemon couldn't be more different from writing code or all the other stuff that people do with Claude. But this core thing, the ability to come up with a plan that's good, to try something, to see if it's working or not and adjust, to understand what the different strategies available to me are and to be willing to try them, fail, and update what you should be doing with new information: that is the core recipe of, I think, a lot of what makes agents good in a lot of scenarios. One that we probably spend too much time talking about is coding, but when you are writing code, you write something, you see a test fail, and you have to think, what did I do wrong? How do I do better? What should I try next? What's the next strategy? And models do this all the time. It's one thing to have a model that can get it perfectly right every time, all the time. But sometimes you just don't even have the information you need. You don't know until you run the test that you missed something. And so the ability to know what test I should run to learn, and when I find out something, how I should incorporate that, how I should update the strategy for how to go write this piece of code or whatever, that's the same thing. And I think that's actually true in any industry. When you think about how you go about it, if I were to go search the internet for something, I click on something, I notice the page is bad. It doesn't have the fact I need. Or maybe it has part of it, but I realize because of that, I need to go search for something else to get it right. All of that, building a strategy and understanding how to incorporate new information over potentially a lot of different actions, just has really broad-reaching applications, I think, for how people build things with Claude. And it just maps really intuitively to how we as people think about solving complex problems.
So that planning and then executing loop: taking an action, stepping back, re-evaluating, and taking another action. Yeah. I don't think a lot of times we, as people, think about that at a really granular scale. When I'm doing a meaningful thing in my job, I don't necessarily think, I need to re-evaluate my action and try something different. But that's still what it compiles down to. What's happening with me is I'm getting a piece of information, I'm figuring out what the next step I need to take is, and I have a sort of continuous feedback loop. And a lot of that is what's going to make Claude a useful colleague, assistant, whatever it might be going forward. Yeah, I mean, I guess I can see it in myself sometimes too. I have a to-do list. And maybe I wrote it at the start of the day. Yeah. And I go through some meetings. And then all of a sudden I get new information or I take an action. I talk to somebody. And now I have to return to my to-do list. Yeah. Reprioritize, move things around. And it's that same sort of loop that Claude's learning here as well. Yeah, that's exactly right. And you can just notice it getting much better at that. Yeah. And maybe to make it concrete, a thing that would happen in the past is it would write out its to-do list and it would get hyperfixated on it, maybe, right? It wouldn't be able to successfully incorporate the new thing it learned. Right. A good example, which you still see sometimes, but a lot less, is Claude's like, I need to go to the top left in Pokemon. And then it'll just walk into a wall for hours on end. And if the model is really fixated on walking to the top left, it's like, yeah, I'll just keep walking up until I get there. But if it's walking into a wall, it eventually needs to step back and be like, maybe I need to be doing something else. Yes. And that's the kind of thing that happens in Pokemon.
But yeah, it's super translatable to a lot of the other things we do. Right. That's a good parlay into this next question I had: what are the funny moments we've been seeing? I mean, it's not perfect quite yet. It's gotten a lot better over the past eight or so months, but we haven't beaten Pokemon. Yeah. There have definitely been some funny times along the way. Oh yeah. Any stories you could share there? Yeah. Claude is not perfect at this yet. I have a laundry list of the things that I think Claude still needs to get better at. I'll start by harping on some of the things that Claude doesn't quite do well yet and that are interesting, and they tend to make pretty funny moments. Yeah. One of my favorites was related to its visual acuity. It doesn't see the screen very well. So I was watching it play one time and I went to bed, and it had walked into a building, and it thought that a doormat in the building was a dialogue box. And you have to press a button to dismiss dialogue. And I woke up the next day and it had spent eight hours pressing that button trying to dismiss the dialogue box, not realizing what was going on. It just kept thinking it was cycling through the dialogue and had to advance. And so there are a couple of things that are a little bit interesting to unpack there. One is that making a mistake where you become really confident something is a dialogue box is a pretty bad mistake. If you have a core understanding issue, that's pretty bad. The other is time. Pressing a button 15,000 times doesn't really mean much to Claude. To me, if I were spending eight hours pressing buttons, I'd be like, I'm pretty tired of pressing; my thumb hurts. For Claude, 15,000 button presses, who cares, keep going. It doesn't know how much time has even elapsed. Yeah.
And so there's some intuition space around the ability to comprehend how long is too long. What is time? That kind of thing, which I think is a bit funny and needs some work. So that's one general category, seeing. Another fun story I have is more on the planning and strategy side. I was watching it play one time and it got into Mount Moon, which, for the people who watch the stream, is a place where it really likes to get stuck for a long period of time. And it had one Pokemon at the time in this run. And it had the option to learn a new move that was a non-attacking move. And you really need attacking moves; if you don't have an attacking move, you can't beat other Pokemon. And it wanted to learn that move, but it was just very excited. And so it pressed A a bunch of times to clear the dialogue to get to that point. And it accidentally pressed the button too many times and deleted its only attacking move. Oh no. And so now it's stuck here in this part of the game with no attacking moves, essentially nothing to do, no way to progress. And that's a little bit about how, when there are destructive consequences, you sometimes need to understand that you have to go slow. Right. Something could go wrong if I press A 15 times without checking what happened in the middle. And a thing you'll notice in that is that sometimes its intuition is that it will be able to stop itself. If I want to press A 15 times, I'll just say press A 15 times, and then if I see something go wrong, I'll just stop. Because Claude doesn't quite have this intuition that, I'm a language model. Yeah. This tool is not actually giving me the affordance to stop. I'm not checking in the middle to see what's going on. Yeah.
And so there's a little bit of understanding, and I think about this as a self-awareness of its limitations, of the situation, that kind of thing, that Claude sometimes struggles with. And that's actually really important. One of the things I wish for the most with Claude is that it had a slightly better awareness of, hey, maybe I'm not good at seeing the screen. Maybe the fact that I'm walking into this wall over and over and over again means that I need to learn something at a meta level about my own capabilities, and I should think about a completely different strategy rather than walking into the wall. And so its ability to meta-learn about what it is, what its capabilities are, that kind of thing, I think still has a lot of room to get better. And then maybe one last thing, around frustration, one of my favorite stories. So again on Mount Moon, the model typically takes about two days to get through Mount Moon. It's amazing. It takes a long time on the stream. What makes Mount Moon so hard? Yeah. So one of the things Claude is worst at right now is navigation, wandering over a long period of time. It doesn't quite have a good understanding of spatial awareness in general, I guess I would say. And Mount Moon is the first place that really stresses that, where you need to navigate a pretty long maze of corridors to get to the right spot. And there are all sorts of little nooks and crannies to get stuck in. And you need to do a pretty nuanced traverse through a series of paths to get to the exit. And it just takes quite a long time to actually find all of the paths and get a sense of where it is and where it shouldn't go and that kind of thing.
So one time, in fact the first time it ever got there: when you go through Mount Moon, there's this last thing you do, which is you have to get a fossil in Mount Moon. And once you get the fossil, you're about 15 steps from the exit, from finally getting out the other side. And so the first time I was ever watching Claude in Mount Moon, it goes through about three days and gets the fossil. And for me, I'm like, this is it. It's finally happening. I didn't think this was ever going to happen. I thought this was hopeless. I was ready to write this off. We were going to publish the benchmark we had and this was going to be the end of it. That's fine. We got one badge, we got to Mount Moon. We're excited. And so I'm at peak hype. We're going to get out. This is it. We can keep going in the game, right? And then it proceeded to get lost. It's 15 steps away. It turns around and goes the other direction. It's lost. And then it uses this item called an Escape Rope, which teleports you back to the last place you rested, which is outside the beginning of the cave. So it spends three days navigating, is 10 steps from the exit, turns around, and completely nopes out of the situation. It goes to the beginning again. And I just had a meltdown, because it's the funniest thing you've ever seen. It's objectively hilarious content. But I was going to cry. I would love to read its transcript on that one too. It doesn't even know. It's just like, this is a disaster. I'm lost. Best case scenario, I can just get to the beginning and try again. It's like, yeah, we were so close. On the dialogue box piece, how does Claude actually break out of that? Yeah. Are there tactics, or is that an all-right-we-have-to-reset-the-game-and-start-over sort of thing? Yeah, this gets into the little details of how I've actually come to understand what it means to build good agents.
Because there's a train of thought that existed for a while of, build the biggest, most complicated system to try to patch every little weird quirk that a model has. And that's actually really hard to do. How I think about it now is I watch the model play. I give it a pretty simple, straightforward way to play the game, and I watch and I see what goes wrong. And there's no way to find out that it's going to get stuck for eight hours trying to dismiss a dialogue box other than waking up and seeing it stuck for eight hours on that. And then you can sort of build up, how do I give it the right information it needs to be able to break out of this? So one really simple thing that actually helps is just giving it a step count every time it's taking an action. So saying, hey, this is action 2,400, your next action will be 2,500, or whatever. And then you can also say, because one of your limitations is you don't have a great sense of time, you may just want to keep track of how long you've been trying to do something. And there's something you should learn from that fact: if you've been trying something for a really long time, maybe it's a good idea to reconsider. And that's actually enough in the case of Pokemon to get it to keep track of how long it's been pressing A. If it's been pressing A 10,000 times, at least, just by telling it to keep track of that, it has some hope of being able to realize, this is weird, I should get out of it. It's literally just thinking about the information that I need to give Claude to be successful. Claude doesn't have any innate sense of time. Every time you run it, it is completely new to Claude. And that's not true for humans. We have a great sense of time, like the sun goes up and down. It's very easy for us. Right.
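A minimal sketch of the step-count idea described above, assuming a hypothetical harness where each game observation reaches the model as text. The names `StepCounter`, `annotate`, `TIME_HINT`, and `build_system_prompt` are illustrative only and are not taken from the actual Claude Plays Pokemon code.

```python
from dataclasses import dataclass

# Hypothetical wording; the transcript only describes the intent of this hint.
TIME_HINT = (
    "You do not have a great sense of time. Keep track of how long you have "
    "been trying to do something. If you have been trying for a really long "
    "time, it may be a good idea to reconsider your approach."
)


@dataclass
class StepCounter:
    """Counts actions so every observation can carry an explicit step number."""

    count: int = 0

    def annotate(self, observation: str) -> str:
        """Prefix an observation with the current and next action numbers."""
        self.count += 1
        header = (
            f"This is action {self.count}. "
            f"Your next action will be {self.count + 1}."
        )
        return f"{header}\n{observation}"


def build_system_prompt(base_prompt: str) -> str:
    """Append the time-awareness hint to an existing system prompt."""
    return f"{base_prompt}\n\n{TIME_HINT}"


if __name__ == "__main__":
    counter = StepCounter()
    print(build_system_prompt("You are playing Pokemon Red."))
    print(counter.annotate("A dialogue box is open."))
    print(counter.annotate("A dialogue box is still open."))
```

The counter itself makes no decisions. It only puts the passage of time into the context, so the model can notice on its own that it has repeated the same action for a long stretch.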
And so this is something I've thought a lot about as I've gone on this, learning about what affordances Claude needs to be able to understand its situation better, and then just providing it with that set of information. And so a lot of the iteration I've done in the last few months on this is just: watch, see what it struggles with, and understand whether there is some piece of information that I can give Claude that will help it have more of the tools it needs to reason about the situation. And often that's the best way to start getting progress on this. Okay. Cool. And that feels very similar to, I guess, some of the general prompting guidance that we usually give customers, right, around like, hey, if you were to write a prompt and give it to somebody, and they knew nothing about the situation, they're in basically a box with no windows. Yep. And they had to do this task, would they be able to do it? Yep. And if you don't provide all that context. That's right. And this is just kind of the next level of that. That's right, for agents. There's a small danger with agents too, because the whole reason why you'd ever use an agent is you can't enumerate all of the situations it's going to get into. If I wanted to beat the first part of Pokemon, an agent isn't what I would do. I would give it a series of, here's what you need to do first, and here's what you need to do second, and here's what you need to do third, and things like that, if that was my only goal, right? And why you use an agent is you can't do that. It's these scenarios where you don't know what situation is going to be presented in front of you, and you really need to lean on the model to navigate and use its own intuitions for how to do it. And so there's a danger of getting too far down that rabbit hole of, I'm going to try to predict every single thing and write that all into a prompt. You'll see some Frankenstein prompts.
That happens if you try to anticipate every possible thing that the model could end up struggling with, which is why I think it's just important to be measured and read a lot. The thing I've learned probably the most is just: watch it, read what it says, see what it's struggling with, understand, and find out the most minimal ways that you can give it a little bit more context, rather than trying to work around every single detail. That makes sense. Maybe switching gears a little bit here. So we included Claude Plays Pokemon as part of our Claude 3.7 Sonnet launch. We had the benchmark with all the lines, how far each model got through the different gyms. And then we put out an article kind of explaining it. What was the reaction like from the general public and people more in the AI space? I guess maybe to tell a little bit of the backstory of what prompted it. As I've been hacking on this, we have a Slack channel, Claude Plays Pokemon, that I've been posting updates in. And this has had a history. At first there's a lot of people for whom it's just fun. It's the nostalgia hit. It's the same reason why someone would tune into the stream that we have. It's just fun to watch. It's exciting. It's exciting to see the model progress. This is kind of our baby, Claude. And so seeing it, you're just rooting for it and proud of it. And so that was the initial internal traction that had me excited about the project: people just had so much fun looking at it. And then there's this switch that flipped for 3.7 Sonnet, which was like, oh, we're learning something really interesting about this model. This is a thing we really wanted, which was for Claude to be able to build better plans and act better over longer periods of time.
And Pokemon's actually a pretty reasonable way for us to test this. And so suddenly I actually had researchers coming to me and saying, can we measure this? Can we look at this? What is actually going on here? And so there was this small breakthrough moment, like a week before we ended up launching the model, that was like, this might actually be one of the better ways we can tell the story of what this thing is that we were trying to do by making Claude better over longer time frames. And is this a good way for us to understand it and for us to tell the story? And that really snowballed into a lot of how I thought about it: (a) maybe this is actually a pretty crisp way for us to understand what 3.7 Sonnet is good at and how we should use it, and (b) this is probably an entertaining way to show it to the world too, right? In a pretty tight little sprint, we decided this is a thing we should put out in the world. We should make a graph that shows how good it is compared to other models. We should put together a Twitch stream and let people see and experience that same sort of fun and excitement we had seen, and also get the feel for what's actually going on here. We should talk about it in our research materials because it helps people understand. And that really was the inspiration for it. I think that just makes a great point: we wanted to be able to tell a story about how Claude was improving in this kind of dimension, and that's getting harder and harder to do as the models get better. And we're almost having to move into this regime of equipping the models with real-life things now instead of just artificial, forced test cases and benchmarks.
I know. Anytime you have a little tiny test case, it's pretty hard, because models start getting to 100% fast. And we just get excited anytime there's an evaluation that a model is getting 30% on, you know. It's getting a third of the way through the game, roughly. That's amazing. We know something about it: it's not good enough to do this, but it is good enough to do this. That's pretty good information. That was one of the interesting things. I never had that in my head as why I'm doing this, but I think this was the moment where it actually became very interesting to look at and understand. But then you want to talk about what happened afterwards. Yeah, so we launch it, we include it in the materials, we put out the blog post, we start up the Twitch stream, and then what happens next? It was a lot more popular than I expected. I guess I shouldn't be too surprised, the AI world is pretty exciting these days, but it has had a remarkably large set of people who were excited to watch and experience and get through the same thing as me. For the first two weeks there were thousands of people at any given point in the day, 24/7, tuned in watching. Which is amazing. There's a subreddit that was started, people are making memes and fan art, and I saw a song the other day that someone made about it, which is amazing. So I guess at the most grassroots level it's had this very fun community. One of the most amazing things to me about it is, I don't know, I am skeptical of online chat rooms, but the chat on it has actually been really positive and fun, and people are excited for Claude, talking about AI, talking about agents, engaging with this thing. It's been kind of amazing how positive and fun it has been to have a community around that.
And then the other side that I've found really, I guess, rewarding is I think this has just been a way that people actually can see and understand what an agent is for the first time. The AI world has talked so much about agents in the last year, and I think when you're a person who writes code or uses coding agents, that might ground pretty easily in what we're talking about. But for a lot of the world, I think it's pretty hard to really understand what that means. And I think the example of Pokemon, where it's like, oh, it's not just a chat bot where I type in a chat and I get a response back, but it's doing this on its own and seeing and trying things and taking actions, all of that is a way that I think more people have been able to latch onto what this agents thing is that we're talking about. And I think that's great. I think it hopefully can bring more people into the dialogue of what are we building here? What are the possibilities of AI? How can I think about how this is going to impact me? How can I use this to the most impact? How can I accomplish more and more by not just thinking about it as a chat bot where I type in a question and get an answer, but as a collaborator that I can ask to go do something that's complicated, that takes time, and it can actually do it? And I don't think most people are going to go ask Claude to play Pokemon and see what happens. But I do think it's been a way where it resonates a little bit more than what some people have been able to latch onto in the past, which has been maybe my favorite thing out of it. Yeah. I love that. I think it's so true. I've had so many people that aren't necessarily in the AI space reach out to me and ask, what is this whole thing? Why is Claude playing Pokemon? And when they start to dig into it more, it's like, oh, okay, I kind of understand now where these things are headed, what they're able to do.
It gives that more visceral feeling about where models are currently at. And the Twitch chat, absolutely, it's probably one of my favorite things of any of our launches. Just to see all these random strangers from across the world cheering for Claude and making memes out of Claude when he fails in Mount Moon. Yeah. We launched the stream on the day after we launched the model. And then it was the following Saturday that it finally started going on the path to get out of Mount Moon for the first time. And it spent three days there. Yeah. And man, the chat was electric. It was blowing up. Crazy. Yeah. Going crazy. I was sitting there on my couch next to my wife, who I was ignoring. I'm so sorry to her. And just glowing as people are having so much fun cheering for Claude and rooting it on and getting so hype. And yeah, I could never have imagined how much fun having an army of people cheering for Claude would have been. Yeah. Before we put it out there. I think we need some way Claude can interact in the next iteration. I know. Claude deserves to know how many people love it. Yeah. And it should be able to talk back with the chat and make callouts and everything, and maybe go full Twitch streamer. Does Claude have a favorite Pokemon? Ah. So Claude is very tactical, very pragmatic. So there are a few things I'll say. As a starter, Claude really likes to go for Bulbasaur. It doesn't always succeed. Sometimes it gets lost trying to find it. But it's going for Bulbasaur because it has a type advantage in the first two gyms. Really good strategic choice, so it can be successful with the strategy. Very rational Claude at the beginning. That said, there are a few Pokemon that it always seeks out in a run that are the rare ones. So it loves catching Pikachu. If it ever sees a Pikachu, it gets really obsessed with catching it. It also digs a Clefairy in Mount Moon. It really likes to go for those. So it likes the rare ones. Okay.
When it sees something that it knows is rare, it goes right after that. Yeah, that's like the same strategy as my eight-year-old self. Yeah, yeah, yeah. Pokemon. We're on the same page there, Claude. Last question. Do you have any advice for folks out there that might have watched Claude Plays Pokemon, or who are just getting started building on top of Claude, for how they can start thinking about building their own sets of agents, and any takeaways just generally from this whole experience? The biggest thing for me, and I actually think this is advice across all of adopting AI, and I've given this advice before in different contexts, is: start by doing something you love. Something that's fun. This is not related to AI at all, but I think the difference between people who crush it and figure out how to adopt AI and those who don't is just a certain amount of time spent coming to some understanding of what models are good at, what they're bad at, what can I trust it with, how do I really gain trust in this model? And you get that by starting with something that you are excited about, that's fun, that you're going to want to boot up at 7 pm after a day of work and actually go hack on. That was the thing that made Pokemon so magical for me. I would get done with work and it was the first thing I was excited to do, you know, and it meant that I had so much space to really get to learn and know this model. And there's all sorts of technical details I can tell you about how to build agents that I've learned, but more than anything you learn by interacting with and experiencing Claude. And finding the way that's going to best set you up to be excited to spend six hours with Claude, or whatever it is in a week, is the thing that I think is going to get you into it, because once you've done it once, it's much easier to reason about how would I go build an agent for something else.
And it's also, for all the reasons we talked about, translational. The things that Claude is good at in Pokemon actually tell me something about what I can expect from different things in how I use Claude, right, and what I need to look for to know if it's good enough at this. How to think about finding out if Claude can handle this part of my job that I want to automate is the same way I went about figuring out, can Claude figure out how to get out of Mount Moon or not, you know. And so, given that experience and intuition, my biggest piece of advice is just to find something to have fun with and build a relationship with Claude, and that will carry you so much more than any individual prompting tip or something like that will. I love that. That's great. Well, thanks David, this is awesome. If you want to follow along with Claude Plays Pokemon, we'll drop a link to the Twitch stream below. I expect that we'll be continuing to run Claude and future versions of Claude on Pokemon going forward. And thank you for watching.

Lessons on AI agents from Claude Plays Pokemon

TL;DR

Takeaways

Vocabulary

Transcript