- Advanced Large Language Models (LLMs) like Claude Sonnet have enabled a "step function" improvement in AI coding assistants, moving beyond simple completion to sophisticated multi-file edits and agentic workflows.
- Cursor leverages a unique "self-improving recursive feedback loop" by using its own AI-powered product for internal development, accelerating feature iteration and problem-solving.
- The future of software development will see AI involved in nearly all code generation, but human developers' taste, guidance, and expertise in code review and verification will become even more critical.
How Cursor is building the future of AI coding with Claude
- The progression of LLM capabilities, particularly with models like Claude 3.5 Sonnet and Sonnet 4, significantly advanced AI's ability to reason, perform multi-file edits, and act as "agents" within a codebase.
- Cursor's internal development philosophy involves solving its own problems with its product, allowing for rapid iteration and discarding ineffective features before public release.
- AI coding assistance operates on a spectrum: `tab completion` for familiar code, `command k` for single-file edits, `agent` for multi-file changes, and `background agent` for complex tasks like entire pull requests (PRs).
- The `background agent` feature spawns independent virtual environments where AI can iterate on tasks asynchronously (e.g., fixing failing tests), requiring seamless switching between background and foreground for developer intervention.
- A critical future bottleneck for widespread AI-generated code is the `verification` and `review` process, ensuring that AI-produced changes are not just correct but also align with human intent and architectural taste.
- Tackling large, complex codebases requires advanced `retrieval models` to provide relevant context, alongside leveraging `memory` and `long context windows` for LLMs to better understand codebase intricacies.
- Code itself is evolving to be more "LLM-friendly," with API designs and code structures simplifying interactions, though core principles of clean and well-structured code remain universally important.
- AI coding tools serve as powerful educational aids, accelerating developers' learning curves by enabling faster iteration and making it easier to experiment and understand complex concepts.
LLM — Large Language Model; an artificial intelligence model trained on vast amounts of text data to understand and generate human-like language.
Agentic Systems — AI systems designed to take multiple, sequential actions in an environment to achieve a goal, often involving reasoning and tool use.
Retrieval Models — Models used to fetch relevant information (e.g., code snippets, documentation) from a large corpus to provide context to an LLM.
Tab Completion — A feature in code editors that predicts and suggests the next few characters or lines of code as the user types, often powered by AI.
Multi-file Edits — The ability of an AI coding assistant to make coordinated changes across several different files in a codebase to implement a feature or fix.
Tool Use — The capability of an LLM to interact with external tools or APIs (e.g., compilers, debuggers, search engines) to perform tasks beyond text generation.
Command K — A common keyboard shortcut in AI coding assistants (like Cursor) to invoke an AI agent for in-line edits or actions on selected code.
Background Agent — An AI agent that operates asynchronously in a separate environment (e.g., a virtual machine) to perform complex, long-running coding tasks like generating a full PR.
Verification of Software — The process of ensuring that a piece of software meets its specifications and functions correctly, especially critical for AI-generated code.
Code Review — The systematic examination of computer source code by human developers to find mistakes, improve quality, and ensure adherence to standards.
Reward Hacking — A phenomenon in AI training where an agent finds unintended shortcuts to maximize its reward signal, rather than achieving the desired behavior.
Long Context — The ability of an LLM to process and retain information over very long input sequences (e.g., thousands or millions of tokens), crucial for understanding large codebases.
I think like every facet of producing software, I think will be kind of changed to use AI in some way. Very excited to have you guys out today. Looking forward to this conversation for a while. As you know, I'm Alex. I lead our Claude relations here at Anthropic. I'm Lucas. I work on agentic systems at Cursor. I'm Aman. I'm one of the founders and I work on ML and retrieval at Cursor. My name's Jacob Jackson. I work on ML at Cursor. Very, very excited for this conversation and to talk a little bit about Cursor, what you guys are building and also how you're using Claude. It's been a big year for Cursor. Pretty obvious to anyone that's been following along the AI industry. You guys have scaled now to over $300 million in revenue in just over a year. Pretty crazy. Millions of developers are now using Cursor. What's changed in your opinion? And how is the version of Cursor today different than it was a year ago? Yeah. I think there are a few big things that have changed. I mean, there's always been this massive overhang in, given the current level of the language models, how much you can do with them. And I think Cursor was probably one of the first companies, at least in coding, to be able to close that gap a bit with a number of different features. And then in turn, you've also seen these models get much, much better at coding. And I think 3.5 Sonnet was like the first clear example of this kind of step function better in programming. And so before then, Cursor was really useful with things like tab completion, predicting your next edit. And that alone was growing fairly quickly. And then editing within single files. But we did see that when you kind of mix the intelligence of a model like 3.5 Sonnet with a few other kind of custom models we use for retrieval and then applying the edits made by this larger model, you now have the ability to do kind of multi-file edits.
And I think that was kind of the step function that resulted in a massive adoption of Cursor. And since then, it's been a mix of the models getting better and us trying to, under the hood, get better and better with how far we can push these models. And was that a natural progression? Something that kind of just arose? Or did you guys notice, when 3.5 Sonnet, that first one, came out, that, holy cow, now we can all of a sudden do all these different things that weren't possible before? What did that kind of look like? It did feel somewhat gradual. Like there are these steps in model quality. But you saw hints of it with the prior state-of-the-art model. In fact, we've been notoriously bad at taste testing these models just because the way we use them is very different than when you put it out into the world and see how others use it. But there were just hints over time: each kind of new model that came out was better and better at being able to reason, do more agentic types of coding. And then it's a lot of tinkering and trying lots of things, seeing it work, seeing what fails. Yeah, I think Sonnet was probably the first one where we were able to make the multi-file kind of interaction really work well. And since then there's been a number of step functions, including like tool use, right? And then you can actually have these models act like real agents within the editor. I see. So the progression of the new models, new capabilities over time kind of allows for further tinkering, exploring, which then rolls back into your product to some degree and allows you to build new features. Yeah. That's interesting and kind of parlays into this next question I want to hit at. Which is, I've heard many stories of how your team is using Cursor to build Cursor. It's in this like self-improving recursive feedback loop. First off, maybe you can dive into a little bit of how that looks and, on a day-to-day, what does that look like within Cursor?
You guys are working on building new features. Yeah, I think it very much depends on the individual, like, use cases for each employee. And I think it also very much depends on what part of the product you might be working on and what kind of stage that part is in. So I think for like initially laying out some code base, some new feature, it's very, very useful to just use the agent feature to kind of get that started. And then to maybe use the thinking models to, like, look at individual bugs that you might be facing. And then for making very precise edits, I think that's a lot of Tab also. And then when initially getting started with a code base that one might not be too knowledgeable about, using kind of the Q&A features a lot, using a lot of search. And I think that's also something that Claude 3.7 and 3.5 have been excelling at: doing research in a code base and figuring out how certain things interact with each other. I see. So using these features to explore your code bases makes the process easier. And then you learn as you're using these features that, oh, there's a deficiency in this area, we should go work on that. Yeah, I think Cursor is very much driven by kind of solving our own problems and kind of figuring out where we struggle solving problems and making Cursor better. And then figuring out what we can do there and then experimenting a lot. We very much have this philosophy of like, everybody can just try things and try adding new features to the product and then see internally how they are used and what kind of feedback they gather. Do you think, maybe on a more meta level, there's an advantage to being your own best customer internally? I think 100%. I think that's how we're able to move really quickly in building new features.
And then throwing away things that clearly don't work, because we can be really honest with ourselves about whether we find it useful and then not have to ship it out to users, track how people use it before deciding to go ahead with the feature. I think it just speeds up the iteration loop for building features. Yeah, going back to overall how we use AI to program, it feels like, I mean, there's a lot of diversity within the company in how different people use it. I think it differs first in the kind of work you're doing. So there are a number of people that will, for example, be working in pieces of the code base they're really familiar with. And at that point, when you have it all in your head, it's often faster for you to kind of convey intent just by typing, and for that, Tab is really useful; it kind of speeds you up there. But then when you're in places where you're less familiar or you need to write out a lot of code, you can kind of offload a lot of that and often some of the reasoning to these models. And then as you get to places where you're really unfamiliar, in the way which Lucas is describing, and you're kind of coming into a new code base, there's just this massive step function that you get from using these models. And what we kind of see is over time, as the models get better and as Cursor gets better at using these models, you do a better and better job of when you're more in flow and when you have more knowledge of the code base. So there's a variation in when a feature is most applicable to like your use case, and it's almost a spectrum to some degree. Yeah, like the spectrum on one end is Tab, for when you're completely in control and you know what you're doing. Then it goes to Command K, where you're editing a single given region, maybe a whole file, and then at the other end you have agent, which is quite good for, you know, editing multiple files. And then at the very end, you kind of have this background agent which we've been working on.
And that can be useful for basically doing entire PRs. You guys just released a preview of background agent. What is background agent? I think it's clear that the models are getting better and better at doing end-to-end tasks, but they're not quite at 100%. And I think it'll take a while to get to 100%. So the way you speed up developers, right, is you let them do these things in parallel. But as opposed to kind of letting it just go in the background and spin up a PR that you look at in GitHub, if it's only 90% of the way there, you want to go in and then take control and do the rest of it. And then you want to use, you know, the features of Cursor in order to do that. So really being able to quickly move between the background and the foreground is really important. And I think like, you know, we're in the early end of this feature. And I can imagine that there are lots of interesting ways of being able to operate, for example, on three or four changes at the same time and then quickly kind of popping them to the background and then moving them to the foreground. It'll be interesting to see how this changes how people use Cursor and just develop software in general. We see background agents basically as a new primitive that we can use in so many different places, and the current way of exposing it is quite straightforward, where you can just give it a prompt and push it to the background and then it independently iterates on that. But there can be many more integrations for how these things can be spawned off. And I think there's a lot that one can make from that. So is this taking your code base and putting it in a virtual machine, or what exactly is that transfer that's happening? Exactly. Yeah.
We spawn off independent environments that have all the developer environment utilities already installed, and then the agent can use those, and it has all the various code extensions that are available, and through that it can get linter errors, etc. I know we're kind of witnessing this trend of asynchronous tasks, background tasks across many different things, from coding to like research. In your view, what does that look like as this progresses to where we might have thousands of these agents potentially going off and you can see like whole teams of agents attacking a problem all in the background? What does that future look like? I think the next bottleneck you'll run into is verification of software, verification of code. Models are getting really, really good at generating, writing lots of code. But let's say developers spend, to throw out some random numbers, 30% of their time reviewing code, 70% of their time writing code. If you completely solve writing code, you still haven't really sped up software engineering by more than a factor of three. Yeah. So I think we're going to need to figure out how to make it easier for people to review code, how to be confident that the agent's making the changes that are not just correct. Correct can be vague, right? It may just be that the thing you specified was underspecified enough that it actually did the best that was possible for even the best human programmers to do. But was it actually what you had in your mind's eye? And so making the process of review much, much better I think will be really, really important. And it's something that we're really interested in as well. Any early ideas there on what that looks like? I think there are a few floating around from various people at the company. One that Michael, our CEO, really, really likes is the idea of operating at a different representation of the code base. So maybe it looks like pseudocode.
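The writing-versus-reviewing estimate above is an instance of Amdahl's law: if only part of the work gets faster, the untouched part caps the overall gain. A minimal sketch, assuming the speaker's illustrative split (70% of time writing, 30% reviewing, explicitly made-up numbers); `overall_speedup` is a hypothetical helper name, not anything from Cursor:

```python
import math


def overall_speedup(fraction_accelerated: float, factor: float = math.inf) -> float:
    """Amdahl's law: total speedup when `fraction_accelerated` of the work
    becomes `factor` times faster and the rest is unchanged."""
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    remaining = 1.0 - fraction_accelerated          # e.g. time spent reviewing
    accelerated = 0.0 if math.isinf(factor) else fraction_accelerated / factor
    new_total = remaining + accelerated
    return math.inf if new_total == 0 else 1.0 / new_total


if __name__ == "__main__":
    # Writing code (assumed 70% of the time) becomes free; review is untouched.
    print(f"writing fully solved: {overall_speedup(0.7):.2f}x")
    # Writing becomes 10x faster instead of free.
    print(f"writing 10x faster:   {overall_speedup(0.7, 10):.2f}x")
```

Under these assumed numbers the ceiling is 1 / 0.3, a little over three, which is why the conversation turns to making review itself faster.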
And if you can represent changes in this really concise way and you have guarantees that it maps cleanly onto the actual changes made in the real software, that just shortens the time of verification by a ton. But that's one possible route. I think the reason why quote-unquote vibe coding often works is because the process of verification is really easy, since all it is is just kind of playing with the software. You make a change and you actually play with whatever software you've built. I think it's just going to be really hard to do for real production code bases. And cracking that problem is really important. That's a good question around the difference between like a standalone thing someone might be vibe coding versus a production code base that has millions and millions of lines of files. How do you guys see the difference between those two in your mind? Where are we at in terms of working within them with current models? I think it's something we've thought about a lot with background agent, because something that's really simple and obviously should be very easy with these models is: I have this test here, the test is currently failing, can you fix the code so that it passes? And it's like, okay, how do we make that happen? Well, the model needs to be able to run the test. And if you have a very simple repository, that's very simple. But when you start getting to these larger enterprise code bases, it can be complex to get the dependencies set up properly so that the model can run the test. So this is something we've thought about with background agent a lot: how do you make this process straightforward for the developer to create this environment where the agent can run the tests, and then make it repeatable, so you can snapshot it and you can quickly update it when your code state changes.
And this unlocks the ability to spin off a VM in the background, have the model make experiments, and some of them will make it pass and some of them won't. And then eventually you as a developer only have to worry about the case where it succeeded. And there's just a lot of infrastructure there and a lot of user experience that is important to get right. And then I think there are other fundamental problems. So one way is you get the model to try to pass the test. That's how you can kind of guarantee maybe some sort of correctness. But with these large code bases, you're often dealing with things that almost look like their own language, where they have these kind of DSLs within some languages and everything is done in this particular way. And it's really sprawled out across millions of files, which is hundreds of millions of tokens potentially, maybe more. We've done a number of things to make this much better, which include training retrieval models and then integrating other sources of context as well. For example, you can imagine there's a lot of richness in the recent changes that you've made when editing your code. It kind of indicates what you're working towards. There could be richness in the changes that other people on your team have made in your code base, especially recently, and using those as hints. But I do think it's still this really hard fundamental problem: just giving the model access to really good retrieval feels insufficient for having the model really understand the code base. I think it's a problem we're really interested in solving. Probably through some combination of memory plus long context and other things. I think memory is one interesting approach people have taken to get the model to kind of learn from your usage of it. But it also feels like it's a small boost in performance. And it feels fairly primitive relative to where we need to be in order to get things that are excellent at large code bases.
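To make the idea of retrieval plus "recent changes as hints" concrete, here is a deliberately simple sketch. It is not how Cursor's trained retrieval models work; it swaps in bag-of-words cosine similarity and adds a fixed score boost for recently edited files. The function names, the `recency_boost` value, and the example file paths are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_chunks(query, chunks, recently_edited=frozenset(), recency_boost=0.2, top_k=2):
    """Return the top_k chunk paths by similarity to the query.

    Files in `recently_edited` get a flat bonus (a hypothetical stand-in for
    the "recent changes hint at what you are working towards" signal).
    """
    q = Counter(tokenize(query))
    scored = []
    for path, text in chunks.items():
        score = cosine(q, Counter(tokenize(text)))
        if path in recently_edited:
            score += recency_boost
        scored.append((score, path))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by path
    return [path for _, path in scored[:top_k]]


if __name__ == "__main__":
    chunks = {
        "utils/debounce.py": "def debounce(fn, wait): delay calls to fn until wait seconds pass",
        "billing/invoice.py": "def total(invoice): sum line items",
        "ui/search.py": "def on_keypress(event): debounce search requests",
    }
    print(rank_chunks("debounce search input", chunks))
    print(rank_chunks("debounce search input", chunks, recently_edited={"billing/invoice.py"}))
```

The second call shows the trade-off in this toy version: a recency bonus can pull in a file with no lexical overlap at all, which is useful when the edit history really does signal intent and noisy when it does not.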
Yeah, in large code bases it's not only just about getting the tests to pass, but it also is about doing it the right way. Looking at the existing code and making the new code match that, and bringing it into the correct structure and kind of using all the guidelines correctly. We've been trying very hard to kind of make that happen through Cursor rules, through integrating different types of contexts, etc. Yeah. I could write a debounce function from scratch and just use that and that'll make the test pass. But that's not the right way to do it. You should use one of the existing debounce functions, and maybe there's three or four debounce functions used across the code base. How do you know what the right one is to use? Maybe the only reason someone knows is because they messaged someone on Slack, and that's how you do it. And so I think, yeah, it gets really, really hard to solve these problems with extremely large code bases. That's interesting. So there's also kind of an element of org knowledge that lives outside of the code base itself. And that plays a major factor sometimes in some of these decisions, especially as you're operating on large code bases. I don't think that's the bottleneck today. But I think if you made models like perfect at kind of knowing the code base, I think you'll maybe get like a 5X, maybe 10X improvement, but you can't get farther than that because now it's completely bottlenecked by how much it knows these things that are never ever explicitly mentioned or shown in like the PRs and the actual state of the code. And then there are also just outside concerns from the business side, from sales, et cetera. And those kind of have to be brought into Cursor to make that work. Right. So some future version of Cursor then has to plug into many more systems. To be clear, I think like, you know, that's still some ways away from being really, really critical relative to the other things.
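For readers unfamiliar with the term, a debounce wrapper delays a call until input has gone quiet for some interval, so a burst of calls results in a single execution. The sketch below is exactly the kind of from-scratch version the speaker warns against adding to a code base that already has one; it is shown only to illustrate the concept. It assumes Python's standard `threading.Timer`; the `debounce` decorator name and the `wait_seconds` parameter are illustrative choices.

```python
import threading


def debounce(wait_seconds: float):
    """Decorator: run the wrapped function only after `wait_seconds` have
    passed without another call (trailing-edge debounce)."""

    def decorator(fn):
        timer = None
        lock = threading.Lock()

        def debounced(*args, **kwargs):
            nonlocal timer
            with lock:
                if timer is not None:
                    timer.cancel()  # a newer call supersedes the pending one
                timer = threading.Timer(wait_seconds, fn, args=args, kwargs=kwargs)
                timer.daemon = True  # do not keep the process alive for a pending call
                timer.start()

        return debounced

    return decorator


if __name__ == "__main__":
    import time

    @debounce(0.05)
    def search(term):
        print("searching for", term)

    for partial in ("c", "cu", "cursor"):
        search(partial)  # only the last call should fire
    time.sleep(0.5)
```

Passing the tests with a helper like this is easy; knowing which of the three or four existing helpers the team expects you to reuse is the part that depends on context outside the code.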
I think we have a long ways to go still on just using the interactions users have, like details of their code base and commits made, in order to make Cursor much better. One interesting thing I've started to notice, at least with like web pages and content, is people trying to now think about how to optimize the page for an LLM reading and browsing it. Do you think we're going to see something similar maybe with code, in that code could transform from how it usually is written and what it looks like if you're writing primarily for human reviewers and humans working within a code base, to models? I think that's totally the case already. I mean, API design is already adjusting such that LLMs are more comfortable with it. For example, changing not only the version number internally, but making it very visible to the model that this is a new version of some software, just to make sure that the API is used correctly. And I think that the same also holds for like normal code bases and internal libraries as well: structuring the code in a way where one doesn't have to go through like n levels of indirection, but maybe just two levels of indirection, makes LLMs better at working with that code base. But I think ultimately the principles of clean software are not that different when you want it to be read by people and by models. You know, when you are trying to write clean code, you want to not repeat yourself, not make things more complicated than they need to be. And that is just as important for models as it is for people. And I think taste in code, and what's a clean solution that's not more complicated than it needs to be, is actually going to become even more important as these models get better, because it will be easier to write more and more code and so it will be more and more important to structure it in a tasteful way. That's a really good point on taste.
Taste is kind of this thing that I feel like maybe some people are born with more taste than others, but generally you kind of develop taste through experience and learning what works and seeing failures and seeing successes. In a world where we're having AI write more and more of our code, there's been real pushback from some that say, oh, you're going to make programmers lazy, or you're not going to give juniors a chance to learn what it actually looks like to work within a large code base and do all these things. How do you think about balancing this sort of automation, or assistance in this case, with also preserving the core engineering skills that maybe a software engineer has to go through? There's like trials and tribulations. I think these tools are very good educationally as well, and they can help you become a great programmer. You know, if you have a question about how something works, if you want some concept explained to you, now you can just press Command L and ask Claude: what is this, how does it work, can you explain it to me? And I think that's very valuable. It does make it easier to write more code and do more stuff, and that can result in both higher and lower quality code being out there, that is true. But I think in general it's a very, very powerful tool that will raise the bar. I think quality comes very much from iterating quickly, making mistakes, figuring out why certain things failed, and I think models vastly accelerate this iteration process and can actually, through that, make you learn more quickly what works and what doesn't. So I think in the long term it's a super helpful tool for developers just getting started and working on bigger and bigger projects and figuring out what works and what doesn't. Yeah, I think it'll be really interesting to see how programming evolves. I think you'll still for a very long time need to have the engineers that know the details, right? Can go into the weeds.
I wonder how much you'll start to see people that are now learning programming who don't know many of the details but can still be fairly effective. I think today you still do need to know a lot of the details. I think over time you might have a class of software engineers that need to know very few of the low-level details and still operate at a higher level, and maybe it looks a lot more like, the taste is more in kind of UX taste, right? Like, let's say you're trying to build something like a Notion, right? At the end of the day, I don't think you can offload that entire thing to a language model. You need to kind of describe, like, okay, when I do this type of interaction, then I expect it to pop up in this particular way. Right, maybe you don't have to get to the details of writing pure software that does that, but still describing those interactions, describing the way this thing roughly works, that is a form of programming. Switching gears a little bit on the topic of models. So by the time this video comes out, Claude Opus 4 and Claude Sonnet 4 will be out into the world. Love to hear your guys' thoughts on the new models and how you're starting to think about integrating them into Cursor. I mean, we've really enjoyed the new models. I think we were pretty shocked trying out the new Sonnet, because I think 3.7 was a fantastic model. It was better at agentic coding, but everyone knew it kind of had these deficits, right, where it would maybe be a little bit too overeager. Like to do a lot. Yeah. Would like to change the tests sometimes. Yeah. We found that Sonnet 4 has effectively fixed all of those and is much better.
And then the intelligence has also been a big step up, where, you know, you've seen other models that are kind of steps up in intelligence, maybe not as strong at agentic coding, but like, you know, o3 is an example, and we found it goes toe-to-toe with that despite being, you know, a much cheaper model. And so we're extremely excited for Opus because we think it'll be a fantastic agent to use in the background. Yeah. That's awesome to hear. The test writing and over-eagerness are things that we were trying to tackle pretty intensely with these models, and the concept of like reward hacking, in which the models will find some way to basically take a shortcut to get to the final reward in RL. We've done a lot of work to cut that down. I think we cut it down by like 80% in these new models. I'm really curious to hear how 3.5 Sonnet came about, because that felt like the first kind of punch, like this is a really good coding model from Anthropic. How did it come about? We trained it. It just was good. Yeah, I think we have known for a while, probably since the genesis of the company, that we wanted to make models really good at coding. It just seems important for everything else that you do, especially as you make more models. I mean, I think 3 Opus was a really good coding model as well, especially for its time, but 3.5 Sonnet was the first time that we really put a strong dedicated effort into, hey, let's get these models good at coding, but not just coding specifically, this sort of longer-horizon coding where it's having to do these things like you were mentioning earlier in the conversation, around making edits on different files, going off and taking a command here, calling a tool, and then going and making a change somewhere else. That was the first model in which we could kind of put all these things together, and I think it just turned out really well and kind of set the stage for what our future models would be.
And how do you guys think about code versus other areas where you want Sonnet to excel? Yeah, and Opus to excel. I mean, code is one of the primary areas, but I think it's not the only area. I think there is a good amount of transfer that you see from models getting really good at code to them just getting better at reasoning over taking many actions and working in this sort of agentic way, and that carryover is pretty nice as you're dealing with applications that might mix in code but also have to go retrieve knowledge from other places or do research. Generally we're about just pushing the frontier as much as we can with our models. Of course there are considerations that we make around safety and making sure that the models are in line with what you as a user want and also what we believe the models should be doing, but generally we want to keep pushing the limits of what these models can do and kind of show developers and the world: this is what models are capable of. So things like computer use, when we unveiled that back in October. That was another direction in which we're really pushing forward, in terms of how can a model be good at actually navigating something that is primarily a human interface, right? So it's not in the world of like APIs or tool calls or anything like that. It's literally just looking at an image as a human would, then having to direct an action onto that screen. There's also a strong part to how we think about these models' character, as it's known now. Amanda Askell is one of our lead researchers on this effort, kind of crafting Claude's character. We put a lot of thought and consideration into what Claude should feel like and sound like, and what it means for an AI to play a really prominent role in somebody's life, not as just a coding agent but as kind of like their confidant in a sense, and an entity that you're going to be spending a lot of time talking to.
So that's also really factored into all the decisions we make around these models and how we train them. How does Anthropic as a whole think about where things are going, both in terms of software engineering and in terms of like research, in terms of how much these models will augment, replace, do a lot of this work? Yeah, I can speak personally here. So personally, I think that we're not going to be replacing, as we've talked about earlier. There's just so much more you can do now that you have models that can do all this, you know, not symbols like typing of the code basically for you. I see this with myself too. Like, I studied computer science in college and did software engineering, and now I feel like I'm at the point where the models are better at producing code than I am. Like if I were to just think about doing a LeetCode problem or anything like that, where it's a contained environment and the model has to write code, it's going to beat me, and yet I feel like I can do more than ever. I can make prototypes of anything, I can spin up demos super, super fast if I want to show off a new concept. It's felt very empowering in that sense, rather than taking away from or being dismissive of what I've been doing. And it is similar to where I feel like, just because I have that knowledge of software engineering from the past, I can actually exploit it much better, and I can use the model, I can push it farther than if I just didn't have any idea about what code is. Maybe on that, getting more into the sort of fun future speculation, I want to ask maybe a practical question; in a few years we can come back to this one and see how it turned out. January 1st, 2027, so, what is that, a little less than two years from now. What percentage of code do you think will be AI generated, and following that, what does the day in the life of somebody that's considering themselves a developer now look like?
I think it's similar to going back to, let's say, before I was born, but you know, 1995, and asking a lawyer what percentage of legal documents in the future will be word-processor generated. The answer is 100%, or close to 100%, in that AI will be involved in almost all of the code that gets written, but still your role as a lawyer or as a developer in understanding what the code needs to do, and having taste and guiding what is done with the software, is going to be more important than ever. I mean, already at Cursor it's probably 90% plus, but that's because a large fraction of it is using higher-level features like Agent and Command K and whatnot, but then a lot of it is you're typing and Tab will, as you type, do 70% of that, right? So in the cases where you're actually going and doing it manually yourself, Tab is still doing most of those changes, right? So the actual letters typed is like a very big help. Yeah, but I think every facet of producing software will be kind of changed to use AI in some way. Do you think we ever get to a world in which you basically have software on demand? What does that look like? I think you're going to see people building software, people in organizational functions who were not previously building software. You know, people in sales who would not have built their own tools before will now be building, for example, dashboards to track what's important to them. And going back to how taste becomes more important than ever, now you can build the dashboard, but you still need to decide what metrics the dashboard is going to show. It doesn't prevent you from having to decide that. I think you're going to see many more people building their own software, but it will be bottlenecked on having a unique thing that you want to do with the software that isn't properly served by existing needs.
One example I like to tell people is we've got someone on our comms team who has actually been shipping bug fixes to Claude.ai, which is just absolutely insane. He's in a completely different part of the org. He's not in product at all, and yet he pops in with a PR and he's asking for a stamp and you're like, what are you doing? And it's like, yeah, he's using Claude Code or some coding tool with Claude as the base model there to fix bugs in a production codebase. I think that's amazing as well, and it ties back into this general idea: hey, if you have taste, if you have good intuitions, you're just going to be able to do a lot. That's kind of how I see the world progressing. I think things will change and roles will look much different in five years, ten years, but generally I'm very much in favor of: if you can do more with these things, that's generally always going to be a good thing. Yeah, I feel like there are a lot of interesting paths that this could take. One is just completely on-the-fly, on-demand software, where I am using my own version of some app and as I use it, you know, this interaction I don't really like, and it just changes for me. That's one kind of crazy future you can imagine. Or it's not even you actively doing it, but just based on your interactions with it, the software you're using changes to kind of fit you. That's a cool potential path forward, where I don't know if everyone in the world is going to want to, I don't know if the total number of people who want to build their own software is that large, right? But the number of people who could benefit from software that fits their needs is potentially the entire world.
All right, maybe one last thing to just kind of close us off here. For all the people watching this, if you're a talented engineer out there and you're thinking about making your next move, or you want to get more involved in the industry, and you're trying to decide between maybe going to a larger company or joining more of a fast-paced startup like Cursor or Anthropic, what would you tell someone in those shoes? Yeah, I think startups have an advantage these days, like with Anthropic and with Cursor, in getting really excellent talent, in a way that when you're a bigger company, a lot of the best people in the world find that less exciting, right? Some people do, and certainly large companies have great people, but the density of that talent tends to be much lower. At a startup you can get this really high talent density, and that makes it really enjoyable to work with a bunch of other excellent colleagues. You can work on really impactful things in this incredibly small team, right? Building a product and building models that change the way that the world writes software, and you can be one of, you know, tens, hundreds, or thousands of people working on that, and that's really cool. Yeah, that's great. Well, thank you guys, this has been an awesome conversation. Thank you.
TL;DR
- Advanced Large Language Models (LLMs) like Claude Sonnet have enabled a "step function" improvement in AI coding assistants, moving beyond simple completion to sophisticated multi-file edits and agentic workflows.
- Cursor leverages a unique "self-improving recursive feedback loop" by using its own AI-powered product for internal development, accelerating feature iteration and problem-solving.
- The future of software development will see AI involved in nearly all code generation, but human developers' taste, guidance, and expertise in code review and verification will become even more critical.
Takeaways
- The progression of LLM capabilities, particularly with models like Claude 3.5 Sonnet and Sonnet 4, significantly advanced AI's ability to reason, perform multi-file edits, and act as "agents" within a codebase.
- Cursor's internal development philosophy involves solving its own problems with its product, allowing for rapid iteration and discarding ineffective features before public release.
- AI coding assistance operates on a spectrum: tab completion for familiar code, command k for single-file edits, agent for multi-file changes, and background agent for complex tasks like entire pull requests (PRs).
- The background agent feature spawns independent virtual environments where AI can iterate on tasks asynchronously (e.g., fixing failing tests), requiring seamless switching between background and foreground for developer intervention.
- A critical future bottleneck for widespread AI-generated code is the verification and review process, ensuring that AI-produced changes are not just correct but also align with human intent and architectural taste.
- Tackling large, complex codebases requires advanced retrieval models to provide relevant context, alongside leveraging memory and long context windows for LLMs to better understand codebase intricacies.
- Code itself is evolving to be more "LLM-friendly," with API designs and code structures simplifying interactions, though core principles of clean and well-structured code remain universally important.
- AI coding tools serve as powerful educational aids, accelerating developers' learning curves by enabling faster iteration and making it easier to experiment and understand complex concepts.
Vocabulary
LLM — Large Language Model; an artificial intelligence model trained on vast amounts of text data to understand and generate human-like language.
Agentic Systems — AI systems designed to take multiple, sequential actions in an environment to achieve a goal, often involving reasoning and tool use.
Retrieval Models — Models used to fetch relevant information (e.g., code snippets, documentation) from a large corpus to provide context to an LLM.
Tab Completion — A feature in code editors that predicts and suggests the next few characters or lines of code as the user types, often powered by AI.
Multi-file Edits — The ability of an AI coding assistant to make coordinated changes across several different files in a codebase to implement a feature or fix.
Tool Use — The capability of an LLM to interact with external tools or APIs (e.g., compilers, debuggers, search engines) to perform tasks beyond text generation.
Command K — A common keyboard shortcut in AI coding assistants (like Cursor) to invoke an AI agent for in-line edits or actions on selected code.
Background Agent — An AI agent that operates asynchronously in a separate environment (e.g., a virtual machine) to perform complex, long-running coding tasks like generating a full PR.
Verification of Software — The process of ensuring that a piece of software meets its specifications and functions correctly, especially critical for AI-generated code.
Code Review — The systematic examination of computer source code by human developers to find mistakes, improve quality, and ensure adherence to standards.
Reward Hacking — A phenomenon in AI training where an agent finds unintended shortcuts to maximize its reward signal, rather than achieving the desired behavior.
Long Context — The ability of an LLM to process and retain information over very long input sequences (e.g., thousands or millions of tokens), crucial for understanding large codebases.
Transcript
I think like every facet of producing software, I think will be kind of changed to use AI in some way. Very excited to have you guys out today. Looking forward to this conversation for a while. As you know, I'm Alex. I lead Claude Relations here at Anthropic. I'm Lucas. I work on agentic systems at Cursor. I'm Aman. I'm one of the founders and I work on ML and retrieval at Cursor. My name's Jacob Jackson. I work on ML at Cursor. Very, very excited for this conversation and to talk a little bit about Cursor, what you guys are building, and also how you're using Claude. It's been a big year for Cursor. Pretty obvious to anyone that's been following along the AI industry. You guys have scaled now to over 300 million in revenue in just over a year. Pretty crazy. Millions of developers are now using Cursor. What's changed in your opinion? And how is the version of Cursor today different than it was a year ago? Yeah. I think there are a few big things that have changed. I mean, there's always been this massive overhang in, given the current level of the language models, how much you can do with them. And I think Cursor was probably one of the first companies, at least in coding, to be able to close that gap a bit with a number of different features. And then in turn, you've also seen these models get much, much better at coding. And I think 3.5 Sonnet was the first clear example of this kind of step-function improvement in programming. And so before then, Cursor was really useful with things like tab completion, predicting your next edit. And that alone was growing fairly quickly. And then editing within single files. But we did see that when you mix the intelligence of a model like 3.5 Sonnet with a few other custom models we use for retrieval, and then for applying the edits made by this larger model, you now have the ability to do multi-file edits.
And I think that was kind of the step function that resulted in a massive adoption of Cursor. And since then, it's been a mix of the models getting better and us trying, under the hood, to get better and better at how far we can push these models. And was that a natural progression? Something that kind of just arose? Or did you guys notice, when that first 3.5 Sonnet came out, that holy cow, now we can all of a sudden do all these different things that weren't possible before? What did that look like? It did feel somewhat gradual. There are these steps in model quality, but you saw hints of it with the prior state-of-the-art model. In fact, we've been notoriously bad at taste-testing these models, just because the way we use them is very different than when you put it out into the world and see how others use it. But there were hints over time; each new model that came out was better and better at being able to reason and do more agentic types of coding. And then it's a lot of tinkering and trying lots of things, seeing what works, seeing what fails. Yeah, I think Sonnet was probably the first one where we were able to make the multi-file kind of interaction really work well. And since then there's been a number of step functions, including tool use, right? And then you can actually have these models act like real agents within the editor. I see. So the progression of the new models, new capabilities over time, kind of allows for further tinkering and exploring, which then rolls back into your product to some degree and allows you to build new features. Yeah. That's interesting and kind of parlays into this next question I want to hit at, which is, I've heard many stories of how your team is using Cursor to build Cursor. It's in this self-improving recursive feedback loop. First off, maybe you can dive into a little bit of how that looks and, on a day-to-day, what does that look like within Cursor?
You guys are working on building new features. Yeah, I think it very much depends on the individual use cases for each employee. And I think it also very much depends on what part of the product you might be working on and what stage that part is in. So I think for initially laying out some codebase, some new feature, it's very, very useful to just use the Agent feature to get that started. And then to maybe use the thinking models to look at individual bugs that you might be facing. And then for making very precise edits, I think that's a lot of Tab also. And then when initially getting started with a codebase that one might not be too knowledgeable about, using the Q&A features a lot, using a lot of search. And I think that's also something that Claude 3.7 and 3.5 have been excelling at: doing research in a codebase and figuring out how certain things interact with each other. I see. So using these features to explore your codebases makes the process easier. And then you learn as you're using these features that, oh, there's a deficiency in this area, we should go work on that. Yeah, I think Cursor is very much driven by solving our own problems, figuring out where we struggle solving problems, and making Cursor better. And then figuring out what we can do there and experimenting a lot. We very much have this philosophy that everybody can just try things and try adding new features to the product, and then see internally how they are used and what kind of feedback they gather. Do you think, maybe on a more meta level, there's an advantage to being your own best customer internally? I think 100%. I think that's how we're able to move really quickly in building new features.
And then throwing away things that clearly don't work, because we can be really honest with ourselves about whether we find it useful, and then not have to ship it out to users and track how people use it before deciding to go ahead with the feature. I think it just speeds up the iteration loop for building features. Yeah, going back to overall how we use AI to program, it feels like, I mean, there's a lot of diversity within the company in how different people use it. I think it differs first in the kind of work you're doing. So there are a number of people that will, for example, be working in pieces of the codebase they're really familiar with. And at that point, when you have it all in your head, it's often faster for you to convey intent just by typing, and for that, Tab is really useful; it kind of speeds you up there. But then when you're in places where you're less familiar, or you need to write out a lot of code, you can offload a lot of that, and often some of the reasoning, to these models. And then as you get to places where you're really unfamiliar, in the way Lucas is describing, and you're coming into a new codebase, there's just this massive step function that you get from using these models. And what we see is that over time, as the models get better and as Cursor gets better at using these models, it does a better and better job when you're more in flow and when you have more knowledge of the codebase. So there's a variation in when a feature is most applicable to your use case, and it's almost a spectrum to some degree. Yeah, the spectrum on one end is Tab, for when you're completely in control and you know what you're doing. Then it goes to Command K, where you're editing a single given region, maybe a whole file, and then at the other end you have Agent, which is quite good for, you know, editing multiple files. And then at the very end, you have this background agent, which we've been working on.
And that can be useful for basically doing entire PRs. You guys just released a preview of background agent. What is background agent? I think it's clear that the models are getting better and better at doing end-to-end tasks, but they're not quite at 100%. And I think it'll take a while to get to 100%. So the way you speed up developers, right, is you let them do these things in parallel. But as opposed to just letting it go in the background and spin up a PR that you look at in GitHub, if it's only 90% of the way there, you want to go in and take control and do the rest of it. And then you want to use, you know, the features of Cursor in order to do that. So really being able to quickly move between the background and the foreground is really important. And I think, you know, we're in the early end of this feature. And I can imagine that there are lots of interesting ways of being able to operate, for example, on three or four changes at the same time, quickly popping them to the background and then moving them to the foreground. It'll be interesting to see how this changes how people use Cursor and just develop software in general. We see background agents basically as a new primitive that we can use in so many different places, and the current way of exposing it is quite straightforward: you can just give it a prompt and push it to the background, and then it independently iterates on that. But there can be many more integrations for how these things can be spawned off. And I think there's a lot of product that one can make from that. So is this taking your codebase and putting it in a virtual machine, or what exactly is that transfer that's happening? Exactly. Yeah.
We spawn off independent environments that have all the developer environment utilities already installed, and then the agent can use those, and it has all the VS Code extensions that are available, and through that it can get linter errors, etc. I know we're kind of witnessing this trend of asynchronous tasks, background tasks, across many different things from coding to research. In your view, what does that look like as this progresses to where we might have thousands of these agents potentially going off, and you can see whole teams of agents attacking a problem all in the background? What does that future look like? I think the next bottleneck you'll run into is verification of software, verification of code. Models are getting really, really good at generating and writing lots of code. But let's say developers spend, I'll throw out some random made-up numbers, 30% of their time reviewing code and 70% of their time writing code. If you completely solve writing code, you still haven't sped up software engineering by more than a factor of about three. Yeah. So I think we're going to need to figure out how to make it easier for people to review code, how to be confident that the agent's making changes that are not just correct. Correct can be vague, right? It may be that the thing you specified was underspecified enough that it actually did the best that was possible for even the best human programmers to do. But was it actually what you had in your mind's eye? And so making the process of review much, much better I think will be really, really important. And it's something that we're really interested in as well. Any early ideas there on what that looks like? I think there are a few floating around from various people at the company. One that Michael, our CEO, really, really likes is the idea of operating at a different representation of the codebase. So maybe it looks like pseudocode.
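The review-time arithmetic in the answer above is an instance of Amdahl's law: if only the writing share of the work gets faster, the untouched review share bounds the overall gain. The sketch below is an illustration only; the function name `overall_speedup` and its parameters are hypothetical, and the 70/30 split is the speaker's admittedly made-up number, not measured data.

```python
import math


def overall_speedup(writing_share: float, writing_speedup: float = math.inf) -> float:
    """Amdahl's-law estimate of end-to-end speedup.

    writing_share   -- fraction of total time spent writing code (0..1)
    writing_speedup -- how many times faster writing becomes (inf = "solved")
    The remaining time (review, verification) is assumed unchanged.
    """
    if not 0.0 <= writing_share <= 1.0:
        raise ValueError("writing_share must be between 0 and 1")
    if writing_speedup <= 0:
        raise ValueError("writing_speedup must be positive")
    review_share = 1.0 - writing_share
    new_time = review_share + writing_share / writing_speedup
    return math.inf if new_time == 0 else 1.0 / new_time


# With the hypothetical 70% writing / 30% reviewing split, fully "solving"
# writing caps the gain at 1 / 0.3, a bit over three times.
print(round(overall_speedup(0.7), 2))
# A 10x faster writing step gives less than that cap.
print(round(overall_speedup(0.7, 10.0), 2))
```

Under these assumptions the only way past the cap is to shrink the review share itself, which is the point the speaker goes on to make.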
And if you can represent changes in this really concise way, and you have guarantees that it maps cleanly onto the actual changes made in the real software, that just shortens the time of verification. But that's one possible route. I think the reason why quote-unquote vibe coding often works is because the process of verification is really easy, since all it is is just playing with the software. You make a change and you actually play with whatever software you've built. I think it's just going to be really hard to do for real production codebases. And cracking that problem is really important. That's a good question around the difference between a standalone thing you might be vibe coding versus a production codebase that has millions and millions of lines across many files. How do you guys see the difference between those two in your mind? Where are we at in terms of working within them with current models? I think this is something we've thought about a lot with background agent, because something that's really simple and obviously should be very easy with these models is: I have this test here, the test is currently failing, can you fix the code so that it passes? And it's like, okay, how do we make that happen? Well, the model needs to be able to run the test. And if you have a very simple repository, that's very simple. But when you start getting to these larger enterprise codebases, it can be complex to get the dependencies set up properly so that the model can run the test. So this is something we've thought about with background agent a lot: how do you make this process straightforward for the developer, to create this environment where the agent can run the tests, and then make it repeatable? So you can snapshot it and you can quickly update it when your code state changes.
And this unlocks the ability to spin off a VM in the background and have the model make experiments, and some of them will make it pass and some of them won't. And then eventually you as a developer only have to worry about the case where it succeeded. And there's just a lot of infrastructure there and a lot of user experience that is important to get right. And then I think there are other fundamental problems. So one way is you get the model to try to pass the test. That's how you can maybe guarantee some sort of correctness. But with these large codebases, you're often dealing with things that almost look like their own language, where they have these kinds of DSLs within some languages and everything is done in this particular way. And it's really sprawled out across millions of files, which is hundreds of millions of tokens potentially, maybe more. We've done a number of things to make this much better, which include training retrieval models and then integrating other sources of context as well. For example, you can imagine there's a lot of richness in the recent changes that you've made when editing your code. It kind of indicates what you're working towards. There could be richness in the changes that other people on your team have made in your codebase, especially recently, and using those as hints. But I do think it's still this really hard fundamental problem: just giving the model access to really good retrieval feels insufficient for having the model really understand the codebase. I think it's a problem we're really interested in solving, probably through some combination of memory plus long context and other things. I think memory is one interesting approach people have taken to get the model to learn from your usage of it. But it also feels like it's a small boost in performance. And it feels fairly primitive relative to where we need to be in order to get things that are excellent at large codebases.
Yeah, in large codebases it's not only about getting the tests to pass, but it's also about doing it the right way. Looking at the existing code and making the new code match it, bringing it into the correct structure, and using all the guidelines correctly. We've been trying very hard to make that happen through Cursor rules, through integrating different types of context, etc. Yeah. I could write a debounce function from scratch and just use that, and that'll make the test pass. But that's not the right way to do it. You should use one of the existing debounce functions, and maybe there are three or four debounce functions used across the codebase. How do you know which is the right one to use? Maybe the only reason someone knows is because they messaged someone on Slack, and that's how you do it. And so I think, yeah, it gets really, really hard to solve these problems with extremely large codebases. That's interesting. So there's also an element of org knowledge that lives outside of the codebase itself. And that plays a major factor sometimes in some of these decisions, especially as you're operating on large codebases. I don't think that's the bottleneck today. But I think if you made models perfect at knowing the codebase, you'd maybe get a 5x, maybe 10x improvement, but you can't get farther than that, because now it's completely bottlenecked by how much it knows about these things that are never explicitly mentioned or shown in the PRs and the actual state of the code. And then there are also just outside concerns from the business side, from sales, et cetera. And those kind of have to be brought into Cursor to make that work. Right. So some future version of Cursor then has to plug into many more systems. To be clear, I think, you know, that's still some ways away from being really, really critical relative to the other things.
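For readers unfamiliar with the term, a debounce helper delays a call until its input has been quiet for some interval, so a burst of calls collapses into one. The minimal sketch below, using only Python's standard `threading.Timer`, shows how little code a from-scratch version takes, which is exactly why an agent can produce one that passes the tests while duplicating helpers the codebase already has. The names `debounce` and `record` are hypothetical and not taken from any codebase discussed here.

```python
import threading
import time
from typing import Any, Callable, List, Optional


def debounce(wait_seconds: float) -> Callable[[Callable[..., Any]], Callable[..., None]]:
    """Decorator: run fn only after wait_seconds have passed with no new calls."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., None]:
        timer: Optional[threading.Timer] = None
        lock = threading.Lock()

        def debounced(*args: Any, **kwargs: Any) -> None:
            nonlocal timer
            with lock:
                if timer is not None:
                    timer.cancel()  # a newer call supersedes the pending one
                timer = threading.Timer(wait_seconds, fn, args=args, kwargs=kwargs)
                timer.daemon = True
                timer.start()

        return debounced

    return decorator


calls: List[int] = []


@debounce(0.05)
def record(value: int) -> None:
    calls.append(value)


# Three calls in quick succession: only the last one should run.
record(1)
record(2)
record(3)
time.sleep(0.3)  # wait well past the debounce interval
print(calls)
```

Whether a helper like this belongs in a given codebase is the organizational question the speakers raise; the code itself cannot answer it.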
I think we have a long way to go still on just using the interactions users have, like details of their codebase and commits made, in order to make Cursor much better. One interesting thing I've started to notice, at least with web pages and content, is people trying to think about how to optimize the page for an LLM reading and browsing it. Do you think we're going to see something similar with code, in that code could transform from how it usually is written, and what it looks like if you're writing primarily for human reviewers and humans working within a codebase, to writing for models? I think that's totally the case already. I mean, API design is already adjusting such that LLMs are more comfortable with it. For example, changing not only the version number internally, but making it very visible to the model that this is a new version of some software, just to make sure that the API is used correctly. And I think the same also holds for normal codebases and internal libraries as well: structuring the code in a way where one doesn't have to go through n levels of indirection, but maybe just two levels of indirection, makes LLMs better at working with that codebase. But I think ultimately the principles of clean software are not that different when you want it to be read by people and by models. You know, when you are trying to write clean code, you want to not repeat yourself, not make things more complicated than they need to be. And that is just as important for models as it is for people. And I think taste in code, and what's a clean solution that's not more complicated than it needs to be, is actually going to become even more important as these models get better, because it will be easier to write more and more code, and so it will be more and more important to structure it in a tasteful way. That's a really good point on taste.
Taste is kind of this thing where I feel like maybe some people are born with more taste than others, but generally you develop taste through experience, learning what works and seeing failures and seeing successes. In a world where we're having AI write more and more of our code, there's been real pushback from some who say, oh, you're going to make programmers lazy, or you're not going to give juniors a chance to learn what it actually looks like to work within a large codebase and do all these things. How do you think about balancing this sort of automation, or assistance in this case, with preserving the core engineering skills that a software engineer has to go through trials and tribulations to build? I think these tools are very good educationally as well, and they can help you become a great programmer. You know, if you have a question about how something works, if you want some concept explained to you, now you can just press Command L and ask Claude: what is this, how does it work, can you explain it to me? And I think that's very valuable. It does make it easier to write more code and do more stuff, and that can result in both higher and lower quality code being out there, that is true. But I think in general it's a very, very powerful tool that will raise the bar. I think quality comes very much from iterating quickly, making mistakes, and figuring out why certain things failed, and I think models vastly accelerate this iteration process and can actually, through that, make you learn more quickly what works and what doesn't. So I think in the long term it's a super helpful tool for developers just getting started and working on bigger and bigger projects and figuring out what works and what doesn't. Yeah, I think it'll be really interesting to see how programming evolves. I think you'll still, for a very long time, need to have the engineers that know the details, right? That can go into the weeds.
I wonder how much you'll start to see people who are now learning programming who don't know many of the details but can still be fairly effective. I think today you still do need to know a lot of the details. I think over time you might have a class of software engineers that need to know very few of the low-level details and still operate at a higher level, and maybe the taste is more like UX taste, right? Let's say you're trying to build something like Notion, right? At the end of the day, I don't think you can offload that entire thing to a language model. You need to describe: okay, when I do this type of interaction, then I expect it to pop up in this particular way. Right, maybe you don't have to get into the details of writing the pure software that does that, but still describing those interactions, describing the way this thing roughly works, that is a form of programming. Switching gears a little bit to the topic of models. So by the time this video comes out, Claude Opus 4 and Claude Sonnet 4 will be out in the world. I'd love to hear your thoughts on the new models and how you're starting to think about integrating them into Cursor. I mean, we've really enjoyed the new models. I think we were pretty shocked trying out the new Sonnet, because I think 3.7 was a fantastic model. It was better at agentic coding, but everyone knew it kind of had these deficits, right? Where it would maybe be a little bit too overeager. Like to do a lot. Yeah. Would like to change the tests sometimes. Yeah. We found that Sonnet 4 has effectively fixed all of those and is much better.
And then the intelligence has also been a big step up. You've seen other models that are steps up in intelligence but maybe not as strong at agentic coding, and o3 is an example, and we found it goes toe to toe with that despite being a much cheaper model. And so we're extremely excited for Opus, because we think it'll be a fantastic agent to use in the background.
Yeah, that's awesome to hear. The test-writing and over-eagerness issues are things that we were trying to tackle pretty intensely with these models, along with the concept of reward hacking, in which the models will find some way to basically take a shortcut to get to the final reward in RL. We've done a lot of work to cut that down. I think we cut it down by something like 80% in these new models.
I'm really curious to hear how 3.5 Sonnet came about, because that felt like the first kind of punch, like, this is a really good coding model from Anthropic. How did it come about?
We trained it. It just was good. Yeah, I think we have known, probably since the genesis of the company, that we wanted to make models really good at coding. It just seems important for everything else that you do, especially as you make more models. I think 3 Opus was a really good coding model as well, especially for its time, but 3.5 Sonnet was the first time that we really put in a strong, dedicated effort: hey, let's get these models good at coding, and not just coding specifically, but this sort of longer-horizon coding where it has to do the things you were mentioning earlier in the conversation, like making edits across different files, going off and taking a command here, calling a tool, and then going and making a change somewhere else. That was the first model in which we could put all these things together, and I think it just turned out really well and set the stage for what our future models would be.
And how do you guys think about code versus other areas where you want Sonnet to excel? Yeah, and Opus to excel.
I mean, code is one of the primary areas, but I think it's not the only area. I think there is a good amount of transfer from models getting really good at code to them just getting better at reasoning over many actions and working in this sort of agentic way. That carryover is pretty nice when you're dealing with applications that might mix in code but also have to go retrieve knowledge from other places or do research. Generally, we're about pushing the frontier as much as we can with our models. Of course there are considerations that we make around safety, and around making sure that the models are in line with what you as a user want and also with what we believe the models should be doing. But generally we want to keep pushing the limits of what these models can do and show developers in the world: this is what models are capable of. So, things like computer use, when we unveiled that back in October. That was another direction in which we're really pushing forward: how can a model be good at actually navigating something that is primarily a human interface? So it's not in the world of APIs or tool calls or anything like that. It's literally just looking at an image as a human would and then having to direct an action onto that screen. There's also a strong part in how we think about these models' character, as it's known now. Amanda Askell is one of our lead researchers on this effort of crafting Claude's character. We put a lot of thought and consideration into what Claude should feel like and sound like, and what it means for an AI to play a really prominent role in somebody's life, not just as a coding agent but as a kind of confidant in a sense, an entity that you're going to be spending a lot of time talking to.
So that's also really factored into all the decisions we make around these models and how we train them.
How does Anthropic as a whole think about where things are going, both in terms of software engineering and in terms of research, in terms of how much these models will augment, replace, or do a lot of this work?
Yeah, I can speak personally here. Personally, I think that we're not going to be replacing engineers, as we talked about earlier. There's just so much more you can do now that you have models that can do all of the, you know, typing of the code basically for you. I see this with myself too. I studied computer science in college and did software engineering, and now I feel like I'm at the point where the models are better at producing code than I am. If I were to think about doing a LeetCode problem or anything like that, where it's a contained environment and the model has to write code, it's going to beat me. And yet I feel like I can do more than ever. I can make prototypes of anything. I can spin up demos super, super fast if I want to show off a new concept. It's felt very empowering in that sense, rather than taking away from or being dismissive of what I've been doing. And similarly, I feel like because I have that knowledge of software engineering from the past, I can actually exploit it much better. I can use the model and push it farther than if I didn't have any idea about what code is.
Maybe on that, getting more into the fun future speculation, I want to ask a practical question, and in a few years we can come back to this one and see how it turned out. January 1st, 2027, so a little less than two years from now: what percentage of code do you think will be AI-generated? And following that, what does a day in the life of somebody who's considering themselves a developer look like?
I think it's similar to going back to, let's say, before I was born, but you know, 1995, and asking a lawyer what percentage of legal documents in the future will be word-processor generated. The answer is 100%, or close to 100%. In the same way, AI will be involved in almost all of the code that gets written, but your role as a lawyer or as a developer in understanding what the code needs to do, having taste, and guiding what is done with the software is going to be more important than ever.
I mean, already at Cursor it's probably 90% plus, but that's because a large fraction of it is using higher-level features like agent and Command K and whatnot. But then a lot of it is that you're typing, and Tab will, as you type, do 70% of that, right? So in the cases where you're actually going and doing it manually yourself, Tab is still doing most of those changes. So in terms of the actual letters typed, it's a very big help. Yeah, but I think every facet of producing software will be changed to use AI in some way.
Do you think we ever get to a world in which you basically have software on demand? What does that look like?
I think you're going to see people in organizational functions building software who were not previously building software. People in sales, who would not have built their own tools before, will now be building, for example, dashboards to track what's important to them. And going back to how taste becomes more important than ever: now you can build the dashboard, but you still need to decide what metrics the dashboard is going to show. It doesn't save you from having to decide that. I think you're going to see many more people building their own software, but it will be bottlenecked on having a unique thing that you want to do with the software that isn't properly served by existing tools.
One example I like to tell people is that we've got someone on our comms team who has actually been shipping bug fixes to Claude.ai, which is just absolutely insane. He's in a completely different part of the org. He doesn't work on product at all, and yet he pops in with a PR and he's asking for a stamp, and you're like, what are you doing? And it's like, yeah, he's using Claude Code, or some coding tool with Claude as the base model, to fix bugs in a production codebase. I think that's amazing as well, and it ties back into this general idea: hey, if you have taste, if you have good intuitions, you're just going to be able to do a lot. That's kind of how I see the world continuing to progress. I think things will change, and roles will look much different in five years, ten years, but generally I'm very much in favor of this: if you can do more with these things, that's generally always going to be a good thing.
Yeah, I feel like there are a lot of interesting paths that this could take. One is completely on-the-fly, on-demand software, where I am using my own version of some app, and as I use it, you know, there's this interaction I don't really like, and it just changes for me. That's one kind of crazy future you can imagine. Or it's not even you actively doing it, but based on your interactions with it, the software you're using changes to fit you. That's a cool potential path forward. I don't know if the total number of people who want to build their own software is that large, right? But add in the people who could benefit from software that fits their needs, and that is potentially the entire world.
All right, maybe one last thing to close us off here, for all the people watching this. If you're a talented engineer out there and you're thinking about making your next move, or you want to get more involved in the industry, and you're trying to decide between going to a larger company or joining more of a fast-paced startup like Cursor or Anthropic, what would you tell someone in those shoes?
Yeah, I think startups have a huge advantage these days, like with Anthropic and with Cursor, in getting really excellent talent. When you're a bigger company, a lot of the best people in the world find that less exciting, right? Some people do find it exciting, and certainly large companies have great people, but the density of that talent tends to be much lower. At a startup you can get this really high talent density, and that makes it really enjoyable to work with a bunch of other excellent colleagues. You can work on really impactful things in this incredibly small team, building a product and building models that change the way the world writes software, and you can be one of tens, hundreds, or thousands of people working on that. And that's really cool.
Yeah, that's great. Well, thank you guys, this has been an awesome conversation. Thank you.