
Collaborative AI Engineering: One Dev, Two Dozen Agents, Zero Alignment — Maggie Appleton, GitHub

TL;DR

  • AI coding agents significantly accelerate individual implementation, but this speed highlights a critical new bottleneck: team alignment on what to build, not how to build it.
  • Current software development tools are not designed for agentic workflows, leading to wasted effort and coordination debt when alignment happens too late in the process.
  • The future of software development requires collaborative AI engineering tools that facilitate continuous, early alignment, shared context, and collective decision-making among humans and agents.

Takeaways

  • The promise of "one developer, two dozen agents" for extreme individual productivity is flawed because software development is a team sport that relies on communication and coordination.
  • Implementation is becoming a "solved problem" due to AI, so the hard questions are now "should we build it?" and "what is the right thing to build?" rather than "how do we build it?"
  • Existing team coordination tools like GitHub, Slack, and Jira are ill-suited for agentic development, as they were designed for an outdated, slower process and struggle with the volume of AI-generated output.
  • The collapse of the implementation window means early planning and alignment touchpoints often disappear, shifting the entire burden of alignment onto pull requests, which is too late in the process.
  • Without proper alignment, teams face wasted work (features no one asked for, discarded efforts) and coordination debt (hairy merge conflicts, duplicated work, unreviewable PR stacks).
  • Tools like GitHub Next's ACE aim to solve this by creating a shared, multiplayer environment where planning, context gathering, decision-making, and development happen collaboratively and continuously.
  • This new approach integrates human-centric context (business goals, user research, political dynamics) that agents cannot discover independently, ensuring teams build the right things.
  • ACE leverages cloud-based micro VMs to provide isolated yet shared development environments, allowing teams to instantly jump into each other's work, including agent prompting history and live previews, without local setup friction.
  • By providing a social information fabric, future tools can make agents proactive in summarizing team activities, notifying about decisions, and pulling people into relevant conversations, helping manage the high volume of agentic work.

Vocabulary

  • Agentic tools: AI systems capable of performing autonomous actions or making decisions on behalf of a user, often for coding or development tasks.
  • Collaborative AI engineering: An approach to software development that focuses on how human teams can work effectively with and orchestrate multiple AI agents in a shared, aligned environment.
  • Bottleneck: A stage in a process that limits the overall capacity or speed, becoming the constraint on productivity.
  • Opportunity cost: The value of the next best alternative that must be forgone when making a choice; in this context, building the wrong feature means missing the chance to build the right one.
  • Alignment: The state where all team members share a common understanding of goals, priorities, and decisions, ensuring concerted effort towards shared objectives.
  • Coordination debt: The accumulated cost and inefficiencies that arise from poor communication, lack of shared context, and uncoordinated work within a team.
  • Micro VM: A lightweight, isolated virtual machine instance, typically cloud-based, used to provide a sandbox environment for specific tasks or sessions.
  • Pull request (PR): A mechanism in version control systems where a developer proposes changes to a codebase and requests review and merging from team members.
  • Context gathering: The process of collecting all relevant background information, requirements, and implicit knowledge necessary to understand a problem or task.
  • Prompt engineering: The practice of designing and refining the inputs given to an AI model to guide its behavior and generate desired outputs.

Transcript

So yes, this talk is called One Developer, Two Dozen Agents, Zero Alignment. This is the case for why we need collaborative AI engineering. So first, a very quick intro. I'm Maggie. I work at GitHub as a staff research engineer. At least that's my title. I'm actually a designer, back when that was like a separate thing to engineer. And GitHub Next is the lab team within GitHub, so we work on more experimental, risky bets than the rest of the organization. We like to call it the department of fuck around and find out. And like everyone else, we are of course trying to shape new agentic developer tools.
So I think this is what many people think peak developer productivity looks like right now. Right? This is like a wall of terminal-based coding agents all running in parallel on one person's machine. I like to call this the "one man, two dozen Claudes" theory of the future. So the promise that we're given here is that one person with a fleet of agents will do the work of an entire team of developers. The main problem with this dream is it assumes that software is made by one person. All of these tools are single-player interfaces, and they focus on scaling up the work of the individual. But there is limited value in scaling up one individual, because software is not made by one person in a vacuum. It's a team sport, and everyone building it needs to agree on what they're building and why. Believing individual productivity leads to great software is "nine women make a baby in one month" logic. More individual output doesn't solve problems that require communication and coordination. It makes them worse.
And implementation is rapidly becoming a solved problem, right? Probably everyone here believes that. Writing code is now fast. It's getting cheaper, and quality is going up and to the right. The hard question is no longer how to build it. It's: should we build it? Agreeing on what to build is the new bottleneck.
So everyone on your team needs to be involved in asking: are we making the right thing? Are we spending our energy in the right place? And how do we have the most impact? When production is cheap, opportunity cost becomes the real cost. You can't build everything, and whatever you pick comes at the cost of everything else. Anyone who ships software on a team knows that this isn't a new problem. Alignment has always been a bottleneck, but agents have made the cost of not being aligned as a team much, much higher.
What makes it worse is that all our coordination tools are still from another era. So GitHub, Slack, Jira, Linear and the like, as they currently stand, are not designed for the agentic development world. We are funneling masses of agentic output into platforms that were built for an outdated way of building software. I know I work at GitHub, so that might sound heretical for me to say, but I promise it's not controversial. There are very few people internally who believe that the PR and the issue are the future of software development, and there are lots of us inside the machine trying to explore what comes next.
So this is how the development process used to look. We had a planning phase, a building phase and a review phase. We had all of these touch points of alignment along the way, and it was slow enough that we had time for conversations in Slack and Zoom meetings, and comments on issues and draft PRs, so you could discuss the details, and everyone could give their two cents, and you could get advice and expertise from across your team and from seniors, and catch mistakes and course correct if things were going wrong. By the time the code was reviewed and merged, the whole team had seen the work happening and they were roughly on the same page. That implementation window has now collapsed, and because implementation is no longer as expensive and time-consuming, we think we don't need to plan as much. So most of those early touch points actually disappear.
We know the review time for generated code has actually increased, so that creates more points of alignment, but they're on the wrong side of the implementation. The time between logging an issue and an agent opening a PR is now a couple of minutes. The code is so cheap that we don't properly stop to think before we prompt it. Unhelpfully, most coding agents also have this local plan mode that is completely unshared with other people. So you're not even checking with your team on whether the plan it made is good before you ship it, if you even read it, and so we lose even more alignment points. This leaves the weight of all that alignment to sit on the pull request. All those checkpoints now come after the implementation, at the end of the process, when it's too late. And that was never what PRs were designed to do in the first place, so they perform poorly at it. None of our current tools give teams a shared space to discuss plans, gather the right context and work with agents as a collective.
We're all experiencing the repercussions of this. Going fast without good alignment leads to wasted work: features no one asked for that don't actually solve real problems, and critical feedback that arrives after you finish something and ends up meaning you have to toss the whole thing out. And also coordination debt. This is when you get really hairy merge conflicts because agents touch the same files, or developers even doing duplicated work because they both picked up a thing and tried to finish it in one day. Or, as we all know, we all have giant stacks of PRs to review that nobody has any context for, and we don't even know what's in them.
So how do we solve this? We need tools that help everyone on the team align before the agents start working, not after. Now alignment needs to happen constantly alongside the implementation. Planning and building are no longer separate phases. They are now a cycle.
The tools of the future need to bring planning, context gathering, decision making and development under one roof. This is especially true because most of the context that you need for alignment, and to build the right thing, is not in the codebase. It's in people's heads. The business context and the financial resources that determine what the correct thing to build is. The political dynamics of who's in charge and who gets to make decisions. The product vision from leaders, the user research insights from designers, and the organization's history of what you've built before. These all matter immensely when you're deciding what the right thing for your team to build is. And the agents can never discover this context on their own. You need a way to get humans to share it early and naturally, without adding process and overhead.
So all of this has been very clear to us on the Next team, and we've been building a new research prototype that explores how we might solve some of these problems. It's called ACE, which stands for Agent Collaboration Environment. It's not a primetime product yet, so if it looks pretty rough around the edges, it's because it is. We're about to go into technical preview and we're going to user test it with a few thousand people. Then we're going to learn how people collaborate and iterate from there.
So here we are in ACE. It probably looks pretty familiar. We're not reinventing any more wheels than we have to. It looks a bit like Slack, GitHub, Copilot, and a bunch of cloud computers had a baby. So we have our sessions list here on the left, and sessions are where you do work, right? It's a multiplayer chat. It's like a Slack channel. I have teammates in here and I can talk to them about the work we're doing, but I also have my coding agents in here. Each session is more than a chat channel, though. It is also backed by a micro VM, so a sandbox computer in the cloud, on its own git branch.
The changes we make in each session are isolated, so we can work on parallel tasks and instantly switch between them. If I want to tap one of my teammates on the shoulder and get their thoughts on a feature I'm building, nobody has to stash their git changes and pull down a new branch or wrestle with local worktrees. I just jump into their session and I see what they're doing in a click. This includes their entire prompting history with the agent, so I have the context about how they got to the current outputs.
Just like on a local machine, I can run terminal commands in this session. Here I'm going to run bun install and bun dev to get my project running. You're going to see in a minute that my live preview in the browser on the side pops up when I open the port. The demo project we have here is a calm version of Hacker News. It only shows you the top three stories from the last three months, which is a bit more chill than every day. I'm going to ask the agent to change the color theme to purple here, and you'll see in a second it instantly appears in my preview. It's just running the code. The agent has also made an automatic commit for me with a nice commit message, and I can open the diffs and see the changes, all the kind of standard things you would expect from coding agents.
Let's say we want to do some real work. I have my teammates here in this session with me, and I'm going to ask ACE to add some additional color themes to my app. We're going to pick which model we want to use, and obviously it's Opus 4.6, and then ACE is going to get started. We also have this handy summary block in the top right-hand corner. This keeps me up to date with the latest changes in this session, whether they're from me or someone else, which means I can switch between lots of people's sessions that are running in parallel and always stay oriented about what's happening, so that you don't get overwhelmed with the amount of noise and activity.
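The session model described here (a chat channel backed by an isolated sandbox on its own git branch, which a teammate can join in a click) can be pictured with a small data-structure sketch. This is only an illustration and not ACE's actual implementation: the `Session` and `Workspace` classes, the `session/<name>` branch naming, and the in-memory `files` dictionary standing in for a micro VM's filesystem are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical model of one session: a chat channel plus its own sandbox."""

    name: str
    branch: str  # assumed: each session works on its own git branch
    participants: set[str] = field(default_factory=set)
    # Stand-in for the sandbox filesystem; a real micro VM is not modeled here.
    files: dict[str, str] = field(default_factory=dict)
    # (author, prompt) pairs, visible to anyone who joins the session.
    prompt_history: list[tuple[str, str]] = field(default_factory=list)


class Workspace:
    """Hypothetical container that lets teammates hop between sessions."""

    def __init__(self) -> None:
        self.sessions: dict[str, Session] = {}

    def create(self, name: str, owner: str) -> Session:
        session = Session(name=name, branch=f"session/{name}", participants={owner})
        self.sessions[name] = session
        return session

    def join(self, name: str, user: str) -> Session:
        # No stash, checkout, or worktree juggling: the state lives in the session.
        session = self.sessions[name]
        session.participants.add(user)
        return session


workspace = Workspace()

themes = workspace.create("color-themes", owner="maggie")
themes.files["theme.css"] = "--accent: purple;"
themes.prompt_history.append(("maggie", "Change the color theme to purple"))

refactor = workspace.create("hooks-refactor", owner="nate")
refactor.files["theme.css"] = "--accent: blue;"  # isolated from the other session

# A teammate jumps in and gets the same files and the full prompting history.
joined = workspace.join("color-themes", "nate")
print(joined.branch, sorted(joined.participants), joined.prompt_history)
```

The point of the sketch is that switching context means looking up a different session object rather than rearranging one person's local checkout, which is why parallel tasks do not interfere with each other.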
But the more important thing is I want to talk to my teammates, and I want to discuss what changes we're making. I can ask them what they think of the current changes. They can spin up the dev server themselves, because remember, we're all working on the same computer in the cloud. This is no problem. We can all see the same preview. We can all write terminal commands and see the shared outputs. No one is going to say "this doesn't work on my machine." My teammates, Nate and Dan, are jumping in here. They've taken some screenshots. They're suggesting some alternative features and asking questions. Now what we're about to see is that Nate is going to ask the ACE agent to make changes. In a minute... where is Nate? There's Nate. So he said, "ACE, add a teal theme too." So I actually kicked off this session, but Nate is now prompting the agent. This is truly multiplayer. Both of us are sharing this coding session. The agent can also read our whole conversation. That is all input to the prompt. So we can talk about things up front and then just say "@ACE, do it," and it'll go do it.
This kind of accessible, Slack-like interface to a coding agent brings in everyone who's creating software. Not just developers, but designers and PMs and customer support people can all be in the same conversation, seeing what's happening in real time as a feature gets built. If you're thinking, why wouldn't we just use Slack for this? I think it's because Slack is never going to become a fully featured software development tool, unless they seriously pivot from their current business. It's never going to have the right primitives. I really doubt it's going to add them. Diffs and terminal commands and that sort of thing are not Slack's business. We wanted ACE because it's explicitly designed for software development, but it's much more welcoming to other team members than your terminal. Anyway, we're back to shipping our changes here. We like how this looks, so we're going to create a PR.
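The idea that the whole team conversation is input to the prompt, so that a short "@ACE, do it" is enough, can be sketched roughly as follows. The `@ACE` mention syntax, the `Message` type, the `build_prompt` function, the prompt layout and the sample messages are all hypothetical and chosen for illustration; the talk does not describe how ACE actually assembles its prompts.

```python
from __future__ import annotations

from dataclasses import dataclass

AGENT_HANDLE = "@ACE"  # assumed mention syntax, not confirmed by the talk


@dataclass
class Message:
    author: str
    text: str


def mentions_agent(message: Message) -> bool:
    """Return True when the message addresses the agent by its handle."""
    return AGENT_HANDLE.lower() in message.text.lower()


def build_prompt(conversation: list[Message]) -> str:
    """Fold the earlier discussion into one prompt, with the request last."""
    if not conversation or not mentions_agent(conversation[-1]):
        raise ValueError("the last message must mention the agent")
    *discussion, request = conversation
    transcript = "\n".join(f"{m.author}: {m.text}" for m in discussion)
    return (
        "Team conversation so far:\n"
        f"{transcript}\n\n"
        f"Request from {request.author}: {request.text}"
    )


# Made-up sample messages; only the speakers' names come from the talk.
conversation = [
    Message("Maggie", "What do you think of the new color themes?"),
    Message("Dan", "Could the theme picker remember my last choice?"),
    Message("Nate", "Agreed, let's store the choice locally."),
    Message("Nate", "@ACE do it"),
]

prompt = build_prompt(conversation)
print(prompt)
```

Because the earlier messages travel with the request, whoever prompts the agent does not have to restate decisions the team already made in the chat, regardless of who started the session.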
Because eventually all this code does have to go back to GitHub, right? So we create this PR directly from inside ACE. We give it a minute and it's going to show us the preview of the PR. And then we can click a link that goes to it over here in a second. And then click. There we go. So there's our PR, and it all works. This is backwards compatible. The PR has a link back to the ACE session within the description. People don't all have to be in ACE to use this. You could have a few members, or a team, in ACE and the rest stay on whatever else they're using.
And sometimes you still need to touch code. I do a lot of front end, and agents are shit at CSS. They never do what I want. We can open our project in VS Code here, and we have real-time multiplayer editing, because again, this is just a micro VM cloud computer. Everyone's on the same computer. I can close my laptop on this and the work continues. My session doesn't die. My teammates can keep prompting ACE and making progress. We don't have a mobile interface yet, but we're building it, and this micro VM architecture means that it will work seamlessly. I don't have to use my phone to somehow SSH into a terminal on my computer. My computer doesn't need to be alive. And I don't need to go buy a Mac mini to keep things available. I just talk to my always-on agent in the cloud.
For bigger, more complex features, you'll of course want your agent to write a plan. That's a very standard workflow at this point. So here we're chatting about adding variable time frames to our calm Hacker News app. And then I've gone ahead and asked ACE to make a plan, which it's going to do quite quickly. And so we can go open that plan, right? And here we are in our plan. I can see my teammates' cursors. We can collaboratively edit it together. We can decide if we like this plan, if it's any good at all, if it achieves our intent.
My teammate Nate here is making suggestions about maybe using a dropdown for the interface instead of a segmented control. And then Dan's come in and updated the requirements, so the agent knows to do that. And once we're all happy with the details, we go back to the chat, and we can just say, "@ACE, do this." And it knows what the context is.
So I'm now going to jump over to our dashboard in ACE. A lot of the planning and discussion that would otherwise happen in Slack or GitHub or Linear is now happening in our ACE sessions. So ACE has access to a lot of rich context on what work is underway and can helpfully summarize it for you. So here it's Monday morning, and I've been trying to remember what I left unfinished last Friday. And ACE is prompting me to keep working on some React hooks I was making as part of a big refactor, which is helpful since I have very crappy human memory after a long weekend. And from here, I can start a new session, or in this "pick back up" section, I can one-click open the session to keep going on my unmerged PR. I can also see a list of my recently completed PRs and issues, to stroke my ego and make me feel productive.
And on the right here, we have a team pulse section. This summarizes what my co-workers have been up to for the last couple of days. I can see Nate has been shipping a lobby channel and David has been fixing access token issues. There's also a raw feed of recent issues and PRs in this repo. I personally find the summary much more helpful. One of the biggest challenges of agentic development is that the speed and volume of work make it really hard to keep up with what your co-workers are doing. They are now shipping five features a day instead of half of one. This dashboard is our first pass at trying to make agents proactive in bringing that social context to you.
If all your conversations around the code are available to agents, it gives them access to a social information fabric, where they can help get you oriented every morning and help you stay aligned with your team. They could notify you about decisions being made, or pull you into a conversation where someone is about to extend the feature that you originally built. So this is no longer a bunch of solo, disconnected terminal instances on individual computers. This becomes a living, intelligent environment where everyone shares the same workspace and context.
So all of this is actually about reclaiming time. Before coding agents came along, none of us had enough time and energy to make our products the way we wanted to. I guarantee everyone in this room has shipped software they're not proud of. Maybe you didn't have enough time to do user research, or consider design details, or think through the implications of your architecture choices. Not because you didn't want to, but because there simply wasn't enough time, because implementation took up so much of that time and effort. But we've been gifted a lot of that time back. We have an opportunity to not just go faster and build a giant pile of the same crappy software, but instead to make much better software through much more rigorous critical thinking and better alignment in the planning stage, by doing more exploration, more research, and thinking through problems more deeply than we could have before. Agents allow us to scale up ourselves and our teams in a way that, if done right, should lead to better quality software.
I think many people are now realizing that in a world of fast, cheap software, quality becomes the new differentiator. The bar is being set much higher, and craftsmanship is what sets you apart from vibe-coded slop. But craft still costs time and energy. It's not free. And in order to buy the time and energy you need for it, you need to do fewer things better, which requires lots of strong alignment.
There are also more distractions than ever. It's very easy to prompt your way to the wrong thing, or to add lots of unnecessary, unhelpful features to your product. I think the dream for me is that we end up with tools, whether it's ACE or others, that create environments where teams can think rigorously together about hard problems. Our agentic tools should help us do higher quality work, get aligned faster, and build a few exceptional things rather than a thousand crappy ones.
Thank you very much for listening. If you do want early access to ACE, we should have it out within a couple of months at the very latest. This QR code will take you to a form where you put in your GitHub username, and then we'll give you early access as soon as it comes out. You can read more about the GitHub Next team and their research on githubnext.com. And all my work and writing is on maggieappleton.com, and I'll have the slides and notes for this up there in a day or two. Thanks.
