
Platforms for Humans and Machines: Engineering for the Age of Agents — Juan Herreros Elorza

TL;DR

  • AI agents highlight existing friction points in software development, making the need for established best practices like self-service and API-first platforms more urgent and visible.
  • Building platforms that are self-service, API-based, and local-first is crucial for empowering both human developers and AI agents to work efficiently.
  • Structured documentation, observable systems, and a culture of contribution are essential for maximizing the productivity of AI agents and ensuring platform success.

Takeaways

  • Establish a platform engineering team to abstract underlying infrastructure complexity, allowing other teams to focus on building their core applications.
  • Design platforms to be self-service, intuitive, and automated, removing human bottlenecks and enabling both developers and AI agents to provision resources independently.
  • Implement API-based platforms to facilitate programmatic interaction, offering discoverability, schema validation, and secure authentication for agents.
  • Embrace "shift left" principles by enabling local validation and early failure detection, preventing agents from pushing code that will fail later in the deployment pipeline.
  • Ensure observability systems (logs, metrics, traces) are API-driven, allowing AI agents to programmatically verify task completion and system health.
  • Maintain structured, accessible documentation, including agent-specific instructions (e.g., agents.md or skills), to guide agent behavior and task execution.
  • Encourage contributions to internal developer platforms, using guardrails and documentation to ensure adherence to standards and maintainability.
  • Measure the impact of platform changes using metrics like DORA, reliability, and support requests to evaluate effectiveness and demonstrate value.
  • Leverage the current enthusiasm for AI as an opportunity to implement long-standing software development best practices that might have previously faced resistance.
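One way to picture the observability takeaway above, where an agent verifies its own work programmatically instead of looking at a dashboard, is a small polling check. This is only a sketch: `fetch_metrics` stands in for whatever API, CLI, or MCP call a platform exposes, and the metric names `ready_replicas` and `error_rate` are hypothetical.

```python
import time
from typing import Callable


def wait_until_healthy(
    fetch_metrics: Callable[[], dict],
    max_error_rate: float = 0.01,
    min_ready_replicas: int = 1,
    attempts: int = 5,
    delay_seconds: float = 0.0,
) -> bool:
    """Poll a metrics source until the success criteria hold or attempts run out.

    `fetch_metrics` is a hypothetical callable that returns the latest metrics
    as a dict; in practice it would wrap the platform's observability API.
    """
    for _ in range(attempts):
        metrics = fetch_metrics()
        # Missing values are treated as unhealthy rather than assumed fine.
        replicas_ok = metrics.get("ready_replicas", 0) >= min_ready_replicas
        errors_ok = metrics.get("error_rate", 1.0) <= max_error_rate
        if replicas_ok and errors_ok:
            return True
        time.sleep(delay_seconds)
    return False
```

Passing the metrics source in as a function keeps the success criteria explicit and lets the same check run against a real endpoint or a stub.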

Vocabulary

  • platform engineering: A discipline focused on building and maintaining internal developer platforms that provide tools and services for other development teams.
  • Kubernetes: An open-source system for automating deployment, scaling, and management of containerized applications.
  • blob storage: A service for storing large amounts of unstructured data, such as images, videos, or documents, without regard to its format.
  • secret management systems: Tools or services designed to securely store, retrieve, and manage sensitive information like API keys, passwords, and certificates.
  • observability: The ability to understand the internal state of a system by examining its external outputs (logs, metrics, traces), crucial for monitoring and debugging.
  • shift left: A software development practice that advocates for performing testing and quality assurance activities as early as possible in the development lifecycle.
  • API (Application Programming Interface): A set of defined rules and protocols that enable different software applications to communicate and interact with each other.
  • CLI (Command Line Interface): A text-based user interface used to interact with computer programs or operating systems by typing commands.
  • DORA metrics: A set of four key metrics (Deployment Frequency, Lead Time for Changes, Mean Time to Recovery, Change Failure Rate) used to measure software delivery performance.
  • AI agent: An autonomous software entity powered by artificial intelligence, designed to perform specific tasks or achieve goals, often by interacting with other systems.

Transcript

Hello everyone, thank you very much for watching, and welcome to my talk, Platforms for Humans and Machines. My name is Juan and I work as the team lead for the Cloud Native Technology team at a company called Banking Circle. You probably don't know about us, so let me tell you just a bit about what we do. Mainly, we are a global cross-border payments provider, on top of accounts and liquidity management. We process over one trillion euro per year and we provide banking services to over 700 regulated financial institutions. We are very much a fintech, and in fact a lot of us work in engineering in some capacity. Because of that, some years ago we decided to establish a platform engineering team in Banking Circle, which is where I work.

More than 250 people in Banking Circle are building some type of technical system. This includes the APIs that our clients use, the core banking systems, some internal tools, data science and data engineering, and of course the integration with the clearing schemes where we settle the payments. Because of that, we decided to establish a platform that all of these teams can use, so that we abstract away the complexity of some of the underlying infrastructure and cloud concerns, and these people can focus on building the systems that they are supposed to build.

We call that platform Atlas, and Atlas has a number of subplatforms. We have a platform for compute, based on Kubernetes, where people run their applications. We have a platform for infrastructure, which people can use to provision blob storage services, databases, and secret management systems. We have platforms for messaging, so that the different applications and systems can communicate with each other.
And we have platforms for observability, so that we can know what is happening with all of our applications and all of the payments that we are processing.

Now, we have come a long way in this platform engineering journey, but it wasn't always easy. I would like to start my talk with a bit of a story using fictional names and characters, but a story based on my experiences while working here at Banking Circle. I'm sure many of you will relate to this story as well.

Let's say that we have a new developer, right? They have joined the company, they have joined a team, they start working, and they build their first application. This developer is great. They are very good at coding the application, perhaps for some payment system. But eventually they hit a wall. They have the code, but they need to deploy that application somewhere. Then the developer will naturally ask someone in the team, "Hey, what should I do with this application? How do I deploy it?" And the person in the team might say, "Well, you know what? Actually, I did that, and it's already set up in this pipeline. Why don't you copy it from there?"

The developer might go and copy the pipeline. Then maybe they have to adjust something, because it wasn't exactly the same in this case: the application had some specific requirements. Then they will try to run this pipeline. Maybe the pipeline will fail, and ultimately the developer doesn't know why it's failing, because the error has nothing to do with the application that they coded. Maybe they then go back to their teammate and say, "Hey, what can I do with this?" And the teammate might say, "You know what you should do? You should talk to this person who is working in the infrastructure team. They will help you." Now this developer will go and talk with this person. Eventually that person will say, "Oh yeah, of course, this is an error that happens very often. I can solve it just for you."
The error will be solved and the developer will deploy the application. And then maybe that deployment will succeed, only for the developer to realize at the end of it that, on top of that application, they also needed a database or maybe some blob storage. Back to square one. The developer will go to this person in the infrastructure team to ask, "Hey, could I ask you to create this database or this blob storage for me?" And maybe the person, let's assume best intentions, of course, wants to help. But they will say, "You know what? I actually have a lot of other things that I'm working on. I will help you, but that's going to be next week."

So obviously the developer will get frustrated, because they had kind of done their part, but then they struggled a lot in this process to deploy the application and get it running with all of the dependencies that it needs. Perhaps they were following some pieces of documentation, something that someone wrote along the way. But they were also relying on asking this teammate and asking the person in the other team.

Now, this is a bad situation that, again, I think all of us have been in at some point. It's a bad situation for a person. But today, all of us are using LLMs. We're using AI agents to help us in our daily job. And if this situation was tricky for a developer, it is essentially impossible for a machine, because the machine is not going to, you know, go and try the pipeline and then go up to the second floor and talk to the person in that other team. Now, of course, an agent could use Teams, or maybe even use some voice model and call over the phone. But generally, a coding agent is not going to be able to do all of these things. So these pain points that the developers were facing suddenly become much more obvious when an agent is facing them.
And they are a limiting factor in how productive this coding agent can be, or how productive the developer can be when using the coding agent.

In what I'm about to tell you, I'm going to address some of the points in this story. But the gist of it is that best practices are still best practices. We have known for a while that some of these things were never good: relying on a teammate to tell us how to deploy an application, having to reach out to a person in a different team, or having to wait for a pipeline only to realize that it wasn't exactly what we needed. They are just much more obvious, and perhaps much more painful, now that we have these coding agents working next to us.

So what can we do about it? Well, the first thing, to me, is that it has to be self-service. If I need any resources from this platform, if I need to be able to do anything through these platforms, I should be able to do it on my own. Similarly, if my agent needs to be able to do anything on the platform, it should be able to do it on its own. There should be no process that requires talking to a specific person or waiting for a specific person to do something. The agent should be able to trigger everything it needs.

And for that, of course, it's also important that the self-service flow or process is intuitive. Of course, we need to document how these things work, and I will get to that in just a moment. But the easier we make it, the more self-service it actually is. Because if it is technically self-service, but it requires fetching some building blocks from five different places, putting them together, and then triggering a flow somewhere else, then it's not really self-service. So make it automatic, remove people from the process, and make it easy to the greatest extent possible.

The second point that I think is important is: make it API-based.
Self-service could look many different ways. It could be based on sending some text somewhere. It could be based on clicking a button. It could be based on many things, but agents are good at calling well-defined APIs. It could also be, of course, a CLI on top of the API. It could be something like an MCP server around the API that the agent uses. All of those are also good ideas. But generally, under all of that, you should have a well-defined API.

An API is discoverable, so as it interacts with it, the agent is going to discover what it can do and the options that are there. It has schema validation, so naturally the agent will only send things that are going to work. It also has, or it can have, depending on what you're building, authentication and authorization in place. Because of that, your agent will be allowed to use your credentials as a developer, or maybe, if the agent is in a particular flow, it will have its own. But it will be able to do all of these things in a secure way, and it will know what it is doing. And with this, the agent can go back and forth. It can try to do something, the API will give a response, and the agent can start working in this way, where it goes in a loop until it gets what it needs or what it was asked to do.

Which brings me to my next point. An agent is typically running on your machine. You could also have it on some server somewhere, and you might have OpenClaw or something, and you're communicating with an agent there. But typically, an agent is local to the machine. Of course, the models are running somewhere else, but the work the agent is doing is local. So make it easy for the agent to do that.

First of all, shift left. If something is going to fail, it should fail as soon as possible. So don't make the agent push something to your version control system only to then fail on some workflow after a few minutes.
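To make the schema-validation and shift-left points concrete, here is a minimal sketch of a local check that an agent could run before pushing anything. It was not shown in the talk: the schema, the field names, and the `validate_request` helper are all hypothetical, and a real platform would more likely publish an OpenAPI or JSON Schema document for this.

```python
from dataclasses import dataclass

# Hypothetical schema for a "provision database" request. The field names
# are illustrative and not taken from any real platform.
DATABASE_REQUEST_SCHEMA = {
    "name": {"type": str, "required": True},
    "engine": {"type": str, "required": True, "allowed": {"postgres", "mysql"}},
    "storage_gb": {"type": int, "required": False},
}


@dataclass(frozen=True)
class ValidationError:
    field: str
    message: str


def validate_request(request: dict, schema: dict) -> list[ValidationError]:
    """Check a request locally and return every problem found, not just the first."""
    errors = []
    for field, rules in schema.items():
        if field not in request:
            if rules.get("required", False):
                errors.append(ValidationError(field, "missing required field"))
            continue
        value = request[field]
        if not isinstance(value, rules["type"]):
            errors.append(ValidationError(field, f"expected {rules['type'].__name__}"))
            continue
        allowed = rules.get("allowed")
        if allowed is not None and value not in allowed:
            errors.append(ValidationError(field, f"must be one of {sorted(allowed)}"))
    # Unknown fields are reported too, so typos surface immediately.
    for field in request:
        if field not in schema:
            errors.append(ValidationError(field, "unknown field"))
    return errors
```

Returning all errors at once, each tied to a field, gives the agent something specific to correct on its next iteration instead of a generic failure minutes later in a pipeline.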
If you can validate things upfront, if you can run them locally, again by calling those APIs, or maybe using some type of wrapper around them, do that. Shift left as much as possible.

Then, like I said, the agents are going to go in a loop. So if the agent is on your machine and it can do everything it needs there, and you clearly define "this is what I need", the agent is going to iterate until it gets there. Now, it is important that you give it precise instructions, that you describe the task, that you tell the agent, "This is what I need you to do." And it is also important that you tell the agent, "This is how you know you have succeeded at the task."

This is important when working with agents because, as humans, we could verify this in different ways. We have a lot of observability systems where perhaps we would like to check if the application has been deployed and the metrics are looking fine. Maybe we will be looking at some dashboards. The agents are not going to be looking at those graphical interfaces. So you also need to think about what observability looks like if the primary user is going to be an agent. Make those logs, metrics, traces, everything that can help, available via an API, a CLI, an MCP server, or something like that. By doing that, you are letting the agent close the loop. You are telling it how to do things and how to verify that the things have been done correctly.

And since I am speaking about telling the agent how to do things, something that is critically important is documentation. Of course you already have documentation. I think many of us have written documentation, put it somewhere, and then been unable to find it. When we need to expose this documentation to agents, and then again this was already true for humans, we need to be structured about it. So there are different strategies that you might want to take.
One of them could be, especially if you are working in smaller repositories, to keep your documentation next to the code it is documenting. That way, if an agent is working in that particular folder or repository, it has everything it needs. It has the code it needs to work on and the documentation that describes it. If you are working on something bigger, or perhaps as a platform team, and you need to expose all of the documentation about the platform that an agent might need, the better idea is to put it in a centralized place, so that the agent can go there and start discovering which documentation is available. Now, once again, think API first, because the agent is going to be much better at consuming the website that way. And specifically, if you can give it the specific bits of documentation that it needs via an API, even better, rather than having it get the entire HTML page in memory and try to figure out which are the relevant bits there.

Of course, when we talk about documentation, we can also think about agent-specific documentation. Now, I assume that many of you are already familiar with this, but you can use agents.md files, or CLAUDE.md, or copilot-instructions.md, depending on your agent of choice. By doing that, you can describe to the agent how it should work in a particular repository. You can tell it: you should always build in this way, test in this way, deploy in this way, verify in this way. You can include all of that in your agents.md. You can also have one of these apply more generally to different systems and then add an agents.md more specific to a particular project or repository on top of that.

You can also use skills. If you have some conventions that you're following, or some well-defined way of interacting with some of your platforms, you can codify those in a skill, which is again just a Markdown document. And by doing that, you're telling the agent, "When you do this type of task, you should do it like that."
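The earlier point about giving the agent only the specific bits of documentation it needs, rather than a whole HTML page, could look something like the sketch below. This is an illustration, not something from the talk: `extract_section` is a hypothetical helper that a documentation API might use internally to return a single section of a Markdown page by its heading.

```python
from typing import Optional


def extract_section(markdown: str, heading: str) -> Optional[str]:
    """Return the body of the section whose heading text matches `heading`.

    The section runs until the next heading of the same or a higher level.
    Simplification: lines starting with '#' inside fenced code blocks are
    treated as headings too, which a real implementation should avoid.
    """
    lines = markdown.splitlines()
    start = None
    level = 0
    for i, line in enumerate(lines):
        stripped = line.lstrip()
        if not stripped.startswith("#"):
            continue
        hashes = len(stripped) - len(stripped.lstrip("#"))
        title = stripped[hashes:].strip()
        if start is None:
            # Still searching for the requested heading (case-insensitive).
            if title.lower() == heading.lower():
                start, level = i + 1, hashes
        elif hashes <= level:
            # Reached a sibling or parent heading: the section ends here.
            return "\n".join(lines[start:i]).strip()
    if start is None:
        return None
    return "\n".join(lines[start:]).strip()
```

An endpoint built on a helper like this lets the agent ask for, say, only the deployment section of a page, which keeps its context small and focused.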
Last but not least, you should also encourage contributions to your platforms. If you're building internal developer platforms, those are going to serve the developers in your organization. And you want them contributing, because that way they can help you help them. So you should encourage them, you should welcome contributions, and because they are using agents, the entry barrier is going to be lower. And so I would expect, and that's what I have seen, that people are more willing to contribute to the platforms.

Now, of course, this is a double-edged sword, because ultimately, as the person or the team owning the platform, you are responsible for its maintenance. So you should think a lot about which things should be taken into consideration when contributing to the platform. You need to have some guardrails, thinking about security or compliance, or just following a well-defined set of standards that then helps you maintain the platform. To be sure of always following those conventions, you can do this with some policies, perhaps in your systems, but you can also just rely on giving context to the agent. Once again, you could use agents.md, you could use some skills, so that when people want to contribute to your platform, they can point their agents to those Markdown files and refer to them as documentation on how to contribute. Generally, I would encourage a combination of the two. Have guardrails in place for everything that you absolutely don't want to happen, and use some sort of policies for that. But then, on top of that, use the Markdown files to help the agents work in the way that you want them to.

And then we get to a very important question, which is: okay, we have done all of these things, right? People are using AI in the organization. They are now following these practices that we have been recommending. We have built a platform that can be used by AI. But did it work?
And I think the way of knowing if it worked is by measuring it. You can of course measure these things before and after making some changes in your platform, so that you can see whether they had an effect or not. And depending on what you want to do, you might want to focus on one type of metric or another.

Now, we know that there are metrics about application delivery, and we have the whole set of DORA metrics for that: change frequency, mean time to recovery, lead time, and another one that I can't really remember right now. Those metrics measure how often your developers are able to release and how well those releases typically go. You can also measure reliability; perhaps that's the main concern. It certainly is in fintech institutions. So you can check: okay, are my applications more reliable than they were before? Am I having fewer errors than I was having before? Is traffic performing in any different way? Or you might want to look at more platform-specific metrics, such as how many support requests you have got. If people and their agents can do everything on their own because it's self-service, am I then not needed to support them? Or am I now having more support requests because the way in which I implemented this is confusing? You can also use some other frameworks about developer satisfaction and developer experience, such as the SPACE one. But my general point is: think about what you want to achieve by making it easier for AI agents to contribute, and then measure whether you actually succeeded at that or not.

And with that, I'll just summarize my advice, which once again comes from my experience. There could be more points in this list, but I think this is a good place to get started. If you want to make your platform ready for AI agents so that you can get the most out of them, make sure your platform is self-service, API-based, and local-first.
Make sure you have good documentation and observability in that platform, so that it's easy to tell the AI what to do, how to do it, and how to verify it's done. And encourage and welcome contributions to your platform, because that way you can also move faster and implement the features that your users need. Last but not least, measure that all of this indeed worked.

And maybe one extra piece of advice. Maybe your organization has been resisting some of these best practices. Maybe you have been trying to push for these API-first platforms, or to make them more local-friendly, or to have the time to write proper documentation, but then you have gotten some resistance to that because people were focused on other priorities. Take advantage. Everyone, from the executive level to the individual contributors, is looking at AI now. It is a very hot topic. So you can use AI as the excuse to implement some best practices that, again, were always best practices, if you didn't have the chance to do it until now.

Thank you very much for listening. My name is Juan Herreros Elorza. Here you have links to my LinkedIn, my personal website, and my GitHub. If you've liked it, please connect over there. Thank you, and have a great AI Engineer Europe event.
