Skip to main content

What is Claude Managed Agents?

TL;DR

  • Claude Managed Agents is an API suite designed for building, deploying, and scaling AI agents within isolated, configurable environments.
  • Agents can perform complex, iterative tasks, such as website optimization or financial reporting, utilizing various tools and external services while operating in parallel.
  • The platform supports advanced features like persistent memory, multi-agent coordination, and configurable permissions, enabling robust and stateful AI-driven workflows.

Takeaways

  • Define Agent Capabilities: Configure agents with specific tools, personas, and capabilities, then set up sandbox environments with required packages and network controls.
  • Isolate Workflows: Fire off agent sessions from your application, allowing Claude to perform tasks inside isolated containers with full file system access, bash execution, and web search.
  • Iterate with Rubrics and Graders: Define performance rubrics (e.g., Lighthouse score > 90) and use a separate 'grader' agent to evaluate outputs, enabling agents to receive feedback, fix issues, and resubmit.
  • Enable Parallel Processing: Run multiple independent agent sessions concurrently, allowing different tasks or tickets to be processed simultaneously in their own isolated containers.
  • Integrate External Tools and Services: Agents can leverage pre-installed tools (e.g., Lighthouse, Puppeteer), execute custom code (e.g., Python for analysis), and interact with external platforms (e.g., Slack, Asana) via MCP servers.
  • Implement State with Memory Stores: Provide agents with a memory store to check past findings, track changes, and retain context, enabling stateful interactions and more intelligent future actions (e.g., flagging repeat issues).
  • Orchestrate Multi-Agent Coordination: Design complex workflows using a coordinator agent that delegates tasks to multiple specialist agents, who share a file system and report back findings for synthesis.
  • Enforce Permissions and Policies: Integrate permission policies that can halt agent outputs (e.g., a draft Slack message) for human review and approval before they are published externally.

Vocabulary

Claude Managed Agents — An API suite for building, deploying, and managing AI agents at scale within isolated execution environments. Agent — An autonomous program configured with specific tools, personas, and capabilities to perform defined tasks. Sandbox Environment — An isolated, configurable execution environment for an agent, with specific packages, network controls, and mounted file system access. Session — A single, isolated execution instance of an agent performing a specific task within its configured environment. Rubric — A set of criteria or guidelines used to evaluate an agent's output or performance against predefined standards. Grader — A separate agent or component responsible for evaluating the output of another agent against a predefined rubric, providing feedback. Event Stream — A real-time data flow that transmits updates, such as tool calls or progress, from an agent session back to the initiating application. MCP Servers — (Managed Cloud Platform Servers or similar) Services that enable agents to interact with external applications like Slack or Asana. Memory Store — A persistent storage mechanism that allows agents to retain and retrieve information from past interactions, enabling stateful behavior and contextual understanding. Multi-agent Coordination — A system design where multiple agents collaborate, often with a coordinator agent delegating tasks to specialist agents, to achieve a larger goal.

Transcript

Claude Managed Agents is a suite of APIs for building and deploying agents at scale. You define agents with specific tools, personas, and capabilities. You configure sandbox environments with the right packages and network controls. You fire off sessions from your own application, and then Claude does the work inside an isolated container with full file system access, bash execution, and web search. I have over here a Kanban board sitting on top of Managed Agents. I drag one over to the in progress, and then that fires off a session automatically. Now, the ticket says optimize website performance. So, my back end creates a session. It points to an environment that I configured with Lighthouse and Puppeteer pre-installed, and mount my GitHub repo into that container. Claude has the code base, the tools, and a rubric. Lighthouse score above 90, no render-blocking resources, all images lazy loaded. And then we can see here that Claude runs the audit. It starts compressing images, inlining CSS, deferring scripts. Every tool call streams back to the board in real time through the event stream. So, the rubric kicks in. A separate grader running at its own context window evaluates the output against my criteria. Claude reads that feedback, goes back in, fixes what it misses, and then resubmits. Good. We're up to 96. And note that I can drag a second ticket over while the first is still running. Two sessions, two containers, two separate tasks running in parallel. So, I have another agent here that's job is to track prices and plan changes across every SaaS tool that our company pays for, and have a report ready before stand-up. Common. Claude searches the web for current pricing pages, checks for plan tier changes, flags new features that might affect your contracts. It then runs a cost analysis in Python inside of that sandbox. And then it also uses an Excel spreadsheet skill and writes an executive summary. And when the report is ready, Claude posts a link to Slack and creates a review task in Asana, both through MCP servers. The agent also reads from and writes to a memory store. Before it starts, it checks what it found last week. After it finishes, it stores what's changed. So, next Monday's report says, "Claude compute 15% lower since last week." instead of listing the same static pricing data. I have an alert here that fired from my monitoring stack. A custom tool my back end receives the alert payload and sends it into a new session as a tool result. This session uses multi-agent coordination. A coordinator agent receives the alert and delegates to three specialists, each running in their own context window on the same shared file system. The specialists report back. The coordinator synthesizes their findings into a single incident summary. And before it posts the update to Slack, the permissions policy fires. And so I see the draft on screen, approve it, and the message goes out. Memory ties all of this together. The coordinator checks past incidents in the memory store and flags a pattern. This looks like the DNS resolution issue from 2 weeks ago that was caused by a misconfigured TTL. So, that means that the next time a similar alert fires, the agent starts with that context instead of diagnosing from scratch. Managed agents gives developers the tools to deliver a fully managed stateful agent experience with agents, sessions, environments, tools, MCP, memory, outcomes, and multi-agent coordination. You define what done looks like. Claude works until it gets there.

Feedback / ReportSpotted an issue or have an improvement idea?