Claude Coded: Sonnet 4.5, Claude Code 2.0, and more.
TL;DR
- Claude Sonnet 4.5 is introduced as a top-performing coding model, with substantial gains in reasoning, math, and computer use, and strong results on SWE-bench Verified and OSWorld.
- Claude Code receives substantial updates, including a native VS Code extension with inline diffs, a refreshed terminal UI, and a "checkpoints" feature for instant rollbacks to a previous state.
- The Claude API gains two new capabilities, "context editing" and a "memory tool", which let agents handle greater complexity, persist information, and run longer without manual intervention.
Takeaways
- Claude Sonnet 4.5 leads on SWE-bench Verified with a score of 77.2% and has been seen to stay focused on complex tasks for over 30 hours.
- On OSWorld, Claude's score rose from 42% four months ago to over 61%.
- The Claude for Chrome extension has been expanded to everyone who was on the waitlist and is available at claude.ai/chrome.
- Claude Code's new VS Code extension (currently in beta) provides a dedicated sidebar panel that shows inline diffs of Claude's changes in real time.
- Claude Code's updated terminal UI (bumped to version 2.0) features improved status visibility and a searchable prompt history.
- The checkpoints feature in Claude Code lets users instantly revert the code, the conversation, or both to a previous state using /rewind or by pressing Escape twice.
- Checkpoints only apply to edits made by Claude, not to user edits or bash commands; users are advised to combine this feature with their version control system.
- Context editing in the Claude API automatically clears stale tool calls and results from the context window when approaching token limits, extending how long an agent can run.
- The memory tool enables Claude agents to store and retrieve information outside the context window using a client-side, file-based system that persists across conversations.
- The Claude app now supports data analysis, file creation (e.g., Excel, PowerPoint, Word, PDF), and visualization of insights using natural language prompts; this is available in preview on all paid plans.
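To make the context editing takeaway concrete, here is a minimal client-side sketch of the general idea: once a conversation history exceeds a token budget, the oldest tool results are blanked out while user and assistant turns are left alone. This is an illustration only, not the Claude API's implementation or interface. The message shape, the `PLACEHOLDER` string, the four-characters-per-token estimate, and the `keep_recent` parameter are all assumptions made for the example.

```python
from typing import Dict, List

PLACEHOLDER = "[tool result cleared]"  # hypothetical marker, not an API constant


def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def total_tokens(messages: List[Dict[str, str]]) -> int:
    """Sum the estimated token counts of all message contents."""
    return sum(estimate_tokens(m["content"]) for m in messages)


def clear_stale_tool_results(
    messages: List[Dict[str, str]],
    token_limit: int,
    keep_recent: int = 2,
) -> List[Dict[str, str]]:
    """Return a copy of messages with the oldest tool results blanked out.

    Only messages whose role is "tool" are touched, so the user/assistant
    conversation flow is preserved. The newest keep_recent tool results
    are never cleared.
    """
    edited = [dict(m) for m in messages]  # leave the caller's list untouched
    tool_idx = [i for i, m in enumerate(edited) if m["role"] == "tool"]
    clearable = tool_idx[:-keep_recent] if keep_recent else tool_idx
    for i in clearable:  # oldest first
        if total_tokens(edited) <= token_limit:
            break  # back under budget, stop clearing
        edited[i]["content"] = PLACEHOLDER
    return edited
```

Clearing oldest-first and stopping as soon as the history fits the budget keeps as much recent tool output as possible, which is one plausible reading of "stale" in the description above.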
Vocabulary
SWE-bench Verified — A human-validated subset of the SWE-bench benchmark, which evaluates how well coding models resolve real GitHub issues in open-source repositories.
OSWorld — A benchmark designed to measure an AI's ability to operate a computer effectively, similar to how a human would.
IDE — Integrated Development Environment; a software application that provides comprehensive facilities to computer programmers for software development.
inline diffs — A feature in code editors or IDEs that displays the differences or changes between two versions of code directly within the editor alongside the code.
checkpoints — A feature in Claude Code that saves the state of code and conversation as Claude makes edits, enabling users to revert either or both to a previous state if needed.
token limits — The maximum amount of text (measured in "tokens," which can be words or sub-word units) that an AI model can process or generate in a single input or output.
context window — The segment of an AI model's memory where current conversation, instructions, and recent data are held and processed; information outside this window is often not immediately accessible without specific tools.
client-side — Operations or data storage that occur directly on the user's computer or device, rather than on a remote server.
Claude Agent SDK — A Software Development Kit that provides tools, frameworks, and access to core services for building custom AI agents that leverage Claude's capabilities.
natural language — Human language (e.g., English) as spoken or written, rather than formal programming languages or specific commands.
Transcript
Welcome to Claude Coded, your update on what's new with Claude and Claude Code. First up, Claude Sonnet 4.5 is now available wherever you get your Claude, and it is the best coding model in the world. It is leading on SWE-bench Verified with a score of 77.2%, and we have actually seen it stay focused on complex tasks for well over 30 hours straight. As a developer, this is really exciting, but it is not just code that has been improved. We have seen substantial gains on reasoning, math, and computer use as well. On OSWorld, which is a test to see how well an AI can actually use a computer like a human would, Claude jumped from 42% four months ago to over 61% now. And you can actually see this in action for yourself with our recently launched Claude for Chrome extension, which has been expanded to everybody that was on the waitlist. So go and give it a try today at claude.ai/chrome.
Claude Code got a ton of new improvements as well, starting with a native VS Code extension that brings Claude Code directly into your IDE. This is perfect for developers like myself who prefer programming in an IDE over the terminal. With this extension, you can see Claude's changes in real time through a dedicated sidebar panel that shows inline diffs of the changes made. This extension is currently in beta and you can get it in the VS Code Marketplace.
We didn't forget about the terminal: a refreshed terminal UI with the bump to version 2.0 of Claude Code brings you updated interface features, improved status visibility, and a searchable prompt history as well. But the feature that I am most excited about is a new checkpoints feature that lets you confidently run large tasks and roll back instantly to a previous state if needed. You can use the /rewind command or double-tap the Escape key to activate it, and can then choose to restore the code, the conversation, or both to a prior state.
Do note that checkpoints only apply to edits made by Claude, not user edits or bash commands, so it is still recommended that you use this feature in combination with your version control system. We've also made an update to thinking where now you can enable or disable it with just the Tab key, and the best part here is that it's going to save your preference across sessions. Finally, you can now track your usage in real time with the /usage command. You can also do this in the Claude app by going to Settings and then Usage to view your data.
On the Claude API front, we have two new capabilities enabling agents to handle even greater complexity. Context editing automatically clears stale tool calls and results from within the context window when approaching token limits. As your agent executes tasks and accumulates tool results, context editing removes stale content while preserving the conversation flow, effectively extending how long an agent can run without manual user intervention. The memory tool, on the other hand, enables Claude to store and consult information outside of the context window through a file-based system. Claude can create, read, update, and delete files in a dedicated memory directory stored in your infrastructure that is entirely client-side and persists across conversations. It's kind of like having a CLAUDE.md file for your agent API. You can see an example of these capabilities in our Claude plays Catan video, but we also have a couple of cookbooks created to show you how to leverage these new capabilities.
Additionally, the Claude Agent SDK has been renamed from the Claude Code SDK and gives you access to the same core tools, context management systems, and permissions frameworks that power Claude Code to help you build your own agents. We have learned a ton over the last six months and put it all in this Claude Agent SDK for you to use.
And finally, in the Claude app, Claude can now use code to analyze data, create files, and visualize insights in the files and formats that you are familiar with. You can prompt Claude using natural language to generate Excel spreadsheets, create PowerPoint presentations, draft up Word documents, or create PDF files that you can instantly download and use. This capability is now available to all paid plans in preview. And that is it for Claude Coded. Happy coding and keep thinking.
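As a rough illustration of the memory tool described in the transcript, the sketch below shows what a client-side, file-based store with create, read, update, and delete operations might look like. It is not the memory tool's actual protocol or any Anthropic API. The `MemoryStore` class, its method names, and the path check are hypothetical and chosen only to show how files in a dedicated directory can persist across conversations.

```python
import tempfile
from pathlib import Path


class MemoryStore:
    """Hypothetical client-side store; files live under one memory directory."""

    def __init__(self, root: Path) -> None:
        self.root = root.resolve()
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, name: str) -> Path:
        # Reject names that would escape the memory directory (e.g. "../x").
        path = (self.root / name).resolve()
        if self.root not in path.parents:
            raise ValueError(f"path escapes memory directory: {name}")
        return path

    def create(self, name: str, text: str) -> None:
        path = self._path(name)
        if path.exists():
            raise FileExistsError(name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")

    def read(self, name: str) -> str:
        return self._path(name).read_text(encoding="utf-8")

    def update(self, name: str, text: str) -> None:
        path = self._path(name)
        if not path.exists():
            raise FileNotFoundError(name)
        path.write_text(text, encoding="utf-8")

    def delete(self, name: str) -> None:
        self._path(name).unlink()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first = MemoryStore(Path(tmp) / "memories")
        first.create("preferences.md", "Use tabs, not spaces.")
        # A fresh instance stands in for a later conversation.
        later = MemoryStore(Path(tmp) / "memories")
        print(later.read("preferences.md"))
```

Because the state lives in ordinary files rather than in the context window, a second `MemoryStore` pointed at the same directory sees what the first one wrote, which mirrors the "persists across conversations" property mentioned above.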