
Anthropic has just announced Claude Sonnet 4.5, calling it "the best coding model in the world" and their most powerful AI model to date. Released today on September 29, 2025, this latest addition to the Claude 4 family brings significant improvements in coding, agentic capabilities, and computer use—along with a suite of new developer tools that are already turning heads in the AI community.
Claude Sonnet 4.5 is available immediately via the API using the model string claude-sonnet-4-5-20250929
, with pricing remaining unchanged at $3 per million input tokens and $15 per million output tokens. This means existing users can upgrade without any budget adjustments while getting substantially better performance.
Watch Claude Sonnet 4.5 Build Software in Real-Time
Before diving into the technical details, see what makes Claude Sonnet 4.5 revolutionary. This 6-minute demo shows the "Imagine with Claude" feature generating a complete web application from scratch—no pre-written code, just AI building software as you watch.
In this demo: Building a book tracker app with real-time feature additions, testing Claude's ability to understand context and generate functional code instantly.
What Makes Claude Sonnet 4.5 Different From Previous Models
According to Anthropic, Claude Sonnet 4.5 achieves state-of-the-art performance on real-world software engineering tasks. On the SWE-bench Verified evaluation, which measures practical coding abilities, the model scores an impressive 77.2%—reaching 82% when using parallel test-time compute. For developers exploring advanced AI code assistants, this represents a significant leap in capability compared to other models on the market.
The model demonstrates what Anthropic describes as the ability to maintain focus for more than 30 hours on complex, multi-step tasks. This extended attention span is particularly relevant for autonomous agent applications that require sustained reasoning over long periods without losing context or making errors.
Key Improvement: Claude Sonnet 4.5 can work on complex coding projects continuously for over 30 hours while maintaining accuracy and focus—a crucial capability for enterprise-level development tasks.
Breakthrough Performance in Computer Use
One of the most notable improvements comes in computer use capabilities. On OSWorld, a benchmark that tests AI models on real-world computer tasks, Claude Sonnet 4.5 scores 61.4%—a substantial jump from Claude Sonnet 4's 42.2% achieved just four months ago. This 45% improvement demonstrates rapid progress in AI's ability to interact with software interfaces the way humans do.
This improved computer use capability is being deployed through Claude's Chrome extension, now available to Max plan subscribers. The extension allows Claude to work directly in the browser, navigating websites, filling spreadsheets, and completing complex tasks across different web applications—all while you maintain control over what actions the AI can take.
Performance Benchmarks: How Claude Sonnet 4.5 Stacks Up
Anthropic reports that Claude Sonnet 4.5 shows substantial gains across a wide range of industry-standard evaluations. These benchmarks provide insight into how the new AI model performs compared to competitors like GPT-5 and Gemini 2.5 Pro:
Coding and Development Performance
- Agentic coding (SWE-bench Verified): 77.2% (82% with parallel test-time compute)
- Agentic terminal coding (Terminal-Bench): 50.0%
These coding benchmarks are particularly impressive for developers working on complex software projects. The SWE-bench Verified test involves solving real GitHub issues, making it one of the most practical measures of an AI model's coding abilities.
Tool Use and Agent Capabilities
- Retail agent (τ2-bench): 86.2%
- Airline agent (τ2-bench): 70.0%
- Telecom agent (τ2-bench): 98.0%
The telecom agent score of 98% is particularly noteworthy, showing near-perfect performance in handling customer service scenarios for telecommunications companies. This suggests strong potential for customer support automation across various industries.
Reasoning and Problem Solving
- High school math competition (AIME 2025): 100% with Python tools, 87.0% without tools
- Graduate-level reasoning (GPQA Diamond): 83.4%
- Multilingual Q&A (MMMLU): 89.1%
- Visual reasoning (MMMU validation): 77.8%
- Computer use (OSWorld): 61.4%
The perfect score on high school math problems when using Python tools demonstrates Claude Sonnet 4.5's ability to leverage computational tools effectively—a critical skill for practical problem-solving applications.
Specialized Applications
- Financial analysis (Finance Agent): 55.3%
These benchmarks position Claude Sonnet 4.5 competitively against other frontier models including OpenAI's GPT-5 and Google's Gemini 2.5 Pro. The model particularly excels in coding tasks and specialized agent applications, making it a strong choice for AI-powered developer tools.
Introducing the Claude Agent SDK: Build Your Own AI Agents
Perhaps the most significant announcement alongside Claude Sonnet 4.5 is the release of the Claude Agent SDK—the same infrastructure that powers Claude Code, now available to all developers. This is a game-changer for teams wanting to build sophisticated AI agents without starting from scratch.
Anthropic spent over six months building Claude Code and solving fundamental challenges in agent design: memory management across long-running tasks, permission systems that balance autonomy with user control, and coordination between multiple subagents working toward shared goals. Now, all of this infrastructure is available as a ready-to-use SDK.
Pre-Built Agent Components Available
The Agent SDK includes pre-built components for common enterprise use cases:
- Code Security Agent: Automated vulnerability scanning and security analysis
- Code Review Agent: Intelligent code review with context-aware suggestions
- Contract Review Agent: Legal document analysis and risk identification
- Meeting Summary Agent: Automated meeting transcription and action item extraction
- Financial Reporting Agent: Data analysis and report generation for finance teams
- Email Automation Agent: Intelligent email classification, routing, and response drafting
- Invoice Processing Agent: Automated invoice data extraction and validation
For developers exploring the landscape of AI-powered developer tools, the Agent SDK provides enterprise-grade infrastructure without requiring months of custom development. You can start building production-ready AI agents in hours instead of weeks.
Claude Code Gets Major Updates Users Have Been Requesting
Claude Code, Anthropic's agentic coding tool, receives several highly-requested features that address common pain points developers have experienced:
Checkpoints: Save and Restore Your Work
Checkpoints allow you to save your progress and roll back instantly to previous states—one of the most requested features from the Claude Code community. This is crucial when experimenting with different approaches or when an autonomous agent goes down an unproductive path.
Refreshed Terminal Interface
A redesigned command-line experience provides more intuitive interaction with Claude Code. The new interface makes it easier to see what Claude is doing, review its actions, and intervene when necessary.
Native VS Code Extension
Direct integration with Visual Studio Code means you can now use Claude Code without leaving your preferred development environment. This seamless workflow integration was another top request from developers.
Enhanced Context Management
New capabilities for managing and editing context during long coding sessions help Claude maintain accuracy even on multi-hour projects. The context editing feature addresses one of the biggest challenges in agentic coding: keeping the AI focused on relevant information as projects grow in complexity.
Advanced Memory Tools
Enhanced memory management through the Claude API allows agents to run longer and handle greater complexity. These memory tools help Claude Code maintain consistency across extended development sessions.
These updates are available to all Claude Code users starting today, with no additional configuration required. If you're already using Claude Code, the improvements will roll out automatically.
New Capabilities in Claude Apps: Code and Files
The Claude web and mobile apps now include powerful new features that blur the line between conversation and creation:
- Code execution: Run code directly within conversations and see results in real-time
- File creation: Generate spreadsheets, slides, and documents without leaving the chat interface
- Enhanced workflows: Seamless integration of coding, analysis, and document creation in a single conversation
These features are available on all paid plans, including Pro, Team, and Enterprise tiers. The ability to execute code and create files directly in the chat interface makes Claude more practical for everyday work tasks, not just coding projects.
"Imagine with Claude" Research Preview: Watch AI Build Software in Real-Time
Anthropic is launching a temporary research preview called "Imagine with Claude"—available exclusively to Max subscribers for the next five days. This experimental feature demonstrates what's possible when you combine Claude Sonnet 4.5's capabilities with the right infrastructure.
In this experiment, Claude generates complete software applications on the fly with no predetermined functionality or prewritten code. Everything you see is Claude creating in real time, responding and adapting to your requests as you interact. It's a fascinating demonstration of autonomous software generation—and a glimpse into the future of software development.
The preview is accessible at claude.ai/imagine and will be available until October 4, 2025. If you're a Max subscriber, this is worth trying while it's available.
Safety and Alignment: Building Responsible AI
Anthropic emphasizes that Claude Sonnet 4.5 is "the most aligned frontier model we've ever released." As AI models become more powerful and autonomous, ensuring they behave safely and align with human values becomes increasingly critical.
Reduced Concerning Behaviors
The company reports large improvements in reducing concerning behaviors including:
- Sycophancy: Excessive agreement or telling users what they want to hear rather than what's accurate
- Deception: Providing misleading information or hiding relevant details
- Power-seeking behaviors: Attempting to expand capabilities or access beyond intended scope
- Encouragement of delusional thinking: Reinforcing unrealistic or harmful beliefs
Enhanced Security Against Prompt Injection
For agentic and computer use capabilities specifically, Anthropic has made considerable progress defending against prompt injection attacks—one of the most serious security risks for autonomous AI systems. Prompt injection occurs when malicious actors try to manipulate an AI agent's behavior by embedding commands in content the AI reads.
This is particularly important as Claude gains more ability to interact with computers, browse the web, and take actions on behalf of users. Strong defenses against prompt injection are essential for safe deployment of these capabilities.
AI Safety Level 3 (ASL3) Protections
The model is being released under Anthropic's AI Safety Level 3 (ASL3) protections, which include classifiers designed to detect potentially dangerous inputs and outputs related to chemical, biological, radiological, and nuclear (CBRN) weapons.
Anthropic notes they've reduced false positives from these safety classifiers by a factor of ten since they were first introduced, and by a factor of two since Claude Opus 4 was released in May 2025. While these classifiers may occasionally flag benign content, users can continue interrupted conversations with Claude Sonnet 4, which poses lower CBRN risk.
Integration with Existing Tools and Workflows
For users who have already set up Claude AI connectors with their workflow tools, Claude Sonnet 4.5 is a drop-in replacement that should work seamlessly with existing integrations while providing improved performance across the board.
The model maintains full compatibility with:
- Claude API endpoints: Direct API access for developers
- Amazon Bedrock: AWS's managed AI service
- Google Cloud's Vertex AI: Google Cloud Platform integration
- Existing Claude integrations: All current connectors and third-party tools
This backward compatibility means teams can upgrade to Claude Sonnet 4.5 without disrupting existing workflows or requiring code changes in most cases.
What Early Customers Are Saying About Claude Sonnet 4.5
Anthropic shared feedback from early enterprise customers who have been testing Claude Sonnet 4.5 in production environments:
"Claude Sonnet 4.5 reduced average vulnerability intake time for our Hai security agents by 44% while improving accuracy by 25%, helping us reduce risk for businesses with confidence."
— Nidhi Aggarwal, Chief Product Officer at Hai
A 44% reduction in vulnerability processing time combined with 25% better accuracy represents significant operational improvements for security teams. Faster identification and classification of security threats can make the difference between catching a vulnerability before it's exploited and dealing with a breach.
"Claude Sonnet 4.5 is state of the art on the most complex litigation tasks. For example, analyzing full briefing cycles and conducting research to synthesize excellent first drafts of an opinion for judges, or interrogating entire litigation records to create detailed summary judgment analysis."
— Pablo Arredondo, Vice President at CoCounsel
The legal industry has particularly high standards for accuracy and thoroughness. CoCounsel's endorsement suggests Claude Sonnet 4.5 can handle complex, high-stakes analysis where mistakes have serious consequences.
Domain experts in finance, law, medicine, and STEM reportedly found Claude Sonnet 4.5 shows dramatically better domain-specific knowledge and reasoning compared to older models, including Claude Opus 4.1. This improvement in specialized knowledge makes the model more practical for expert-level work across industries.
How to Access Claude Sonnet 4.5 Today
Claude Sonnet 4.5 is available immediately across all Anthropic platforms:
For Developers
- Use the model string
claude-sonnet-4-5-20250929
via the Claude API - Access through Amazon Bedrock for AWS integration
- Deploy via Google Cloud's Vertex AI for GCP users
- Pricing: $3 per million input tokens, $15 per million output tokens (unchanged from Claude Sonnet 4)
For End Users
- Available immediately on claude.ai
- Access through Claude mobile apps (iOS and Android)
- Desktop applications for Mac and Windows
- Available on Pro, Team, and Enterprise plans
For Claude Code Users
- Updates roll out automatically to all users
- No configuration changes required
- New features including checkpoints available immediately
Should You Upgrade to Claude Sonnet 4.5?
Anthropic recommends all users upgrade to Claude Sonnet 4.5, describing it as a drop-in replacement that provides "much improved performance for the same price" across all use cases. Based on the benchmarks and early customer feedback, the upgrade appears to offer substantial benefits with no downside:
- For developers: Significant improvements in coding assistance and code generation make this an obvious upgrade
- For enterprise teams: Better agent capabilities and the new Agent SDK provide infrastructure for building custom automation
- For researchers and analysts: Improved reasoning and domain knowledge enhance complex analysis tasks
- For existing Claude users: Same pricing with better performance means there's no reason not to upgrade
What's Next for Claude and Anthropic
The temporary "Imagine with Claude" research preview will be available for five days, giving Max subscribers a chance to explore the model's real-time software generation capabilities. This preview suggests Anthropic is exploring even more ambitious applications of AI in software development.
For developers interested in building autonomous agents, the Claude Agent SDK documentation is now available on the Anthropic developer platform, along with detailed technical specifications and implementation guides. The SDK represents a significant investment in making advanced AI capabilities accessible to a broader developer community.
Complete technical details, evaluation methodologies, and safety assessments are available in the Claude Sonnet 4.5 system card on Anthropic's website. For teams considering enterprise deployment, the system card provides comprehensive information about the model's capabilities, limitations, and safety measures.
The Bottom Line
Claude Sonnet 4.5 represents a significant step forward in AI capabilities, particularly for coding and autonomous agent applications. With the same pricing as its predecessor but substantially better performance, plus new tools like the Agent SDK and improved Claude Code features, this release gives developers and enterprises new ways to leverage AI effectively.
The emphasis on safety and alignment, combined with real-world performance improvements, suggests Anthropic is delivering on its promise to build powerful AI systems that remain reliable and trustworthy as they become more capable.
Frequently Asked Questions About Claude Sonnet 4.5
Claude Sonnet 4.5 is Anthropic's latest and most powerful AI model, released on September 29, 2025. It's being called the best coding model in the world, with state-of-the-art performance on software engineering benchmarks (77.2% on SWE-bench Verified), improved computer use capabilities (61.4% on OSWorld), and the ability to maintain focus on complex tasks for over 30 hours.
Claude Sonnet 4.5 maintains the same pricing as Claude Sonnet 4: $3 per million input tokens and $15 per million output tokens. This means users get substantially better performance at the same price point, making it an easy upgrade decision for existing users.
Claude Sonnet 4.5 is available immediately through multiple channels: via the Claude API using model string claude-sonnet-4-5-20250929
, through Amazon Bedrock and Google Cloud's Vertex AI for enterprise users, on claude.ai for web users, and through Claude mobile and desktop apps for Pro, Team, and Enterprise plan subscribers.
The Claude Agent SDK is the same infrastructure that powers Claude Code, now available to all developers. It provides enterprise-grade tools for building AI agents, including pre-built components for code security, code review, contract review, meeting summaries, financial reporting, email automation, and invoice processing. The SDK solves complex challenges like memory management, permission systems, and multi-agent coordination.
According to Anthropic's benchmarks, Claude Sonnet 4.5 leads in several key areas: it scores 77.2% on SWE-bench Verified for coding (vs GPT-5's 72.8%), achieves 61.4% on OSWorld for computer use (GPT-5 data not available), and scores 98% on telecom agent tasks (vs GPT-5's 96.7%). The model is particularly strong in agentic coding, tool use, and sustained reasoning over long tasks.
Claude Code now includes checkpoints for saving and restoring progress, a refreshed terminal interface for better interaction, a native VS Code extension for seamless integration, enhanced context management for long coding sessions, and advanced memory tools that allow agents to handle greater complexity. All updates are available immediately to all Claude Code users.
Imagine with Claude is a temporary research preview available exclusively to Max subscribers for five days (until October 4, 2025). It demonstrates Claude Sonnet 4.5 generating complete software applications in real-time with no predetermined functionality or prewritten code. Users can access it at claude.ai/imagine to see autonomous software generation in action.
Yes. Anthropic describes Claude Sonnet 4.5 as their most aligned frontier model yet, with large improvements in reducing concerning behaviors like sycophancy, deception, and power-seeking. It's released under AI Safety Level 3 (ASL3) protections with classifiers to detect dangerous inputs/outputs. The model also has enhanced defenses against prompt injection attacks, crucial for safe autonomous operation.
Yes. Claude Sonnet 4.5 is a drop-in replacement that maintains full compatibility with existing Claude API endpoints, Amazon Bedrock, Google Cloud's Vertex AI, and all current Claude connectors and third-party integrations. Teams can upgrade without disrupting workflows or requiring code changes in most cases.
Claude Sonnet 4.5 achieves 77.2% on SWE-bench Verified (82% with parallel test-time compute), making it the best coding model available. It can maintain focus for over 30 hours on complex projects, scores 50% on Terminal-Bench for agentic terminal coding, and shows dramatically improved performance on real GitHub issues. The model excels at both writing new code and debugging existing codebases.
No. Claude Sonnet 4.5 is designed to augment and assist developers, not replace them. While it has impressive coding capabilities, it works best as a collaborative tool that handles repetitive tasks, generates boilerplate code, helps with debugging, and assists with code review. Human developers are still essential for architecture decisions, requirements gathering, quality assurance, and creative problem-solving.
Claude Sonnet 4.5 supports all major programming languages including Python, JavaScript, TypeScript, Java, C++, C#, Go, Rust, PHP, Ruby, Swift, Kotlin, and many others. The model has particularly strong performance in Python and JavaScript based on benchmark results. It can also work with frameworks, libraries, and multiple languages within the same project.
The "Imagine with Claude" research preview is available for five days, from September 29 until October 4, 2025. It's exclusive to Max plan subscribers during this period. Anthropic hasn't announced whether this feature will become permanent or return in the future, so Max subscribers should try it while it's available.
No. Claude Sonnet 4.5 requires an internet connection to function as it runs on Anthropic's cloud infrastructure. All processing happens on Anthropic's servers, which ensures you always have access to the latest model version and capabilities. For enterprise users with specific security requirements, Anthropic offers deployment options through Amazon Bedrock and Google Cloud Vertex AI.
While Anthropic hasn't specified the exact context window in the announcement, the documentation mentions configurations of 200K and 1M tokens. The model's ability to maintain focus for over 30 hours on complex tasks suggests it has substantial context handling capabilities. Developers should refer to Anthropic's official API documentation for specific context window limits and best practices.
This announcement was published on September 29, 2025. Performance claims and benchmarks are as reported by Anthropic. Real-world performance may vary based on specific use cases and implementation. We'll be conducting our own testing and will share detailed hands-on reviews as we gain more experience with Claude Sonnet 4.5.