Anthropic’s Week: Routines, Opus 4.7, and Claude Design — The Full-Stack Pivot
tags: [anthropic, claude, ai, agents, coding]
Three product launches in four days. Anthropic stopped being a foundation model lab this week.
🚀 What Happened
Between April 13 and 18, 2026, Anthropic shipped three product announcements in rapid succession — each landing on a different layer of the developer stack:
| Date | Layer | Announcement |
|---|---|---|
| April 14 | Automation infrastructure | Claude Code Routines + desktop redesign |
| April 14 | Governance | Vas Narasimhan joins Long-Term Benefit Trust Board |
| April 16 | Foundation model | Claude Opus 4.7 |
| April 17 | End-user product | Claude Design |
And at the tail of the previous week, on April 7: Project Glasswing and the reveal of Claude Mythos Preview, the closed frontier model. That announcement explains why the public-facing releases look the way they do.
A model, an automation platform, and a consumer-facing product — all shipped within 96 hours. This isn’t a release cadence. This is a strategic pivot.
⚙️ Claude Code Routines — The Quietest Big Deal
Shipped April 14 in research preview. This is the release most commentary is missing, because it doesn’t have Figma losing 7% attached to it. But if you build agents for production, Routines are more important than Opus 4.7.
A routine is a saved Claude Code configuration: a prompt, one or more repositories, a set of connectors, and one or more triggers. Three trigger types:
- Scheduled — cron-style (hourly, daily, weekdays, weekly), in your local timezone
- API — HTTP POST to a trigger endpoint, under the beta header `experimental-cc-routine-2026-04-01`
- GitHub events — PRs, releases, pushes
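As a sketch, an API trigger call might look like the following. The endpoint path and payload fields are assumptions for illustration; only the beta header value comes from the announcement. The request is built but not sent:

```python
import json
import urllib.request

# Hypothetical endpoint path -- the real trigger URL comes from your
# routine's settings page, not from this sketch.
ROUTINE_URL = "https://api.anthropic.com/v1/routines/{routine_id}/trigger"

def build_trigger_request(api_key: str, routine_id: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for an API-triggered routine run."""
    return urllib.request.Request(
        ROUTINE_URL.format(routine_id=routine_id),
        data=json.dumps(payload).encode(),
        headers={
            "x-api-key": api_key,
            # The one confirmed detail: the beta header from the announcement.
            "anthropic-beta": "experimental-cc-routine-2026-04-01",
            "content-type": "application/json",
        },
        method="POST",
    )

req = build_trigger_request("sk-ant-...", "nightly-review", {"reason": "manual test"})
print(req.get_method(), req.get_header("Anthropic-beta"))
```

Sending it is one `urllib.request.urlopen(req)` away once you have a real routine ID and key.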
Each run spins up a fresh Claude Code cloud session on Anthropic’s infrastructure: clean clone of the repo from default branch, access to all connected MCP servers (Slack, Linear, Google Drive, whatever you’ve attached), skills committed to the repo, environment variables, and a setup script that runs before the session starts.
Daily quotas: Pro — 5 runs, Max — 15, Team/Enterprise — 25.
The Safety Default That Matters
By default, Claude can only push to branches prefixed with `claude/`. You have to explicitly enable unrestricted branch pushes per repository. This is the detail that makes Routines production-viable. Without it, you’d need human review before every merge, which defeats the point of automation.
What This Actually Replaces
GitHub Actions + cron + whatever glue scripts you wrote at 2 AM. The mental model shifts from “I need to build a pipeline that calls Claude” to “I need to define what Claude should do, and when.”
For my work on Botyard and Hanse Agency, the immediate use cases are obvious: nightly code review on all active repos, auto-response to Sentry alerts with a PR draft, weekly dependency updates with test runs. Previously each of these required a dedicated CI setup. Now it’s one routine.
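To make the mental model concrete, here is what one of those routines might look like as a configuration object. Every field name here is a hypothetical sketch, not the actual Routines schema, which lives in the Claude Code UI:

```python
# Hypothetical routine definition for the nightly code review use case.
# Field names are illustrative assumptions; the real configuration is
# created through the Claude Code interface, not this dict.
nightly_review = {
    "name": "nightly-code-review",
    "prompt": (
        "Review all commits merged today. Open a claude/ branch with "
        "suggested fixes and post a summary to the team channel."
    ),
    "repositories": ["botyard/crm"],
    "connectors": ["slack", "linear"],          # attached MCP servers
    "trigger": {
        "type": "scheduled",                     # scheduled | api | github
        "cron": "0 2 * * 1-5",                   # weekdays at 02:00
        "timezone": "Europe/Berlin",             # runs in local timezone
    },
    "allow_unrestricted_push": False,            # keep the claude/ safety default
}

print(nightly_review["trigger"]["cron"])
```

The point of writing it down like this: one declarative object replaces a CI pipeline, a cron entry, and the glue script between them.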
Desktop Redesign Alongside
Same day, Anthropic shipped a complete redesign of the Claude Code desktop app:
- Sidebar with parallel sessions grouped by project
- Drag-and-drop workspace layout
- Integrated terminal and file editor
- Side chat shortcut (`Cmd+;`) to branch questions without polluting the main thread
- Auto-archive of sessions when PR merges or closes
The app is no longer a chat interface with tools bolted on. It’s a parallelism-oriented workspace where you run multiple Claude sessions simultaneously on different parts of the same problem.
🧠 Claude Opus 4.7 — The Benchmark Jump
Model ID: `claude-opus-4-7`. Available in claude.ai (Pro/Max/Team/Enterprise), the Anthropic API, Bedrock, Vertex AI, Microsoft Foundry, and GitHub Copilot.
Pricing unchanged from Opus 4.6: $5 / $25 per million input/output tokens. But the new tokenizer counts 1.0–1.35x as many tokens for the same text, so your actual bill goes up, especially on multilingual content.
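A quick back-of-envelope check of what that multiplier does to a bill, using the announced prices and the 1.0–1.35x range above. The traffic volumes are made-up illustration numbers:

```python
# Unchanged per-token prices (USD per million tokens).
PRICE_IN, PRICE_OUT = 5.00, 25.00

def monthly_cost(tokens_in: int, tokens_out: int, multiplier: float = 1.0) -> float:
    """Estimated bill after applying the tokenizer inflation multiplier."""
    return (tokens_in * multiplier * PRICE_IN
            + tokens_out * multiplier * PRICE_OUT) / 1_000_000

# Hypothetical workload: 50M input / 10M output tokens per month.
baseline = monthly_cost(50_000_000, 10_000_000)        # old tokenizer count
worst    = monthly_cost(50_000_000, 10_000_000, 1.35)  # multilingual-heavy

print(f"${baseline:.0f} -> up to ${worst:.0f}")
```

Same prices, up to 35% more billed tokens: the line item moves even though the rate card doesn't.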
The Numbers
| Benchmark | Opus 4.6 | Opus 4.7 | Δ |
|---|---|---|---|
| SWE-bench Verified | 80.8% | 87.6% | +6.8 |
| SWE-bench Pro | 53.4% | 64.3% | +10.9 |
| CursorBench | 58% | 70% | +12 |
| Visual acuity | 54.5% | 98.5% | +44 |
A 10.9-point jump on SWE-bench Pro in a single release is not iteration — it’s a regime change.
What Shifted in Practice
Coding. On Anthropic’s internal 93-task benchmark, Opus 4.7 solves 13% more tasks, including four that neither Opus 4.6 nor Sonnet 4.6 could crack. Hex specifically notes that the model reports missing data honestly instead of confabulating — critical for production agents.
Vision. Image resolution jumped from 1,568 px to 2,576 px on the long edge (~3.75 MP, 3x more). For technical diagrams, UI screenshots, chemical structures, CAD drawings — this changes what’s possible in one prompt.
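If you pre-process screenshots before sending, the new limit changes your resize target. A minimal sketch of the proportional-scaling math, assuming the 2,576 px long-edge figure above (no image library needed for the calculation itself):

```python
MAX_LONG_EDGE = 2576  # Opus 4.7 long-edge limit (was 1568 on 4.6)

def target_size(width: int, height: int, limit: int = MAX_LONG_EDGE) -> tuple[int, int]:
    """Return the (width, height) to downscale to so the long edge fits the limit."""
    long_edge = max(width, height)
    if long_edge <= limit:
        return width, height  # already within limit, send as-is
    scale = limit / long_edge
    return round(width * scale), round(height * scale)

# A 4000x3000 screenshot now only needs a mild downscale.
print(target_size(4000, 3000))
```

Under the old 1,568 px limit the same screenshot lost well over half its pixels; small UI text that used to blur out now survives the resize.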
Agentic tool-use. Gains on MCP-Atlas and Terminal-Bench — the category that determines whether agents can run unsupervised. Warp confirmed Opus 4.7 fixed concurrency bugs that Opus 4.6 couldn’t.
New Features
- `xhigh` effort level — above `high`, for the hardest tasks
- Task budgets (beta) — hard token ceiling per task
- `/ultrareview` in Claude Code — simulates a senior reviewer, catching design flaws and logic gaps
- Auto mode for Max plan — autonomous decisions without per-step permission
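The first two knobs would plausibly combine in a request body like the sketch below. Both parameter names ("effort", "task_budget_tokens") are my assumptions for illustration; check the API docs for the actual beta field names before relying on them:

```python
# Hypothetical request body combining the xhigh effort level with a
# beta task budget. Field names are assumed, not confirmed API fields.
request_body = {
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "effort": "xhigh",              # assumed name: new level above "high"
    "task_budget_tokens": 200_000,  # assumed name: hard ceiling per task
    "messages": [
        {"role": "user", "content": "Refactor the billing module."},
    ],
}

print(request_body["model"], request_body["effort"])
```

The pairing matters: `xhigh` invites the model to burn tokens on hard problems, and the budget caps how far that burn can run.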
Where Opus 4.7 Does Not Lead
Honest assessment: not a clean sweep. GPT-5.4 still leads on agentic search (89.3% vs 79.3%) and raw terminal coding. Multilingual Q&A is not Anthropic’s strength. If you’re building a search agent, Opus 4.7 isn’t a mandatory upgrade.
🎨 Claude Design — The Product Move
April 17, one day after Opus 4.7. A research preview that generates slides, prototypes, marketing one-pagers, and UIs from text prompts. Available to Pro/Max/Team/Enterprise via the palette icon in claude.ai. For Enterprise, off by default — admins enable manually.
Figma stock dropped ~7% the same day.
What Makes It Different from Figma AI and Lovable
Onboarding works like this: Claude reads your codebase and design files, then builds a design system from them — colors, typography, components. Every subsequent project applies that system automatically. Teams can maintain multiple systems in parallel.
Inputs: text prompts, DOCX/PPTX/XLSX uploads, codebase connection, web capture (grab elements from your live site so prototypes match the real product).
Iteration happens through conversation, inline comments, direct edits, or custom sliders. Brilliant and Datadog, as early access partners, describe compressing a week-long cycle (brief → mockup → review) into a single session.
Final step: the design plus design intent gets handed to Claude Code for implementation. This is the kill shot — not Figma-to-Figma, but prompt-to-production through one engine.
🦋 Project Glasswing — Why Two Models
On April 7, Anthropic announced Claude Mythos Preview — a closed model that “has reached a level of coding capability where it can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.”
Numbers cited:
- 93.9% on SWE-bench Verified
- 94.6% on GPQA Diamond
- 83% on CyberGym
- Thousands of high-severity vulnerabilities across all major OS and browsers
- A 27-year-old bug in OpenBSD, a 16-year-old bug in FFmpeg
Mythos is not being released publicly. Access is via consortium: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks. $100M usage credits for 40+ critical-infrastructure organizations, $4M donations to open-source security.
The Register rightly noted that independent CVE counts are missing — VulnCheck found at most 40, possibly zero publicly verified. The marketing runs ahead of the verification.
But the number isn’t the point. Anthropic explicitly split models into a public workhorse (Opus 4.7) and a closed frontier (Mythos). This is a new industry pattern — previously the top model always shipped to API.
🎯 What This Means for Builders
If you build agents on Claude, here’s the checklist for the next two weeks:
1. Pilot Routines on one real workflow. Nightly code review, Sentry-to-PR automation, dependency updates. Pick the boring, recurring one. The `claude/*` branch default makes it safe to deploy.
2. Re-test production tasks on Opus 4.7. Low effort on 4.7 ≈ medium effort on 4.6. You can cut costs without losing quality — but measure on your own tasks, not benchmarks.
3. Recalculate token budgets. The new tokenizer eats more, especially on multilingual text (German and Russian together — hello fellow DACH builders). Batch workflows need verification.
4. Try `/ultrareview` on CLAUDE.md projects. For repos with detailed style guides, this is an immediate upgrade. I’m running it against Botyard CRM code now.
5. Use vision resolution deliberately. 3.75 MP opens doors for UI screenshots, CAD drawings, technical diagrams. Tasks that previously required OCR pipelines are now single prompts.
6. Try Claude Design on one real project. Not to replace Figma, but to understand where it’s faster. For landing pages, client presentations, feature prototypes — likely yes. For production design systems — not yet.
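Step 2's "measure on your own tasks" can be as simple as a pass-rate harness run against both model IDs. A sketch with a stubbed task runner — `fake_run` stands in for your real agent invocation and exists only so the example executes:

```python
from typing import Callable

def pass_rate(model_id: str, tasks: list[str],
              run_task: Callable[[str, str], bool]) -> float:
    """Fraction of tasks the given model passes under your own harness."""
    passed = sum(run_task(model_id, task) for task in tasks)
    return passed / len(tasks)

# Stubbed runner so the sketch is self-contained; replace with a real
# call into your agent pipeline.
def fake_run(model_id: str, task: str) -> bool:
    return model_id == "claude-opus-4-7" or task != "hard-case"

tasks = ["easy-case", "hard-case", "edge-case"]
for model in ("claude-opus-4-6", "claude-opus-4-7"):
    print(model, f"{pass_rate(model, tasks, fake_run):.0%}")
```

The benchmark tables above tell you where to look; a harness like this tells you whether the effort-level downgrade actually holds on your workload.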
💬 The Bigger Picture
Three signals, one pattern:
Opus 4.7 — targeted iteration, not revolution. Coding, vision, tool-use, instruction discipline. Mandatory config update for agent builders.
Routines — the infrastructure layer. Anthropic is eating the automation space that used to belong to GitHub Actions and dedicated CI tools.
Claude Design — the product-for-end-users layer. First shot across Figma’s bow.
Combined: Anthropic is assembling a full stack — model → automation → end-user product. Like OpenAI with ChatGPT, but with enterprise segmentation and a clearer technical identity.
The backdrop: $30B annualized revenue (early April, per Bloomberg), VC offers valuing the company at $800B, and conversations with Goldman Sachs, JPMorgan, and Morgan Stanley about a potential October IPO.
For those of us building on top of this stack, the implication is concrete: the “I’ll build you a prototype” service layer is dying faster than the “I’ll integrate this into your system” layer. Systems thinking is the defensible skill. Prompt engineering is a temporary one.
The next release is probably Sonnet 4.7 — and that’s the one most teams will actually run in production. For now: update `claude-opus-4-6` to `claude-opus-4-7` in configs, pilot one Routine, try Claude Design on the next landing page, and measure.
Yevgen “Esso” Somochkin, Hamburg · esso.dev · hanse.agency