MCP

3 articles

Editorial pen-and-watercolour schematic of a Claude Managed Agents system: a lead agent carrying the Claude wordmark delegating to specialist sub-agents on a shared filesystem, with a separate grader checking output against a rubric and a dreaming memory-curation loop
Claude Launch Analysis · Claude · Anthropic

How Claude Managed Agents Actually Work: Dreaming, Outcomes, Multiagent Orchestration, and Webhooks (2026)

Anthropic gave Claude Managed Agents four new mechanics at Code w/ Claude: Dreaming, Outcomes, Multiagent Orchestration, and Webhooks. The one that changes how you build is Outcomes — a separate grader that loops the agent until a rubric is met. Here is how each one works, and when to reach for it.

Marco Lobo·25 May 2026·9 min read
Handdrawn editorial diagram of the Generator-Evaluator harness pattern — a three-agent triangle with a Planner agent expanding a 1-4 sentence prompt into a product spec, a Generator agent building feature-by-feature using a React + Vite + FastAPI + SQLite stack, and an Evaluator agent using Playwright MCP to navigate the live app and grade against design quality, originality, craft, and functionality criteria; file-based handoff arrows between the three agents; by Anthropic Labs wordmark top-right, Claude Agent SDK badge bottom-right
AI Engineering · Claude Agent SDK · Anthropic

Harness Design for Long-Running AI Applications: Inside Anthropic's Generator-Evaluator Pattern (Claude Agent SDK, 2026)

On 24 March 2026, Anthropic Labs engineer Prithvi Rajasekaran published the most rigorous public account to date of how Anthropic designs harnesses for long-running AI applications: a GAN-inspired generator-evaluator pattern applied across two unusually different domains, frontend design (subjective, no binary verification) and full-stack coding (objective, machine-verifiable). The piece evolves the November 2025 Initializer + Coding Agent baseline into a three-agent planner + generator + evaluator architecture, with concrete cost-and-duration data ($200 / 6h on a retro game maker test, then $124 / 4h on a more ambitious DAW after the Opus 4.6 simplification pass). This article covers the pattern itself, the two failure modes it fixes (context anxiety and self-evaluation bias), how it compares to LangGraph, AutoGen, OpenAI Assistants v2, and Devin, when it doesn't fit, and the canonical principle every team operating a harness should adopt: stress-test every component against the current model.

Marco Lobo·22 May 2026·13 min read
Handdrawn enterprise AI agent fleet diagram with role agents connected to departments, systems, memory, and governance layers
AI Guides · AI Agents · Claude

Building AI Agents in the Enterprise: Implementation Patterns for 2026

Anthropic's playbook is right about the enterprise shape. The missing layer is implementation: governed skills, MCP tools, memory, observability, worktree-safe orchestration, and agent fleets that survive contact with a 1,000-person company.

Marco Lobo·19 May 2026·11 min read