Is it safe to let AI issue refunds?

Only with an identity gate and a monetary ceiling — this plugin has both, enforced at the tool level, not as suggestions the model might skip. The Identity Verifier confirms the requester against your CRM and identity provider before any account action runs. An Authorization Policy Verifier then blocks anything exceeding the verified identity's authority or a per-transaction ceiling you configure (the default routes to human review above $200 or for repeat refunds within 30 days). All actions execute from a structured action plan, never from raw customer text — the firebreak that stops prompt-injection attacks like the Chevrolet dealer incident (AI Incident Database #622, Dec 2023).
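As a rough illustration of the ceiling and repeat-refund rule described above, the check could look like the sketch below. The names `RefundRequest` and `decide_refund` are hypothetical and are not taken from the plugin; only the $200 ceiling and the 30-day window come from the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Defaults quoted in the answer above; everything else here is illustrative.
REFUND_CEILING_USD = 200.00
REPEAT_WINDOW = timedelta(days=30)


@dataclass(frozen=True)
class RefundRequest:
    amount_usd: float
    identity_verified: bool
    requested_on: date
    previous_refunds: tuple[date, ...] = ()


def decide_refund(req: RefundRequest) -> str:
    """Return 'deny', 'human_review', or 'approve'. Fails closed on identity."""
    if not req.identity_verified:
        return "deny"
    if req.amount_usd > REFUND_CEILING_USD:
        return "human_review"
    for previous in req.previous_refunds:
        # A refund issued in the last 30 days makes this a repeat refund.
        if timedelta(0) <= req.requested_on - previous <= REPEAT_WINDOW:
            return "human_review"
    return "approve"
```

The identity check runs first so that an unverified request never reaches the monetary rules at all.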

Does it work with Intercom, Zendesk, or HubSpot?

Intercom and HubSpot are covered by included connectors; Zendesk is not, so it requires an MCP connector you configure. The included connectors cover Intercom (help desk), HubSpot (CRM), Guru and Notion (knowledge base), Slack (chat), Microsoft 365 (email and cloud storage), and Atlassian (project tracker). The plugin is tool-agnostic by connector category: no category requires a specific vendor, and if your platform is not covered by an included connector, you can bring your own for any slot.

How is this different from Anthropic's official customer-support plugin?

Anthropic's plugin is an excellent starting point: five well-crafted skills for triage, research, response drafting, escalation, and knowledge-base writing, all of which this plugin keeps as its foundation. The original is designed as a co-pilot: it ends each skill with a permission question and touches your systems read-only, so a human still drives every ticket. This plugin adds the autonomous operating layer on top: an Intake Watcher that picks up tickets independently, write-back connectors that act on real systems, a durable Case Record that carries context across every turn, and adversarial verifier gates that block wrong answers and unauthorised actions before they reach a customer. More detail is in the 'Built on Anthropic's Official Customer-Support Plugin' section.

Can it run unattended?

Yes, with a scheduler you configure. The plugin supplies the autonomous multi-agent architecture and the safety gates. To run lanes unattended, pair it with a runtime that wakes the autonomous-support-loop skill on your ticket queue — a cron job, an OpenClaw agent, or a Cowork background session. The plugin does not ship live credentials and does not run itself. The recommended approach is to start with one lane in draft-only mode, confirm the groundedness and escalation metrics are in the green, then switch on autonomous sending lane by lane.

What happens when it is not sure?

The system has two hard fail-closed gates and no soft 'probably fine' path. If a customer-facing factual or policy claim cannot be tied to a knowledge-base source reviewed within 90 days, the send is blocked and the case routes to a human with the exact unverifiable claim flagged. If identity cannot be verified before an account action, the action is denied and a human receives the request with the identity evidence and a recommended next step. A tone gate also blocks hostile or off-brand output before it ships.
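A minimal sketch of the grounding rule described above, assuming each claim carries a reference to the knowledge-base article that backs it and that article's last review date. The `Claim` shape and the function names are hypothetical; only the 90-day window and the block-and-route behaviour come from the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

MAX_SOURCE_AGE = timedelta(days=90)


@dataclass(frozen=True)
class Claim:
    text: str
    source_id: Optional[str] = None            # knowledge-base article backing the claim
    source_reviewed_on: Optional[date] = None  # when that article was last reviewed


def unverifiable_claims(claims: list[Claim], today: date) -> list[Claim]:
    """Return claims with no source, or a source not reviewed in the last 90 days."""
    flagged = []
    for claim in claims:
        fresh = (
            claim.source_id is not None
            and claim.source_reviewed_on is not None
            # A review date in the future is treated as invalid, not fresh.
            and timedelta(0) <= today - claim.source_reviewed_on <= MAX_SOURCE_AGE
        )
        if not fresh:
            flagged.append(claim)
    return flagged


def gate_send(claims: list[Claim], today: date) -> dict:
    """Fail closed: any flagged claim blocks the send and routes the case to a human."""
    flagged = unverifiable_claims(claims, today)
    if flagged:
        return {"send": False, "route_to": "human", "flagged": [c.text for c in flagged]}
    return {"send": True, "route_to": None, "flagged": []}
```

Returning the flagged sentences alongside the decision mirrors the behaviour described above, where the human receives the exact unverifiable claim.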

How much does it cost?

The plugin is free and open source. You need an active Claude Code or Cowork subscription and the connectors to your existing systems — Intercom, HubSpot, and so on at their own pricing. There is no additional charge for the plugin itself. AI Heroes offers paid implementation, connector configuration, and operator dashboard design for teams who want a managed setup. Book a call to discuss.

Free & Open Source

Your Support Queue, Running Itself — Safely

Name: Autonomous Customer Support Plugin for Claude
Author: AI Heroes

Most AI support tools draft a reply and then wait. This one picks up the ticket, finds a sourced answer, checks its own work, and either sends or escalates — while a human watches a dashboard instead of approving every response.

Built directly on Anthropic's official customer-support plugin. Pairs with a scheduler you configure for unattended operation. Does not ship live credentials and does not run itself — you switch on each lane deliberately, once you've watched it run in draft mode.

Why Most AI Support Breaks — and What This Does Differently

The documented AI support failures cited on this page follow the same pattern: the model answered confidently, nothing checked it, and no gate stopped it before it reached a customer. An airline was held legally liable for a refund policy its chatbot invented. A developer-tools company's bot fabricated a login restriction that never existed, and customers cancelled before anyone could intervene. A car dealer's bot was tricked into agreeing to sell a $76,000 SUV for $1. In each case the deeper failure was not the model but the absence of a gate. This plugin is that gate, made real and wired into your systems.

How AI support breaks in production

✗ The bot states a policy, rate, or certification it invented, and sends it. The customer acts on it before anyone catches it.
✗ The ticket gets escalated. The customer is asked to re-explain everything from scratch because nothing survived the handoff.
✗ A refund or account change goes through on a request nobody verified was legitimate.
✗ The ticket is closed and the dashboard looks green. Two days later the same problem is back.

How this plugin is engineered differently

✓ Every customer-facing factual or policy claim must trace to a knowledge-base source reviewed within 90 days, or the send is blocked, not softened.
✓ A single Case Record travels with every ticket across every agent and every handoff. The customer never re-explains. The human escalation packet is always complete.
✓ No refund, plan change, or credential reset runs without the Identity Verifier confirming who's asking, against your CRM and identity provider, fail-closed.
✓ A Drift Monitor tracks whether 'resolved' actually held: reopen rate, CSAT delta, and knowledge-base staleness all alert before a quality regression becomes a pattern.

One Ticket, From Arrival to Resolution — No Human Required

Here is what actually happens when a support ticket arrives. Follow it step by step through the agents. The gates are not suggestions — they fail closed.

1. Ticket arrives
A customer emails about a billing error, or a message lands in your help desk. No human sees it yet. The queue is being watched.
2. Intake Watcher
The Intake Watcher detects the new ticket and opens a Case Record: a structured, durable object that will travel with this ticket through every agent and every handoff. The customer's account history, the original message, and all subsequent context accumulate here, so no one ever has to ask the customer to repeat themselves.
3. Triage Agent
The Triage Agent classifies the ticket: category (billing, account access, bug report, policy question), and priority P1 through P4. It checks your knowledge base and project tracker for known matching issues, then writes the classification back to your help desk. If the ticket scores P1, or trips a legal or security keyword, it pages on-call immediately and stops: autonomous lanes never run on P1.
4. Router
The Router reads the classification from the Case Record and picks the right lane: answer lane for resolvable questions, action lane if an account change is needed, escalation lane if a bug needs engineering. Lanes run in parallel if both an answer and an action are needed.
5. Researcher + Composer
In the answer lane, the Researcher queries your knowledge base, the ticket history, and CRM notes, attaching a citation to every factual or policy claim it finds. The Composer then drafts the customer-facing reply, drawing only on cited material.
6. Grounding Verifier (gate: can block)
The Grounding Verifier independently checks every factual and policy statement in the draft. One question: is this claim tied to a knowledge-base source reviewed within the last 90 days? If yes, the draft passes. If not, the claim is stripped or the entire send is blocked and the case routes to a human with the exact sentence flagged. There is no 'probably fine' path.
7. Identity Verifier + Authorization Policy Verifier (gates: can block)
If an account action is needed (a refund, a plan change, a credential reset), the Identity Verifier runs first. It checks the requester against your CRM and identity provider. If identity cannot be confirmed, the action is denied immediately. If identity passes, the Authorization Policy Verifier checks whether the action is within the verified identity's authority and below the monetary ceiling. Both must pass before a single write happens. Actions are built from a structured plan, never from the raw customer message. This is the firebreak that stops prompt injection.
8. Sender / Action Agent
Gates passed: the Sender posts the reply to your help desk or sends via email. If an account action is approved, the Action Agent executes it (issues the refund, adjusts the plan) and logs the action to your CRM. The Case Record is updated with what was sent and what was done.
9. KB Author + Drift Monitor
Once the ticket is resolved, the KB Author checks for a duplicate article and creates or updates the relevant entry in your knowledge base automatically. The Drift Monitor then watches this cluster of issues over time: if the reopen rate climbs or satisfaction drops, it files a tracker issue before you have a Klarna-style quality regression on your hands.
10. Operator dashboard
A human only enters the picture if a threshold trips: an unverifiable claim blocked, an identity check failed, a P1 detected, or a metric out of range. Everything else is a dashboard signal, not a task. Every lane has a kill-switch: flip it and that lane returns to draft-only instantly, without downtime.
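The triage stop rule and the Router's lane selection from the steps above can be sketched as follows. This is an illustration under assumptions: the `Triage` fields, the keyword list, and the `human_review` fallback for tickets that match no lane are all hypothetical and not part of the plugin.

```python
from dataclasses import dataclass

# Hypothetical legal/security keywords; a real deployment would maintain its own list.
STOP_KEYWORDS = ("lawsuit", "subpoena", "data breach")


@dataclass(frozen=True)
class Triage:
    priority: int               # 1 (P1) through 4 (P4)
    needs_answer: bool
    needs_account_action: bool
    needs_engineering: bool


def route(message: str, triage: Triage) -> list[str]:
    """Pick lanes for a triaged ticket. P1 or legal/security tickets page on-call and stop."""
    text = message.lower()
    if triage.priority == 1 or any(keyword in text for keyword in STOP_KEYWORDS):
        return ["page_on_call"]
    lanes = []
    if triage.needs_answer:
        lanes.append("answer")
    if triage.needs_account_action:
        lanes.append("action")
    if triage.needs_engineering:
        lanes.append("escalation")
    # Assumed fallback: a ticket that fits no lane goes to a human.
    return lanes or ["human_review"]
```

Returning a list rather than a single lane reflects the point that the answer and action lanes can run in parallel on the same ticket.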

Six Things the System Does That a Chatbot Cannot

Nineteen specialist agents working as a team. Several have the power only to block each other — the adversarial verifiers exist for one purpose: stopping a wrong answer, an unverified action, or an incomplete handoff before it reaches a customer or your systems.

📥

Watches the queue and opens every ticket

An Intake Watcher subscribes to your help desk and support inbox. When a ticket arrives, it opens a durable Case Record, classifies priority (P1–P4) and category, writes the result back to your platform, and routes it — before any human reads it. P1 and legal or security tickets page on-call immediately.
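To make the idea of a durable Case Record concrete, here is one possible shape for it. The field names and the `handoff_summary` helper are assumptions for illustration; the plugin's own JSON schemas may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CaseEvent:
    agent: str
    note: str
    at: datetime


@dataclass
class CaseRecord:
    ticket_id: str
    customer_id: str
    original_message: str
    category: Optional[str] = None
    priority: Optional[int] = None
    events: list[CaseEvent] = field(default_factory=list)

    def log(self, agent: str, note: str) -> None:
        """Append to the history so later agents and humans see every prior step."""
        self.events.append(CaseEvent(agent, note, datetime.now(timezone.utc)))

    def handoff_summary(self) -> str:
        """Render the record as plain text for a human escalation packet."""
        lines = [
            f"Ticket {self.ticket_id} (P{self.priority}, {self.category})",
            f"Customer said: {self.original_message}",
        ]
        lines += [f"- {event.agent}: {event.note}" for event in self.events]
        return "\n".join(lines)
```

Because every agent writes to the same record, a handoff can be generated from it without asking the customer anything again.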

🔎

Finds a sourced answer and blocks anything it cannot prove

A Researcher retrieves a cited answer from your knowledge base. A Composer drafts the reply. Then an adversarial Grounding Verifier independently checks every factual and policy claim: if it cannot trace to a source reviewed within 90 days, the send is blocked — not softened, not reworded. The Sender only acts on a clean pass.

🛡️

Verifies identity and enforces limits before any account action

Before any refund, plan change, or credential reset, an Identity Verifier confirms who is asking against your CRM and identity provider. Actions run only from a structured action plan — never from raw customer text, which is the firebreak against prompt-injection. An Authorization Policy Verifier blocks anything outside the verified identity's authority or a configured monetary ceiling.
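One way to picture the structured action plan is as a small allow-listed type that upstream output must be validated into before anything runs. The sketch below is hypothetical: `ActionType`, `ActionPlan`, and `build_plan` are illustrative names, and the three action kinds are simply the ones named in the text.

```python
from dataclasses import dataclass
from enum import Enum


class ActionType(Enum):
    REFUND = "refund"
    PLAN_CHANGE = "plan_change"
    CREDENTIAL_RESET = "credential_reset"


@dataclass(frozen=True)
class ActionPlan:
    action: ActionType
    account_id: str
    amount_usd: float = 0.0


def build_plan(proposed: dict, account_id: str) -> ActionPlan:
    """Validate fields proposed by an upstream agent into an ActionPlan.

    Unknown action names or unexpected keys raise ValueError, so free text
    injected by a customer cannot turn into an executable action.
    """
    allowed_keys = {"action", "amount_usd"}
    extra = set(proposed) - allowed_keys
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    action = ActionType(proposed.get("action"))  # ValueError if not allow-listed
    amount = float(proposed.get("amount_usd", 0.0))
    if amount < 0:
        raise ValueError("amount must be non-negative")
    return ActionPlan(action, account_id, amount)
```

The account ID is passed in separately from the verified identity rather than read from the proposal, so the message content cannot redirect an action to a different account.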

🔁

Escalates with a complete packet and follows up without prompting

An Escalation Packager builds a structured brief from the Case Record. An Escalation Coordinator blocks any packet missing reproduction steps or impact evidence — incomplete handoffs cannot proceed. Once filed, a Follow-up Agent watches the tracker issue and updates the customer when engineering responds, from the same Case Record. The customer never re-explains.
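The completeness check performed before an escalation proceeds might look like this minimal sketch. The field names `reproduction_steps` and `impact_evidence` are assumed keys for the two requirements named above.

```python
# Assumed packet keys for the two requirements named in the text.
REQUIRED_FIELDS = ("reproduction_steps", "impact_evidence")


def missing_fields(packet: dict) -> list[str]:
    """Return required fields that are absent or empty; an empty list means the packet may proceed."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = packet.get(name)
        if isinstance(value, str):
            value = value.strip()  # whitespace-only text counts as empty
        if not value:
            missing.append(name)
    return missing
```

Returning the list of missing fields, rather than a bare pass or fail, lets the packager know exactly what to fill in before retrying.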

📚

Updates the knowledge base after every resolved ticket

After every resolved ticket, a KB Author runs a dedup check and creates or updates the relevant article in your knowledge base automatically. A Drift Monitor then tracks whether those resolutions hold — filing a tracker issue if a cluster's reopen rate or CSAT starts to slip.
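A drift check of this kind reduces to comparing a few per-cluster numbers against thresholds. In the sketch below the threshold values (10% reopen rate, a CSAT delta of -0.2) are placeholders chosen for illustration, not defaults from the plugin.

```python
def drift_alerts(
    resolved: int,
    reopened: int,
    csat_delta: float,
    max_reopen_rate: float = 0.10,   # placeholder threshold
    min_csat_delta: float = -0.2,    # placeholder threshold
) -> list[str]:
    """Return the names of the signals that are out of range for one issue cluster."""
    alerts = []
    reopen_rate = reopened / resolved if resolved > 0 else 0.0
    if reopen_rate > max_reopen_rate:
        alerts.append("reopen_rate")
    if csat_delta < min_csat_delta:
        alerts.append("csat_delta")
    return alerts
```

A non-empty result is the point at which, in the description above, the Drift Monitor would file a tracker issue.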

📊

Operator dashboard with per-lane kill-switches

Nine real-time signals page a human only when a threshold trips: P1 recall, mis-route rate, answer groundedness, hallucination escapes, account actions without verified identity, reversed actions, handoff completeness, reopen rate, and CSAT delta vs. human-handled. Every lane has a kill-switch: flip it and that lane returns to draft-only — no downtime, no cascading failure.
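The per-lane kill-switch can be thought of as a small piece of state that the Sender consults before every autonomous send. The class below is a hypothetical sketch of that idea, not the plugin's implementation.

```python
from enum import Enum


class LaneMode(Enum):
    DRAFT_ONLY = "draft_only"
    AUTONOMOUS = "autonomous"


class LaneControl:
    """Tracks each lane's mode. Lanes start in draft-only and unknown lanes never send."""

    def __init__(self, lanes: list[str]) -> None:
        self._modes = {lane: LaneMode.DRAFT_ONLY for lane in lanes}

    def enable(self, lane: str) -> None:
        if lane not in self._modes:
            raise KeyError(f"unknown lane: {lane}")
        self._modes[lane] = LaneMode.AUTONOMOUS

    def kill(self, lane: str) -> None:
        """Kill-switch: drop one lane back to draft-only without touching the others."""
        if lane in self._modes:
            self._modes[lane] = LaneMode.DRAFT_ONLY

    def may_send(self, lane: str) -> bool:
        return self._modes.get(lane, LaneMode.DRAFT_ONLY) is LaneMode.AUTONOMOUS
```

Keeping the switch per lane is what allows one lane to be pulled back while the rest keep running.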

Built on Anthropic's Official Customer-Support Plugin

This plugin starts where Anthropic's official customer-support plugin leaves off — it does not replace it. Anthropic ships five well-crafted skills for support work. They are the foundation. Every skill, framework, and tone guideline is kept intact. What this plugin adds is the autonomous, multi-agent operating layer that the original deliberately leaves to the builder: the queue watcher, the write-back connectors, the shared case memory, and the adversarial gates that check each other's output. Here is exactly what Anthropic's plugin does in each area, and what the autonomous layer turns it into.

Ticket Triage

Out of the box

You run /triage on a ticket. The skill classifies category, priority (P1–P4), and sentiment, checks for known issues, and ends with: 'Want me to draft a full response? Want me to escalate this?'

With the autonomous agents

An Intake Watcher picks up every new ticket automatically. It opens a Case Record, runs the same triage classification, writes priority and category back to your help desk, and routes the ticket — before any human reads it. You are no longer in the loop for the triage step.

Customer Research

Out of the box

You run /research on a ticket. The skill searches your knowledge base, CRM, and internal channels, synthesises what it finds, and ends with: 'Want me to draft a response based on this? Want me to create a FAQ entry?'

With the autonomous agents

The Researcher runs as part of the autonomous answer lane, attached to every ticket automatically. Its findings are written into the Case Record so every subsequent agent — including any human escalation — has the full picture without re-running the search.

Response Drafting

Out of the box

You run /draft-response. The skill drafts a reply with internal notes, then asks: 'Want me to adjust the tone? Want me to draft the escalation note as well?' A human reviews and sends.

With the autonomous agents

The Composer drafts; then an adversarial Grounding Verifier checks every factual and policy claim against a knowledge-base source reviewed within 90 days. Any claim without a fresh source is stripped or the send is blocked. Only on a clean pass does the Sender post the reply to your help desk — no human in the approval chain.

Escalation Packaging

Out of the box

You run /customer-escalation. The skill builds a structured brief and ends with: 'Want me to post this in a chat channel? Want me to set a follow-up reminder?'

With the autonomous agents

An Escalation Coordinator blocks any packet missing reproduction steps or impact evidence — incomplete handoffs cannot proceed. On pass, a Tracker Agent files the issue in your project tracker and posts to team chat. A Follow-up Agent then watches the issue and updates the customer when engineering responds, from the same Case Record.

Knowledge Base Writing

Out of the box

You run /kb-article after a resolution. The skill drafts an article and asks: 'Want me to check if a similar article already exists?'

With the autonomous agents

After every resolved ticket, the KB Author runs the dedup check automatically and creates or updates the article in your knowledge base. A Drift Monitor then tracks whether those resolutions hold over time — filing a tracker issue when a cluster's reopen rate or satisfaction score begins to slip.

Is This the Right Tool?

Best Fit

SaaS and subscription businesses handling high volumes of repetitive Tier-1 tickets — billing, plan changes, account access, policy questions
Support teams already using Intercom, HubSpot, Slack, or Microsoft 365 who want autonomous operation without replacing their stack
Teams who need an auditable, policy-safe layer for refunds and account actions — every action identity-verified, ceiling-bounded, and logged
Teams who deployed a support chatbot, found the quality or safety gap, and need the gate layer that was missing

Not the Right Fit

Teams with no knowledge base or CRM: the Grounding Verifier needs a source of truth; without one, most answers route to humans and the autonomous value is minimal
Businesses where customer interactions require clinical judgment, regulated professional advice, or deep human empathy, such as healthcare, legal, and financial advisory
Anyone who wants fully automated support with no human in the loop: this system is designed for a human-watches-metrics model, not a zero-human one

From Install to First Autonomous Reply in Four Steps

1. Install in Claude Code or Cowork with two commands, in under two minutes
2. Connect your existing stack: help desk, CRM, knowledge base, chat, and email
3. Run in draft-only mode first: watch the dashboard and confirm the gates are working
4. Open lanes one at a time as trust builds. The human handles the rare escalation; the agents handle everything else

Start With One Lane. Watch It Work.

Two commands to install. Every lane starts in draft-only mode — no autonomous action until you decide it's ready. Free, open source, and built on Anthropic's official plugin foundation.

Step-by-Step: Install the Plugin

Add the plugin source

In Claude Code or Cowork, run: claude plugins marketplace add mlobo2012/Autonomous-Customer-Support-Plugin — this registers the repository as a trusted source.

Install the plugin

Run: claude plugins install autonomous-customer-support — Claude downloads 19 specialist agents, 7 skills, 2 hard safety gates, and 2 JSON schemas.

Connect your systems

Open CONNECTORS.md and wire your help desk, CRM, knowledge base, chat, email, and project tracker. Included connectors cover Intercom, HubSpot, Guru, Notion, Slack, Microsoft 365, and Atlassian. Bring your own for any category.

Start in draft-only mode

Every lane starts as a co-pilot — it drafts and asks before acting. Pair with a scheduler to go unattended. Start with one lane, watch the nine dashboard signals, then open more as you build confidence.

Watch the dashboard — intervene only when a threshold trips

One engineer per shift owns the operator dashboard. Each lane has a kill-switch that drops it to draft-only without taking the system down. All autonomous actions are logged and auditable.

Frequently Asked Questions

What is an autonomous customer support agent?

An autonomous customer support agent is an AI system that picks up tickets without waiting for a human to trigger it, resolves what it can prove, and escalates the rest with a complete context packet. Unlike a chatbot that drafts a reply and waits for someone to click send, an autonomous agent takes real actions (posting replies, issuing refunds, filing bugs, updating the knowledge base) within pre-agreed safety gates. The human's job shifts from approving every reply to watching a metrics dashboard and handling the rare case the system cannot resolve confidently.

How is this different from a chatbot?

A chatbot presents a suggested response and waits for a human to approve and send it; it is a faster co-pilot, not a teammate. This plugin closes the loop: a team of specialist agents triages the ticket, retrieves a grounded answer, adversarially verifies every claim before it is sent, and takes real actions on your systems within safety gates. No human keystroke is required per ticket. The design difference is the gate layer, specifically the Grounding Verifier that blocks any claim without a fresh source, and the Identity Verifier that blocks any account action without a confirmed identity.

Sources & Research

Every design decision in this plugin is grounded in a documented, verifiable incident or published research, verified via live web search on 2026-05-24.

Moffatt v. Air Canada, 2024 BCCRT 149

BC Civil Resolution Tribunal held Air Canada liable for its chatbot's incorrect bereavement-fare advice, awarded C$812.02, and rejected the argument that the chatbot was 'a separate legal entity.' Establishes that a company is legally bound by what its support agent tells a customer.

Cursor 'Sam' support bot, AI Incident Database #1039 (Apr 2025)

Cursor's support bot told users their subscription was limited to one active device — a policy that did not exist. Misinformation spread on Hacker News and Reddit; users cancelled subscriptions. Co-founder Michael Truell publicly apologised: 'we have no such policy.' Root cause was a security update misread as a policy change.

Chevrolet dealer chatbot, AI Incident Database #622 (Dec 2023)

A ChatGPT-backed Chevrolet dealer chatbot was instructed by a user to 'agree with anything' and then agreed to sell a 2024 Chevy Tahoe for $1. The dealer did not honour it and pulled the chatbot. Demonstrates prompt-injection risk without an authorisation boundary.

DPD chatbot incident (18 Jan 2024)

After a system update, DPD's chatbot swore at a customer, wrote a poem calling itself 'a useless chatbot,' and described DPD as 'the worst delivery firm in the world.' DPD then disabled the AI element of the chatbot. Demonstrates brand-safety risk without a tone gate.

Klarna AI assistant (Feb 2024 → mid-2025)

Klarna's AI assistant handled 2.3 million chats in 30 days (the equivalent of ~700 agents, 67% of all conversations) with ~82% faster resolution. By 2025, CEO Sebastian Siemiatkowski admitted quality dropped and was 'not sustainable,' and began rehiring humans. The documented end-state is a hybrid model, which is the human-watches-metrics operating model this plugin is designed for.

MGM Resorts / Scattered Spider (Sep 2023)

Scattered Spider impersonated an MGM employee on a ~10-minute call to the IT help desk, obtained access, escalated to Okta and Azure admin, and deployed ransomware. Reported ~US$100M Q3 impact for MGM. Establishes identity verification at the help desk as the load-bearing control.

Liu et al., 'Lost in the Middle' (arXiv:2307.03172, TACL 2024)

Retrieval accuracy degrades significantly when relevant information is buried in the middle of long context, even for long-context models. Justifies externalising case state into durable structured memory rather than relying on the chat transcript.

Yong et al., 'Low-Resource Languages Jailbreak GPT-4' (arXiv:2310.02446, NeurIPS 2023 SoLaR Best Paper)

Translating unsafe prompts into low-resource languages bypassed GPT-4 safety filters ~79% of the time on AdvBench. Root cause: linguistic inequality in safety training data. Justifies a dedicated non-English safety check before any non-English reply ships.

HubSpot Research / Dimensional Research (compiled 2024)

33% of customers say being made to repeat themselves to multiple support representatives is their single most frustrating experience. 72% say explaining their problem to multiple people is poor customer service. Justifies a durable Case Record that carries context across every agent turn and handoff.

Want This Running for Your Team — Without the Setup Overhead?

AI Heroes configures the full stack for SaaS and subscription businesses: connector wiring, identity gate setup, scheduler configuration, and operator dashboard design. Your team watches metrics from day one instead of spending weeks on integration.
