the quality gate for Agent Skills

Does your skill trip on the right prompts?

Everything you need to confirm a skill works and follows best practices — in one tool. Tripwire lints your SKILL.md, probes which prompts actually activate it with real Claude sessions, and gates every PR in CI.

npm install -g tripwire-skills

tripwire lint

Best-practice lint

Instant, offline checks: description starts with “Use when”, stays under 1024 chars, and avoids workflow summaries; name is kebab-case; the body has real examples and isn’t a stub.
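As a rough illustration of what offline checks like these involve, here is a minimal Python sketch of three of them. It is not Tripwire's implementation: the function name lint_skill, the regular expression, and the choice to treat 1024 characters as an inclusive limit are all assumptions made for the example.

```python
import re

# Assumption: kebab-case means lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
# Assumption: the 1024-character limit is inclusive.
MAX_DESCRIPTION_CHARS = 1024


def lint_skill(name: str, description: str) -> list[str]:
    """Return a list of problems found; an empty list means all checks pass."""
    problems = []
    if not KEBAB_CASE.fullmatch(name):
        problems.append(f"name {name!r} is not kebab-case")
    if not description.startswith("Use when"):
        problems.append('description should start with "Use when"')
    if len(description) > MAX_DESCRIPTION_CHARS:
        problems.append(
            f"description is {len(description)} chars (limit {MAX_DESCRIPTION_CHARS})"
        )
    return problems


if __name__ == "__main__":
    # A clean skill yields no problems; a badly named one yields two.
    print(lint_skill("brainstorming", "Use when the user wants to explore ideas."))
    print(lint_skill("Brain_Storming", "Helps with ideas."))
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass, which is how most linters behave.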

tripwire analyze

Real activation coverage

Generates a prompt matrix across four zones and runs real Claude sessions to map what fires, what silently misses, and what false-triggers — the check a linter can’t do.

tripwire test

CI-ready regression gate

Commit the generated scenarios and rerun them deterministically — in CI or locally — to catch activation regressions before they ship.

the part a linter can’t catch

Activation coverage, measured against real runs

A skill’s description decides whether Claude invokes it. Tripwire generates prompts across four zones, runs them through real sessions, and reports exactly where your skill over- or under-fires.

should fire ✓ Core triggers: the skill’s own stated use cases

should fire ✓ Adjacent / edge: related intents you didn’t think to test

should stay quiet ✗ Negative: off-topic prompts; catches false positives

should fire ✓ Keyword variants: paraphrases; exposes description blind spots

Coverage Report: brainstorming
Core triggers      7/8 activated   (88%) ✓
Adjacent/edge      3/8 activated   (38%) ⚠
Negatives          1/8 activated   (13%) ✓
Keyword variants   4/5 activated   (80%) ✓

─ GAPS ───────────────────────────
✗ "what's the best way to approach X?"   [adjacent — miss]
✗ "brainstorm ideas for Y"               [variant — miss]
─ SUGGESTIONS ─────────────────────
1. Add "brainstorm", "think through" to the description
2. Cover "approach" questions — add "planning how to approach"
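The arithmetic behind a report like the one above can be sketched in a few lines of Python. This is an illustrative model only: the ProbeResult type, its field names, and the zone labels are hypothetical, and Tripwire's own data structures may look quite different. The percentages round halves up, which matches the sample figures (1/8 is shown as 13%).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """Hypothetical record of one probed prompt (not Tripwire's real schema)."""
    zone: str          # e.g. "core", "adjacent", "negative", "variant"
    prompt: str
    should_fire: bool  # expectation for this zone
    fired: bool        # what the session actually did


def percent_half_up(part: int, total: int) -> int:
    """Integer percentage, rounding halves up (7/8 -> 88, 1/8 -> 13)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (200 * part + total) // (2 * total)


def activation_rates(results: list[ProbeResult]) -> dict[str, int]:
    """Percentage of prompts in each zone that activated the skill."""
    fired: dict[str, int] = {}
    total: dict[str, int] = {}
    for r in results:
        total[r.zone] = total.get(r.zone, 0) + 1
        fired[r.zone] = fired.get(r.zone, 0) + int(r.fired)
    return {zone: percent_half_up(fired[zone], total[zone]) for zone in total}


def gaps(results: list[ProbeResult]) -> list[ProbeResult]:
    """Misses (should fire, did not) and false triggers (fired, should not)."""
    return [r for r in results if r.fired != r.should_fire]


if __name__ == "__main__":
    # Two prompts taken from the sample report, both recorded as misses.
    sample = [
        ProbeResult("adjacent", "what's the best way to approach X?", True, False),
        ProbeResult("variant", "brainstorm ideas for Y", True, False),
    ]
    for gap in gaps(sample):
        print(f'✗ "{gap.prompt}"  [{gap.zone}]')
```

Note that a low rate is the desired outcome in the negative zone, so any pass/fail threshold has to be applied per zone rather than to the raw percentage.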

runs on every pull request

The same checks, as a GitHub Action

Gate skill changes the way you gate lint and tests. Add one workflow file — it lints changed skills, runs the coverage probe with your own API key in your own runner, annotates the diff, and posts a sticky summary. The key never leaves your CI.

# .github/workflows/tripwire.yml
name: Tripwire
on: pull_request
jobs:
  skills:
    runs-on: ubuntu-latest
    permissions: { contents: read, pull-requests: write }
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - uses: bharath31/tripwire@v1
        with:
          probe: true
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}

No key on a fork PR? The probe skips with a notice — lint still runs and gates the PR.

as your skills grow

More than lint, and more than one skill at a time

tripwire conflicts

Skill-set conflict detection

Two skills that each lint clean can still fight at runtime. Scan a whole directory for name collisions and description overlap — skills shadowing each other.
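To make the idea of description overlap concrete, here is one simple way it could be measured: Jaccard similarity over the words in each description, alongside a check for duplicate names. The page does not say how Tripwire detects overlap, so treat this as a sketch of one plausible technique. The function names and the 0.5 threshold are assumptions chosen for the example.

```python
import re
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric words in a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Shared words divided by all words; 0.0 when both texts are empty."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def find_conflicts(
    skills: list[tuple[str, str]], threshold: float = 0.5  # threshold is arbitrary
) -> list[tuple[str, str, str]]:
    """Return (kind, name_a, name_b) for each conflicting pair of skills.

    `skills` is a list of (name, description) pairs.
    """
    conflicts = []
    for (name_a, desc_a), (name_b, desc_b) in combinations(skills, 2):
        if name_a == name_b:
            conflicts.append(("name-collision", name_a, name_b))
        elif jaccard(desc_a, desc_b) >= threshold:
            conflicts.append(("overlap", name_a, name_b))
    return conflicts


if __name__ == "__main__":
    skills = [
        ("brainstorming", "Use when the user wants to brainstorm ideas"),
        ("ideation", "Use when the user wants to brainstorm new ideas"),
        ("sql-review", "Use when reviewing SQL migrations"),
    ]
    print(find_conflicts(skills))
```

Word-level Jaccard is crude (it ignores synonyms and word order), but it shows why two skills that each pass lint can still compete for the same prompts.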

tripwire eval

Outcome evals

Did the skill actually do the right thing, not just fire? Author assertions plus an optional LLM-judge rubric per case.

tripwire test-all · --drift

Model-drift detection

A scenarios file that passed in March can silently regress in June. Re-probe on a schedule and fail loudly when a skill’s activation behavior drifts.
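Conceptually, drift detection compares a stored baseline against a fresh probe of the same prompts. The Python sketch below assumes each run can be reduced to a mapping from prompt to whether the skill fired. That representation, and the names detect_drift and gate, are hypothetical and not taken from Tripwire.

```python
def detect_drift(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Prompts whose activation outcome changed since the baseline run."""
    return sorted(
        prompt
        for prompt, fired in baseline.items()
        if prompt in current and current[prompt] != fired
    )


def gate(baseline: dict[str, bool], current: dict[str, bool]) -> int:
    """Exit code for a scheduled job: 0 if stable, 1 if anything drifted."""
    drifted = detect_drift(baseline, current)
    for prompt in drifted:
        print(f"DRIFT: {prompt!r} was {baseline[prompt]}, now {current[prompt]}")
    return 1 if drifted else 0


if __name__ == "__main__":
    # Illustrative data: one prompt that used to activate the skill no longer does.
    march = {"brainstorm ideas for Y": True, "help me plan Z": True}
    june = {"brainstorm ideas for Y": False, "help me plan Z": True}
    print("exit code:", gate(march, june))
```

Returning a non-zero exit code is what lets a scheduled CI job fail loudly instead of logging the change and moving on.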

--agent gemini · codex

Cross-agent probing

A single SKILL.md now runs across 30+ agents. Check that yours fires in each agent you ship to, not just Claude Code.

tripwire.yaml

Pluggable rules

Start from extends: tripwire:recommended, ESLint-style. Turn rules off, change severity, or add org-specific checks as plain JS; they apply everywhere, including the Action.

badges · --fix

Badges & auto-fix

A live README coverage badge, plus one-command fixes for the mechanically safe lint issues. Tripwire also ships as a skill and as an early VS Code extension.

Try the lint check, right here

The real lint engine, bundled to your browser — an instant taste. The full activation-coverage probe runs from the CLI and the Action (it needs real Claude sessions). Nothing leaves this page.
