Skills tagged with #evidence

@cybozu

agent-spec-builder

Build a Prompt Hardener agent_spec.yaml from an existing codebase (from-code) or through an interactive interview (from-questions). Use when the user wants to create, generate, or scaffold an agent spec, or when they mention agent_spec.yaml creation. Generates agent_spec.yaml, evidence.md, and open_questions.md with confidence tracking and evidence trails.

cybozu/prompt-hardener
18d ago
470
@alchemiststudiosDOTai

agents-md-mapper

This skill should be used when creating, refreshing, or validating a repository `AGENTS.md` so it stays concise, current, and grounded in repository evidence. Use when `AGENTS.md` is missing or stale, after refactors or tooling changes, when new docs become the system of record, or when adding lightweight drift checks.

alchemiststudiosDOTai/harness-engineering+7 more
18d ago
490
@sequenzia

bug-investigator

Executes diagnostic investigation tasks to test debugging hypotheses. Runs tests, traces execution, checks git history, and reports evidence. (converted from agent)

sequenzia/agent-alchemy+31 more
19d ago
330
@basilisk-labs

agentplane-release-and-packaging-operator

Use when preparing, validating, publishing, or recovering an Agentplane release, especially package build ordering, version parity, npm publication, public install smoke tests, hosted publish evidence, or release CI failures.

basilisk-labs/agentplane+1 more
8d ago
390
@PCIRCLE-AI

Agentic Orchestration (Experimental Working-Model Protocol)

> **Status: experimental, instrumented, validation in progress.** This skill is shipped to begin collecting evidence about whether a structured verifiability-router protocol changes Claude's behavior in ways that measurably help users. `memesh patterns` exposes a local counter so you can

PCIRCLE-AI/memesh-llm-memory+2 more
4d ago
110
@nickzren
MCP

Opentargets

Open Targets MCP server for targets, diseases, drugs, variants, and evidence

mcp github
nickzren/opentargets-mcp
19d ago
0
@jellewas
MCP

EU Audit Trail

Tamper-evident audit trail MCP server for EU AI Act & GDPR compliance.

mcp github ai
jellewas/eu-audit-mcp
19d ago
0
@sibyllai

Governance layer for Claude Code Agent Teams — durable auditability, operational controls, and evidence trails for AI-assisted development.

Governance framework for AI agent team coordination, audit trails, and boundary enforcement.

sibyllai/khoregos
18d ago
60
@collaborative-deep-research

fact-check

Verify a specific claim by searching for evidence across web and academic sources. Use when the user asks to verify, fact-check, or confirm a statement.

collaborative-deep-research/agent-papers-cli+1 more
19d ago
380
@BrowseAI-HQ
MCP

BrowseAI Dev

Evidence-backed web research for AI agents with citations and confidence scores.

mcp github ai search web
BrowseAI-HQ/BrowseAI-Dev
19d ago
0
@openai

agentic-legibility

Score a repository's agentic legibility from repo-visible evidence only. Use when Codex needs to audit how easy a codebase is for coding agents to discover, bootstrap, validate, and navigate, especially for harness-engineering reviews, developer-experience audits, repo cleanup, or before/after comparisons after improving docs, tooling, or architectural constraints.

openai/build-hours
18d ago
7250
@mukul975

acquiring-disk-image-with-dd-and-dcfldd

Create forensically sound bit-for-bit disk images using dd and dcfldd while preserving evidence integrity through hash verification.

forensics disk-imaging evidence-acquisition dd dcfldd hash-verification
mukul975/Anthropic-Cybersecurity-Skills+242 more
19d ago
2390
@nextor2k

hyperfocus

ADHD-friendly output formatting for Codex. Restructures responses with evidence-based cognitive accessibility: chunking, visual hierarchy, front-loaded key points, and progressive disclosure. Three modes: clean, flow (default), zen. Use when user says "hyperfocus", "focus mode", "adhd mode", "adhd friendly", or invokes /hyperfocus.

nextor2k/hyperfocus
18d ago
50
@tkersey

codex-upcoming-features

Fetch and summarize upcoming unreleased Codex features using a durable local clone synced from GitHub, with source-file mining as primary evidence. Use when asked for latest upcoming/openai-codex features, what is coming next but not in the latest stable release, or a live release-gap summary with links and as-of timestamp.

tkersey/dotfiles+25 more
18d ago
450
@savvides

assessment-design

Evidence-based assessment design with rubrics, feedback strategies, and formative checkpoints. Aligns each assessment to learning objectives using Bloom's taxonomy. Applies Nicol's 7 principles of good feedback practice. Reads from /learning-objectives manifest and extends it with assessment specs. (idstack)

savvides/idstack+7 more
18d ago
70
@0verL1nk

agentic_search

Run a local-first research workflow by planning sub-queries, iterating retrieval, validating sources, and producing traceable evidence-backed conclusions.

0verL1nk/PaperSage+3 more
18d ago
320
@ecomfe

figma-design-to-code

Implement or update project-consistent UI code from a Figma selection or nodeId using TemPad Dev MCP. Use when the user wants visible Figma UI recreated, ported, or integrated into the target project's framework, styling system, tokens, and existing components when available. Do not use for design critique, product invention, generic code review, or for guessing hidden states, responsiveness, or behavior not shown in design or project evidence.

ecomfe/tempad-dev
18d ago
4480
@rustfs

code-change-verification

Verify code changes by identifying correctness, regression, security, and performance risks from diffs or patches, then produce prioritized findings with file/line evidence and concrete fixes. Use when reviewing commits, PRs, and merged patches before/after release.

rustfs/rustfs+2 more
19d ago
23.2K 0
@pretorin-ai
MCP

Pretorin Compliance

Access Pretorin compliance systems, controls, evidence, and narratives from your AI tools.

mcp github ai
pretorin-ai/pretorin-cli.git
19d ago
0
@0xSero

evidence-heavy-evaluator

Generate an evidence-first, read-only repository evaluation report with deterministic scoring and actionable recommendations. Use when the user asks to assess readiness, maintainability, release-readiness, documentation gaps, or engineering health and wants auditable artifacts (`json` + `markdown` + raw command logs).

0xSero/vllm-studio
19d ago
2890
@mverab

content-scoring

Score content against the 10 GEO criteria with evidence and prioritized fixes. Use when users ask to score, rate, evaluate, or estimate ranking strength.

mverab/eGEOagents+2 more
18d ago
650
@garrytan

browse

Fast headless browser for QA testing and site dogfooding. Navigate any URL, interact with elements, verify page state, diff before/after actions, take annotated screenshots, check responsive layouts, test forms and uploads, handle dialogs, and assert element states. ~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a user flow, or file a bug with evidence. Use when asked to "open in browser", "test the site", "take a screenshot", or "dogfood this".

garrytan/gstack+18 more
18d ago
10.3K 0
@genomoncology

biomcp

Search and retrieve biomedical data - genes, variants, clinical trials, articles, drugs, diseases, pathways, proteins, adverse events, pharmacogenomics, and phenotype-disease matching. Use for gene function, variant pathogenicity, trials, drug safety, pathway context, disease workups, and literature evidence.

genomoncology/biomcp
18d ago
4610
@salespeak-ai

buyer-eval

Structured B2B software vendor evaluation for buyers. Researches your company, asks domain-expert questions, engages vendor AI agents via the Salespeak Frontdoor API, scores vendors across 7 dimensions, and produces a comparative recommendation with evidence transparency. Use when asked to evaluate, compare, or research B2B software vendors.

salespeak-ai/buyer-eval-skill
18d ago
490
@harumiWeb

adr-drafter

Draft a new ExStruct ADR or propose an update to an existing ADR from an issue, PR, diff, tests, and specs. Use when an ADR is required or recommended and you need a structured draft with context, decision, consequences, and evidence.

harumiWeb/exstruct+3 more
19d ago
1290
@happier-dev

happier-diagnose

Diagnose a problem with a Happier session, the daemon, a provider (Claude/Codex/OpenCode), auth, or connectivity. Pulls the correct logs, finds a true root cause from evidence only, presents findings, and optionally uploads a private diagnostics bundle to Happier developers and/or files a sanitized public GitHub issue (the two are complementary). Use when the user reports a bug, says Happier is broken/stuck/misbehaving, asks to debug/diagnose/triage/troubleshoot Happier, or shares a Happier session ID and asks what went wrong.

happier-dev/happier+3 more
10d ago
7310
@Blockether

Presenter Reference

Generate self-contained HTML files for technical diagrams, visualizations, and data tables. Use `spel open` to preview and `spel screenshot` to capture evidence.

Blockether/spel+1 more
18d ago
170
@bitflight-devops

Evaluate Options

Do not present options to the user without evidence. Every recommendation must be grounded in research, not assertion.

bitflight-devops/hallucination-detector
18d ago
60
@mikkelkrogsholm

lab-review

Cross-reference lab results from sundhed.dk against current PubMed and medRxiv research on optimal ranges. Generates a report comparing your values to the latest evidence-based guidelines, meta-analyses, and preprints.

mikkelkrogsholm/ai-laegens-bord+2 more
18d ago
370
@diskd-ai

ccbox

Inspect local agent session logs via `ccbox` CLI and produce quick, evidence-based insights.

diskd-ai/ccbox+1 more
18d ago
320
@jfrog

jfrog

Interact with the JFrog Platform via the JFrog CLI and REST/GraphQL APIs. Use this skill when the user wants to manage Artifactory repositories, upload or download artifacts, manage builds, configure permissions, manage users and groups, work with access tokens, configure JFrog CLI servers, search artifacts, manage properties, set up replication, manage JFrog Projects, run security audits or scans, look up CVE details, query exposures scan results from JFrog Advanced Security, manage release bundles and lifecycle operations, aggregate or export platform data, or perform any JFrog Platform administration task. Also use when the user mentions jf, jfrog, artifactory, xray, distribution, evidence, apptrust, onemodel, graphql, workers, mission control, curation, advanced security, exposures, or any JFrog product name.

jfrog/jfrog-skills+1 more
18d ago
60
@vercel-labs

dogfood

Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with full reproduction evidence -- step-by-step screenshots, repro videos, and detailed repro steps for every issue -- so findings can be handed directly to the responsible teams.

vercel-labs/agent-browser+2 more
18d ago
22.0K 0
@chojondocho

vowline

General operating skill for AI agents handling meaningful work across domains: ambiguous requests, multi-step execution, tool use, coding, debugging, research, writing, artifacts, planning, review, decisions, visual work, prompt work, and handoff. Use when intent inference, safe action, evidence, verification, concise reporting, or completion criteria matter, including alongside narrower active skills. Skip only trivial one-shot replies.

chojondocho/vowline
8d ago
120