Skills tagged with #rubric

@denn-gubsky

effort-estimation

Calibrate engineering effort estimates for git commits using a 5-tier rubric. Use whenever you need to translate a diff into an hours estimate for a senior engineer working with or without an AI coding assistant.

denn-gubsky/ai-dev-effectiveness
4d ago
70
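The effort-estimation skill maps a git diff onto a 5-tier estimate. The actual rubric and its thresholds are not shown on this page; the sketch below is a purely hypothetical illustration of how diff stats might map to tiers, with guessed cutoffs.

```python
def estimate_tier(lines_changed: int, files_touched: int):
    """Map diff stats to a 5-tier effort estimate.

    Thresholds and hour ranges are illustrative guesses,
    not the skill's actual rubric.
    """
    if lines_changed < 20 and files_touched <= 1:
        return 1, "trivial (< 1h)"
    if lines_changed < 100:
        return 2, "small (1-4h)"
    if lines_changed < 400:
        return 3, "medium (0.5-1.5 days)"
    if lines_changed < 1500:
        return 4, "large (2-5 days)"
    return 5, "epic (> 1 week)"
```

A real calibration would also weigh factors a raw line count misses, such as test coverage and whether an AI assistant handled the boilerplate.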
@IgorGanapolsky

capture-feedback

Capture structured thumbs up/down feedback with context, tags, and optional rubric scores after completing a task.

IgorGanapolsky/mcp-memory-gateway+7 more
18d ago
70
@viktorbezdek

agent-project-development

This skill should be used when the user asks to "start an LLM project", "design batch pipeline", "evaluate task-model fit", "structure agent project", or mentions pipeline architecture, agent-assisted development, cost estimation, or choosing between LLM and traditional approaches. NOT for evaluating agent quality or building evaluation rubrics (use agent-evaluation), NOT for multi-agent coordination or agent handoffs (use multi-agent-patterns).

viktorbezdek/skillstack+48 more
3d ago
50
@greyhaven-ai

autocontext

Iterative strategy generation and evaluation system. Use when the user wants to evaluate agent output quality, run improvement loops, queue tasks for background evaluation, check run status, or discover available scenarios. Provides LLM-based judging with rubric-driven scoring.

greyhaven-ai/autocontext+1 more
19d ago
6640
@savvides

assessment-design

Evidence-based assessment design with rubrics, feedback strategies, and formative checkpoints. Aligns each assessment to learning objectives using Bloom's taxonomy. Applies Nicol's 7 principles of good feedback practice. Reads from /learning-objectives manifest and extends it with assessment specs. (idstack)

savvides/idstack+7 more
17d ago
70
@RefoundAI

ai-evals

Help users create and run AI evaluations. Use when someone is building evals for LLM products, measuring model quality, creating test cases, designing rubrics, or trying to systematically measure AI output quality.

RefoundAI/lenny-skills+73 more
19d ago
4230
@agentscope-ai

auto-arena

Automatically evaluate and compare multiple AI models or agents without pre-existing test data. Generates test queries from a task description, collects responses from all target endpoints, auto-generates evaluation rubrics, runs pairwise comparisons via a judge model, and produces win-rate rankings with reports and charts. Supports checkpoint resume, incremental endpoint addition, and judge model hot-swap. Use when the user asks to compare, benchmark, or rank multiple models or agents on a custom task, or run an arena-style evaluation.

agentscope-ai/OpenJudge+8 more
18d ago
4720
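The arena workflow auto-arena describes (collect responses per endpoint, judge them pairwise, rank by win rate) can be sketched roughly as follows. The function names and the judge interface are hypothetical illustrations, not the OpenJudge API.

```python
from itertools import combinations

def pairwise_win_rates(responses, judge):
    """Rank endpoints by pairwise win rate.

    responses: dict of endpoint name -> list of answers (one per query).
    judge: callable(answer_a, answer_b) -> "a", "b", or "tie".
    """
    wins = {name: 0 for name in responses}
    comparisons = {name: 0 for name in responses}
    n_queries = len(next(iter(responses.values())))
    for a, b in combinations(responses, 2):
        for i in range(n_queries):
            verdict = judge(responses[a][i], responses[b][i])
            comparisons[a] += 1
            comparisons[b] += 1
            if verdict == "a":
                wins[a] += 1
            elif verdict == "b":
                wins[b] += 1
    rates = {name: wins[name] / comparisons[name] for name in responses}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)

# Toy judge: prefer the longer answer (a stand-in for an LLM judge
# applying an auto-generated rubric).
toy = {
    "model-x": ["short", "short"],
    "model-y": ["a longer answer", "also longer"],
}
ranking = pairwise_win_rates(toy, lambda a, b: "a" if len(a) > len(b) else "b")
```

Checkpoint resume and incremental endpoint addition would layer on top of this core loop by persisting the `wins`/`comparisons` tallies between runs.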
@syncfusion

multi-model-analysis

Runs 3 AI models in parallel to independently analyze a problem and propose approaches. Compares all proposals against a rubric and selects the best. Can be invoked standalone or called from any agent (issue-resolver, pr-review, etc.).

syncfusion/maui-toolkit+1 more
18d ago
6940
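The pattern multi-model-analysis describes (query several models in parallel, score each proposal against a rubric, keep the best) reduces to a small fan-out/select step. This is a minimal sketch under assumed interfaces; the callables stand in for real model and rubric-scoring calls.

```python
from concurrent.futures import ThreadPoolExecutor

def best_proposal(problem, models, score):
    """Query several models concurrently, score each proposal
    against a rubric, and return the highest-scoring one.

    models: dict of name -> callable(problem) -> proposal text.
    score: callable(proposal) -> numeric rubric score.
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {name: pool.submit(fn, problem) for name, fn in models.items()}
        proposals = {name: f.result() for name, f in futures.items()}
    scored = {name: (score(p), p) for name, p in proposals.items()}
    winner = max(scored, key=lambda name: scored[name][0])
    return winner, scored[winner][1]

# Toy stand-ins for real model calls; the "rubric" here is just length.
models = {
    "m1": lambda q: "try A",
    "m2": lambda q: "try approach B with tests",
    "m3": lambda q: "try B",
}
winner, proposal = best_proposal("fix the bug", models, score=len)
```

Because selection is a pure function of the scored proposals, the same helper can be invoked standalone or from within another agent, as the card notes.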