effort-estimation
Calibrate engineering effort estimates for git commits using a 5-tier rubric. Use whenever you need to translate a diff into an estimate in hours for a senior engineer working with or without an AI coding assistant.
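
A minimal sketch of how a 5-tier mapping from diff size to hours could look; the tier names, hour bands, and line-count thresholds below are illustrative assumptions, not this skill's actual rubric:

```python
from dataclasses import dataclass

@dataclass
class DiffStats:
    files_changed: int
    lines_changed: int

# Hypothetical 5-tier rubric: (tier name, hours without an assistant, hours with one).
TIERS = [
    ("trivial",  0.5,  0.25),
    ("small",    2.0,  1.0),
    ("medium",   6.0,  3.0),
    ("large",   16.0,  8.0),
    ("x-large", 40.0, 20.0),
]

def estimate_hours(stats: DiffStats, with_ai: bool) -> tuple[str, float]:
    """Map rough diff size onto a tier and return (tier, estimated hours)."""
    thresholds = [20, 100, 400, 1500]  # assumed lines-changed cut-offs between tiers
    idx = sum(stats.lines_changed >= t for t in thresholds)
    tier, hours_solo, hours_ai = TIERS[idx]
    return tier, hours_ai if with_ai else hours_solo

print(estimate_hours(DiffStats(files_changed=3, lines_changed=250), with_ai=True))
# -> ('medium', 3.0)
```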
capture-feedback
Capture structured thumbs up/down feedback with context, tags, and optional rubric scores after completing a task.
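
One possible shape for such a feedback record; the field names and rubric keys are assumptions for illustration, not the skill's actual schema:

```python
from dataclasses import dataclass, field
from typing import Literal, Optional

@dataclass
class FeedbackRecord:
    task_id: str
    rating: Literal["up", "down"]                    # thumbs up / thumbs down
    context: str                                     # what was attempted and what happened
    tags: list[str] = field(default_factory=list)
    rubric_scores: Optional[dict[str, int]] = None   # optional, e.g. {"correctness": 4}

record = FeedbackRecord(
    task_id="task-42",
    rating="down",
    context="Refactored the auth module but broke two tests",
    tags=["refactor", "tests"],
    rubric_scores={"correctness": 2, "completeness": 3},
)
```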
agent-project-development
This skill should be used when the user asks to "start an LLM project", "design batch pipeline", "evaluate task-model fit", "structure agent project", or mentions pipeline architecture, agent-assisted development, cost estimation, or choosing between LLM and traditional approaches. NOT for evaluating agent quality or building evaluation rubrics (use agent-evaluation), NOT for multi-agent coordination or agent handoffs (use multi-agent-patterns).
autocontext
Iterative strategy generation and evaluation system. Use when the user wants to evaluate agent output quality, run improvement loops, queue tasks for background evaluation, check run status, or discover available scenarios. Provides LLM-based judging with rubric-driven scoring.
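
A minimal sketch of an improvement loop driven by a rubric-based judge; `generate_strategy` and `judge_score` are placeholder callables standing in for model calls, not part of this skill's API:

```python
def improvement_loop(task: str, generate_strategy, judge_score, rounds: int = 3):
    """Generate candidate strategies, score each with a rubric-driven LLM judge,
    and keep the best-scoring one across rounds."""
    best_strategy, best_score = None, float("-inf")
    for _ in range(rounds):
        # Each round can condition on the current best strategy to improve on it.
        candidate = generate_strategy(task, previous_best=best_strategy)
        score = judge_score(task, candidate)  # rubric-driven judgment
        if score > best_score:
            best_strategy, best_score = candidate, score
    return best_strategy, best_score
```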
assessment-design
Evidence-based assessment design with rubrics, feedback strategies, and formative checkpoints. Aligns each assessment to learning objectives using Bloom's taxonomy. Applies Nicol's 7 principles of good feedback practice. Reads from the /learning-objectives manifest and extends it with assessment specs. (idstack)
ai-evals
Help users create and run AI evaluations. Use when someone is building evals for LLM products, measuring model quality, creating test cases, designing rubrics, or trying to systematically measure AI output quality.
auto-arena
Automatically evaluate and compare multiple AI models or agents without pre-existing test data. Generates test queries from a task description, collects responses from all target endpoints, auto-generates evaluation rubrics, runs pairwise comparisons via a judge model, and produces win-rate rankings with reports and charts. Supports checkpoint resume, incremental endpoint addition, and judge model hot-swap. Use when the user asks to compare, benchmark, or rank multiple models or agents on a custom task, or run an arena-style evaluation.
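
A sketch of just the win-rate ranking step; the triple format for pairwise judge results is an assumption, not the skill's actual output schema:

```python
from collections import defaultdict

def win_rates(pairwise_results: list[tuple[str, str, str]]) -> dict[str, float]:
    """pairwise_results holds (endpoint_a, endpoint_b, winner) triples from the
    judge model; returns each endpoint's share of comparisons it won."""
    wins, games = defaultdict(int), defaultdict(int)
    for a, b, winner in pairwise_results:
        games[a] += 1
        games[b] += 1
        wins[winner] += 1
    return {name: wins[name] / games[name] for name in games}

results = [("model-a", "model-b", "model-a"),
           ("model-a", "model-c", "model-c"),
           ("model-b", "model-c", "model-c")]
print(win_rates(results))  # model-c wins 2/2, model-a 1/2, model-b 0/2
```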
multi-model-analysis
Runs 3 AI models in parallel to independently analyze a problem and propose approaches. Compares all proposals against a rubric and selects the best. Can be invoked standalone or called from any agent (issue-resolver, pr-review, etc.).
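
A sketch of the fan-out-and-select step; `ask_model` and `score_against_rubric` are placeholder callables for the model and judge calls, not this skill's actual interface:

```python
from concurrent.futures import ThreadPoolExecutor

def best_proposal(problem: str, models: list[str], ask_model, score_against_rubric):
    """Query each model in parallel, score every proposal against the rubric,
    and return the highest-scoring (score, model, proposal) triple."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        proposals = list(pool.map(lambda m: (m, ask_model(m, problem)), models))
    scored = [(score_against_rubric(text), model, text) for model, text in proposals]
    return max(scored)
```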