evalyn-eval

Rating: 5 (255 reviews)
Author: shihongDev

by @shihongDev, 0 pulls

URL: openbooklet.com/s/evalyn-eval

Pinned: openbooklet.com/s/evalyn-eval@1.0.0

API: GET /api/v1/skills/evalyn-eval
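As a rough illustration of the API line above, the following sketch builds (but does not send) a GET request for this skill's record. The host `https://openbooklet.com`, the `Accept: application/json` header, and the helper name `build_skill_request` are assumptions made for the example; the page only states the method and path, and says nothing about the response format.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: the API is served from the same host as the skill page.
BASE_URL = "https://openbooklet.com"


def build_skill_request(slug: str, base_url: str = BASE_URL) -> Request:
    """Build (without sending) a GET request for a skill's API record.

    Hypothetical helper: only the method and path come from the listing.
    """
    # Percent-encode the slug so it cannot add extra path segments.
    path = f"/api/v1/skills/{quote(slug, safe='')}"
    return Request(
        base_url + path,
        method="GET",
        # Assumption: the endpoint returns JSON; the page does not say.
        headers={"Accept": "application/json"},
    )


req = build_skill_request("evalyn-eval")
print(req.get_method(), req.full_url)
# prints: GET https://openbooklet.com/api/v1/skills/evalyn-eval
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; how to parse the body depends on a response format the listing does not document.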

Use when building evaluation datasets, selecting metrics, or running evaluations on an LLM agent project with evalyn.

4 skills from this repo: shihongDev/evalyn

evalyn-eval (viewing)

evalyn-analyze: sdk/skills/evalyn-analyze/SKILL.md

Use when analyzing evalyn evaluation results, investigating failures, comparing runs, or understanding agent performance.

evalyn-calibrate: sdk/skills/evalyn-calibrate/SKILL.md

Use when LLM judges need calibration, evaluation metrics seem misaligned with expectations, or annotation and judge tuning is needed.

evalyn-setup: sdk/skills/evalyn-setup/SKILL.md

Use when setting up evalyn evaluation for an LLM agent project, instrumenting agent code, or adding the evalyn decorator.

Auto-indexed from shihongDev/evalyn

Related Skills

@openbooklet

graceful-error-recovery

Use this skill when a tool call, command, or API request fails. Diagnose the root cause systematically before retrying or changing approach. Do not retry the same failing call without first understanding why it failed.

1.1K 0

@openbooklet

audience-aware-communication

Use this skill when writing any explanation, documentation, or response that will be read by someone else. Match vocabulary, depth, and format to the audience's expertise level before writing.

1.1K 0

@openbooklet

Refactoring Expert

Expert in systematic code refactoring, code smell detection, and structural optimization. Use PROACTIVELY when encountering duplicated code, long methods, complex conditionals, or any code quality issues. Detects code smells and applies proven refactoring techniques without changing external behavior.

600

@openbooklet

Research Expert

Specialized research expert for parallel information gathering. Use for focused research tasks with clear objectives and structured output requirements.

600

@openbooklet

clarify-ambiguous-requests

Use this skill when the user's request is ambiguous, under-specified, or could be interpreted in multiple ways. If proceeding with a wrong assumption would waste significant work, always ask exactly one focused clarifying question before doing anything.

1.1K 0

@openbooklet

structured-step-by-step-reasoning

Use this skill for any problem that involves multiple steps, tradeoffs, or non-trivial logic. Think out loud before answering to improve accuracy and transparency. Apply whenever the answer is not immediately obvious.

1.1K 0