bib-verify
Verified · Git · v1.0.0
by @agentscope-ai · 0 pulls
URL: openbooklet.com/s/bib-verify
Pinned: openbooklet.com/s/bib-verify@1.0.0
API: GET /api/v1/skills/bib-verify

Verify a BibTeX file for hallucinated or fabricated references by cross-checking every entry against CrossRef, arXiv, and DBLP. Reports each reference as verified, suspect, or not found, with field-level mismatch details (title, authors, year, DOI). Use when the user wants to check a .bib file for fake citations, validate references in a paper, or audit bibliography entries for accuracy.
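To give a sense of what that cross-check involves, here is a minimal sketch against the public CrossRef REST API (api.crossref.org). The helper name `check_crossref` and the single-entry input are illustrative, not the skill's actual code; the real skill also queries arXiv and DBLP and diffs authors, year, and DOI.

```python
# Minimal sketch of the cross-checking idea, using the public CrossRef
# REST API. Classifies one BibTeX entry as verified / suspect / not_found.
import requests

def check_crossref(entry: dict) -> str:
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": entry["title"], "rows": 1},
        timeout=10,
    )
    items = resp.json()["message"]["items"]
    if not items:
        return "not_found"
    found_title = (items[0].get("title") or [""])[0].lower()
    # Exact title match -> verified; anything else -> suspect,
    # to be resolved by field-level diffing in the real skill.
    if entry["title"].lower() == found_title:
        return "verified"
    return "suspect"

print(check_crossref({"title": "Attention Is All You Need"}))
```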

9 skills from this repo: agentscope-ai/OpenJudge

bib-verify (viewing)
auto-arena · skills/auto-arena/SKILL.md

Automatically evaluate and compare multiple AI models or agents without pre-existing test data. Generates test queries from a task description, collects responses from all target endpoints, auto-generates evaluation rubrics, runs pairwise comparisons via a judge model, and produces win-rate rankings with reports and charts. Supports checkpoint resume, incremental endpoint addition, and judge model hot-swap. Use when the user asks to compare, benchmark, or rank multiple models or agents on a custom task, or run an arena-style evaluation.
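To make the ranking step concrete, here is a toy sketch of folding pairwise judge verdicts into win rates. The verdict triples and model names are hypothetical, and the judge calls and rubric machinery are elided.

```python
# Toy sketch: fold pairwise judge verdicts into per-model win rates.
from collections import defaultdict

def win_rates(verdicts: list[tuple[str, str, str]]) -> dict[str, float]:
    """verdicts: (model_a, model_b, winner) triples from the judge."""
    wins, games = defaultdict(int), defaultdict(int)
    for a, b, winner in verdicts:
        games[a] += 1
        games[b] += 1
        if winner in (a, b):      # a tie counts a game but no win
            wins[winner] += 1
    return {m: wins[m] / games[m] for m in games}

verdicts = [("gpt", "claude", "claude"), ("gpt", "llama", "gpt"),
            ("claude", "llama", "claude")]
print(sorted(win_rates(verdicts).items(), key=lambda kv: -kv[1]))
```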

Claude Authenticity Skill · skills/claude-authenticity/SKILL.md

Verify whether an API endpoint serves genuine Claude and optionally extract any injected system prompt.

Find Skills Combo · skills/find-skills-combo/SKILL.md

Discover and install **skill combinations** from the open agent skills ecosystem. Unlike single-skill search, this skill decomposes complex tasks into subtasks, searches for candidates per subtask, evaluates coverage, and recommends two strategies: **Maximum Quality** (best skill per subtask, highest …

mmx-cli · skills/mmx-cli/SKILL.md

Generate text, images, video, speech, and music via the MiniMax AI platform. Covers text generation (MiniMax-M2.7 model), image generation (image-01), video generation (Hailuo-2.3), speech synthesis (speech-2.8-hd, 300+ voices), music generation (music-2.6 with lyrics, cover, and instrumental), and web search. Use when the user needs to create AI-generated multimedia content, produce narrated audio from text, compose music, or search the web through MiniMax AI services.

openjudge · skills/openjudge/SKILL.md

Build custom LLM evaluation pipelines using the OpenJudge framework. Covers selecting and configuring graders (LLM-based, function-based, agentic), running batch evaluations with GradingRunner, combining scores with aggregators, applying evaluation strategies (voting, average), auto-generating graders from data, and analyzing results (pairwise win rates, statistics, validation metrics). Use when the user wants to evaluate LLM outputs, compare multiple models, design scoring criteria, or build an automated evaluation system.
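The OpenJudge API itself is not shown on this page, so the following is a dependency-free toy of the grader → runner → aggregator shape the description lays out. It is not OpenJudge code; every name in it (Grader, run_batch, mean_aggregate) is illustrative.

```python
# Toy sketch of the pipeline shape: function-based graders score each
# output, a batch runner applies them, an aggregator combines scores.
from statistics import mean
from typing import Callable

Grader = Callable[[str], float]   # output text -> score in [0, 1]

def length_grader(output: str) -> float:
    return min(len(output) / 100, 1.0)        # trivial function-based grader

def run_batch(graders: list[Grader], outputs: list[str]) -> list[list[float]]:
    return [[g(o) for g in graders] for o in outputs]

def mean_aggregate(score_rows: list[list[float]]) -> list[float]:
    return [mean(row) for row in score_rows]  # combine per-output scores

scores = run_batch([length_grader], ["short", "a much longer model answer"])
print(mean_aggregate(scores))
```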

paper-review · skills/paper-review/SKILL.md

Review academic papers for correctness, quality, and novelty using OpenJudge's multi-stage pipeline. Supports PDF files and LaTeX source packages (.tar.gz/.zip). Covers 10 disciplines: cs, medicine, physics, chemistry, biology, economics, psychology, environmental_science, mathematics, social_sciences. Use when the user asks to review, evaluate, critique, or assess a research paper, check references, or verify a BibTeX file.

ref-hallucination-arena · skills/ref-hallucination-arena/SKILL.md

Benchmark LLM reference recommendation capabilities by verifying every cited paper against Crossref, PubMed, arXiv, and DBLP. Measures hallucination rate, per-field accuracy (title/author/year/DOI), discipline breakdown, and year constraint compliance. Supports tool-augmented (ReAct + web search) mode. Use when the user asks to evaluate, benchmark, or compare models on academic reference hallucination, literature recommendation quality, or citation accuracy.
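As a rough illustration of the core metrics, here is a sketch computing hallucination rate and per-field accuracy; it assumes the verification step has already run upstream, and the result-dict field names mirror the description rather than the skill's real schema.

```python
# Sketch: hallucination rate plus per-field accuracy over found papers.
def metrics(results: list[dict]) -> dict:
    """results: one dict per cited paper, e.g.
    {"found": True, "title_ok": True, "author_ok": False,
     "year_ok": True, "doi_ok": True}"""
    n = len(results)
    hallucinated = sum(1 for r in results if not r["found"])
    found = [r for r in results if r["found"]]
    per_field = {
        f: (sum(r[f] for r in found) / len(found)) if found else 0.0
        for f in ("title_ok", "author_ok", "year_ok", "doi_ok")
    }
    return {"hallucination_rate": hallucinated / n, **per_field}

print(metrics([{"found": True, "title_ok": True, "author_ok": True,
                "year_ok": True, "doi_ok": False},
               {"found": False, "title_ok": False, "author_ok": False,
                "year_ok": False, "doi_ok": False}]))
```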

rl-reward · skills/rl-reward/SKILL.md

Build RL reward signals using the OpenJudge framework. Covers choosing between pointwise and pairwise reward strategies based on RL algorithm, task type, and cost; aggregating multi-dimensional pointwise scores into a scalar reward; pairwise tournament reward for GRPO on subjective tasks (net win rate across group rollouts); generating preference pairs for DPO/RLAIF; and normalizing scores for training stability. Use when building reward models, scoring rollouts for GRPO/REINFORCE, generating preference data for DPO, or doing Best-of-N selection.
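Here is a sketch of the pairwise tournament reward idea: each rollout's reward is its net win rate against the other rollouts in the group, then scores are normalized for training stability. The `judge` stub stands in for the real grader, which in the skill would be an OpenJudge grader.

```python
# Sketch: net-win-rate tournament reward for a group of rollouts,
# followed by mean/std normalization for training stability.
from statistics import mean, pstdev

def judge(a: str, b: str) -> int:
    return 1 if len(a) > len(b) else -1       # stub: prefers longer text

def tournament_rewards(rollouts: list[str]) -> list[float]:
    n = len(rollouts)
    net = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            r = judge(rollouts[i], rollouts[j])
            net[i] += r
            net[j] -= r
    raw = [x / (n - 1) for x in net]          # net win rate in [-1, 1]
    mu, sd = mean(raw), pstdev(raw) or 1.0    # avoid divide-by-zero
    return [(x - mu) / sd for x in raw]

print(tournament_rewards(["a", "abc", "abcdef", "ab"]))
```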

Auto-indexed from agentscope-ai/OpenJudge


Related Skills

@openbooklet

graceful-error-recovery

Use this skill when a tool call, command, or API request fails. Diagnose the root cause systematically before retrying or changing approach. Do not retry the same failing call without first understanding why it failed.

@openbooklet

audience-aware-communication

Use this skill when writing any explanation, documentation, or response that will be read by someone else. Match vocabulary, depth, and format to the audience's expertise level before writing.

@openbooklet

Refactoring Expert

Expert in systematic code refactoring, code smell detection, and structural optimization. Use PROACTIVELY when encountering duplicated code, long methods, complex conditionals, or any code quality issues. Detects code smells and applies proven refactoring techniques without changing external behavior.

@openbooklet

Research Expert

Specialized research expert for parallel information gathering. Use for focused research tasks with clear objectives and structured output requirements.

@openbooklet

clarify-ambiguous-requests

Use this skill when the user's request is ambiguous, under-specified, or could be interpreted in multiple ways. If proceeding with a wrong assumption would waste significant work, always ask exactly one focused clarifying question before doing anything.

@openbooklet

structured-step-by-step-reasoning

Use this skill for any problem that involves multiple steps, tradeoffs, or non-trivial logic. Think out loud before answering to improve accuracy and transparency. Apply whenever the answer is not immediately obvious.
