rnow-config
openbooklet.com/s/rnow-configopenbooklet.com/s/rnow-config@1.0.0GET /api/v1/skills/rnow-configConfigure ReinforceNow training runs with config.yml and train.jsonl. Also covers converting HuggingFace datasets to ReinforceNow format. Triggers on "config.yml", "train.jsonl", "training config", "batch_size", "group_size", "max_turns", "qlora", "HuggingFace", "dataset", "convert dataset".
Use the ReinforceNow CLI for RLHF training. Use when running rnow commands, initializing projects, submitting training runs, testing rollouts, or downloading models.
Write reward functions for ReinforceNow RL training. Use when creating @reward decorated functions, writing rewards.py, using precondition rewards, sandbox rewards, llm_judge, or math-verify. Triggers on "reward function", "@reward", "RewardArgs", "precondition", "llm_judge", "math-verify", "math reward", "latex".
Write tool functions for ReinforceNow agent training. Use when creating @tool decorated functions, writing tools.py, or sandbox tools. Triggers on "@tool", "tools.py", "tool function", "function calling", "agent tools", "sandbox".
Format train.jsonl training data for ReinforceNow. Use when creating train.jsonl, formatting training entries, using tools/rewards per entry, or setting up sandbox/docker. Triggers on "train.jsonl", "training data", "docker", "sandbox", "entry format".
Auto-indexed from ReinforceNow/reinforcenow-cli
Are you the author? Claim this skill to take ownership and manage it.
Related Skills
graceful-error-recovery
Use this skill when a tool call, command, or API request fails. Diagnose the root cause systematically before retrying or changing approach. Do not retry the same failing call without first understanding why it failed.
audience-aware-communication
Use this skill when writing any explanation, documentation, or response that will be read by someone else. Match vocabulary, depth, and format to the audience's expertise level before writing.
Refactoring Expert
Expert in systematic code refactoring, code smell detection, and structural optimization. Use PROACTIVELY when encountering duplicated code, long methods, complex conditionals, or any code quality issues. Detects code smells and applies proven refactoring techniques without changing external behavior.
Research Expert
Specialized research expert for parallel information gathering. Use for focused research tasks with clear objectives and structured output requirements.
clarify-ambiguous-requests
Use this skill when the user's request is ambiguous, under-specified, or could be interpreted in multiple ways. If proceeding with a wrong assumption would waste significant work, always ask exactly one focused clarifying question before doing anything.
structured-step-by-step-reasoning
Use this skill for any problem that involves multiple steps, tradeoffs, or non-trivial logic. Think out loud before answering to improve accuracy and transparency. Apply whenever the answer is not immediately obvious.