cuda-kernels
Provides guidance for writing and benchmarking optimized CUDA kernels for NVIDIA GPUs (H100, A100, T4) targeting HuggingFace diffusers and transformers libraries. Supports models like LTX-Video, Stable Diffusion, LLaMA, Mistral, and Qwen. Integrates with the HuggingFace Kernels Hub (get_kernel) for loading pre-compiled kernels, and includes benchmarking scripts to compare kernel performance against baseline implementations.
algorand-ecosystem
Catalog of major projects, protocols, and tools in the Algorand ecosystem. Use when the user asks about Algorand ecosystem projects, DeFi protocols (Folks Finance, Tinyman, Pact, Haystack, Vestige, AlphaArcade), wallets (Pera, Lute, Defly), bridges and cross-chain swaps (XO Swap, SimpleSwap, Allbridge, Wormhole NTT), blockchain explorers and dashboards (Allo, Algo Surf, Lora, Pera Explorer, Nodely, DeFi Llama), NFT marketplaces and tools (Downbad, Rand Gallery, Wen Tools, Minthol, NFDomains, GoPlausible), impact projects (AID Tech, HesabPay, Wholechain), or real world assets / RWA (Meld Gold, Lofty). Also use when the user wants to find relevant integrations, community projects, SDKs, or APIs available in the Algorand ecosystem.
lazyllm-skill
LazyLLM framework for building multi-agent AI applications. Use when the task mentions LazyLLM or an AI program for: (1) Flow orchestration - linear, branching, parallel, and loop workflows for complex data pipelines, (2) Model fine-tuning and acceleration - fine-tuning LLMs with LLaMA-Factory/Alpaca-LoRA/Collie and acceleration with vLLM/LMDeploy/LightLLM, (3) RAG systems - knowledge-based QA with document retrieval, vectorization, and generation, (4) Agent development - single/multi-agent systems with tools, memory, planning, and web interfaces. Includes comprehensive code examples for all components.
zettelforge
ZettelForge v2.0.0: Production CTI agentic memory system. Hybrid TypeDB (STIX 2.1 ontology) + LanceDB (vector search). Zero external AI dependencies: fastembed for embeddings, llama-cpp-python for LLM. 75% accuracy on CTI queries, 18% on LOCOMO. Use when agents need persistent memory, threat intel retrieval, entity extraction, graph traversal, or RAG synthesis.
prompt-engineer
Transform rough prompts/ideas into production-ready LLM prompts. Use when crafting, refining, or optimizing prompts for any AI model (Claude, GPT, Llama, etc.) with advanced techniques like CoT, constitutional AI, RAG optimization.
Classify files according to specific rules
Invoke this skill BEFORE implementing any text/document classification task to learn the correct llama_cloud_services API usage. Required reading before writing classification code. Requires the llama_cloud_services package and the LLAMA_CLOUD_API_KEY environment variable.
llama_cpp_canister-release
Create a new GitHub release for llama_cpp_canister