developing-with-prism
Guide for developing with Prism PHP package - a Laravel package for integrating LLMs. Activate or use when working with Prism features including text generation, structured output, embeddings, image generation, audio processing, streaming, tools/function calling, or any LLM provider integration (OpenAI, Anthropic, Gemini, Mistral, Groq, XAI, DeepSeek, OpenRouter, Ollama, VoyageAI, ElevenLabs). Activate for any Prism-related development tasks.
audio
Unity audio system - AudioMixer groups, snapshots, spatial audio, audio source pooling, compression per platform.
MiniMax Multi-Modal Toolkit
Generate voice, music, video, and image content via MiniMax APIs - the unified entry for **MiniMax multimodal** use cases (audio + music + video + image). Includes voice cloning & voice design for custom voices, image generation with character reference, and FFmpeg-based media tools for audio/vide
SlideMaster
Create AI presentation videos with slides, narration, TTS audio, and MP4 export from any topic.
Whisper Mcp
Local audio transcription using whisper.cpp. Transcribe with OpenAI Whisper models.
nlm-cli-skill
Expert guide for the NotebookLM CLI (`nlm`) - a command-line interface for Google NotebookLM. Use this skill when users want to interact with NotebookLM programmatically, including: creating/managing notebooks, adding sources (URLs, YouTube, text, Google Drive), generating content (podcasts, reports, quizzes, flashcards, mind maps, slides, infographics, videos, data tables), conducting research, chatting with sources, or automating NotebookLM workflows. Triggers on mentions of "nlm", "notebooklm", "notebook lm", "podcast generation", "audio overview", or any NotebookLM-related automation task.
audio-plugin-coder
MixMake
Transcript-based audio editing: transcribe audio, edit by word ID, export edited audio.
youtube-to-docs
(Kitchen Sink) Process a YouTube video with all features (summary, Q&A, infographic, audio, and video).
audio-hooks
Use whenever the user asks to install, configure, snooze, mute, test, troubleshoot, or change settings for the claude-code-audio-hooks audio notification system. Trigger phrases include "audio hooks", "audio notifications", "snooze audio", "mute claude", "claude is too loud", "test audio", "switch audio theme", "rate limit alerts", "audio webhook", "TTS", "focus flow", and the slash command /audio-hooks. Also use when diagnosing why Claude Code is silent (or noisy) for the user.
Doubao TTS - Doubao Speech Synthesis
Generate high-quality speech audio from text using Volcengine's Doubao TTS API. Supports short-form (real-time) and long-form (async, up to 100K characters) synthesis.
Qwen3 ASR - Voice Transcription
Transcribe speech from audio files to text.
Io.Github.Fjnunezp75/Gpu Bridge
30 GPU-powered AI services as MCP tools. LLM, image, video, audio, embeddings & more.
2d-animation-pipeline
Define authoring, import, and state machine rules for frame-by-frame and skeletal 2D animations.
higgsfield-ugc-prompt
Generate complete, detailed Higgsfield AI Marketing Studio UGC video prompts for product advertising. Use when the user wants to create a UGC video ad prompt for Higgsfield, mentions Higgsfield, wants a marketing video prompt, or provides product/shop reference images and asks for a video prompt. Generates second-by-second prompts with full audio, camera, outfit, and character descriptions in English with Turkish dialogue.
Io.Github.Matthew B Simpson/Echosaw
Media intelligence analysis for audio, video, and images via the Echosaw MCP server.
acestep
Use ACE-Step API to generate music, edit songs, and remix music. Supports text-to-music, lyrics generation, audio continuation, and audio repainting. Use this skill when users mention generating music, creating songs, music production, remix, or audio continuation.
Audio Jingle Skill
Three sub-modes. The active project's `audioKind` decides which one runs:
Audio Transcription with Whisper
Transcribe audio files locally using faster-whisper (CPU, int8 quantization). Supports all common audio formats (wav, mp3, m4a, flac, ogg, webm).
check-codesign
Check macOS code signature, hardened runtime, entitlements, and notarization of audio plugin bundles (.vst3, .component, .clap, .app/.appex). Use when user says "check code signing", "check codesign", "check signature", "verify signing", "check notarization", "why won't plugin load", "hardened runtime", "check entitlements", or a plugin fails to load in a signed DAW.
video-podcast-maker
Use when the user provides a topic and wants an automated video podcast created. Handles research, script writing, TTS audio synthesis, Remotion video creation, and final MP4 output with background music.
alicloud-ai-audio-asr
Transcribe non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.
ctf-forensics
Provides digital forensics and signal analysis techniques for CTF challenges. Use when analyzing disk images, memory dumps, event logs, network captures, cryptocurrency transactions, steganography, PDF analysis, Windows registry, Volatility, PCAP, Docker images, coredumps, side-channel power traces, DTMF audio spectrograms, packet timing analysis, CD audio disc images, or recovering deleted files and credentials.
transcribee
Transcribe YouTube videos and local audio/video files with speaker diarization. Use when user asks to transcribe a YouTube URL, podcast, video, or audio file. Outputs clean speaker-labeled transcripts ready for LLM analysis.
seedance-prompt-en
Write effective prompts for Jimeng Seedance 2.0 multimodal AI video generation. Use when users want to create video prompts using text, images, videos, and audio inputs with the @ reference system. Covers camera movements, effects replication, video extension, editing, music beat-matching, e-commerce ads, short dramas, and educational content.
302ai-api-integration
ALWAYS use this skill when user needs ANY API functionality (AI models, image generation, video, audio, text processing, etc.). Automatically search 302.AI's 1400+ APIs and generate integration code. Use proactively whenever APIs or AI capabilities are mentioned.
nlm-skill
Expert guide for the NotebookLM CLI (`nlm`) and MCP server - interfaces for Google NotebookLM. Use this skill when users want to interact with NotebookLM programmatically, including: creating/managing notebooks, adding sources (URLs, YouTube, text, Google Drive), generating content (podcasts, reports, quizzes, flashcards, mind maps, slides, infographics, videos, data tables), conducting research, chatting with sources, or automating NotebookLM workflows. Triggers on mentions of "nlm", "notebooklm", "notebook lm", "podcast generation", "audio overview", or any NotebookLM-related automation task.
audio-quality-check
Analyze audio recording quality - echo detection, loudness, speech intelligibility, SNR, spectral analysis. Use when the user wants to check a recording's quality, detect echo or duplication in audio files, measure speech clarity, compare original vs processed audio, diagnose why a recording sounds bad, or analyze audio tracks from Blackbox or any call recording app. Triggers on audio quality, recording analysis, echo detection, check recording, sound quality, analyze audio, speech quality, PESQ, STOI, loudness, SNR, audio diagnostics, recording sounds bad, echo in recording, audio duplication.
Audioscrape Audio Intelligence
The audio intelligence layer. Search podcast transcripts, speakers, and entities across 250K+ shows.
Io.Github.BrightWayAI/Video Analyzer
Analyze videos: extract frames, transcribe audio, generate storyboard breakdowns.
NotebookLM MCP
Automate Google NotebookLM — Q&A with citations, audio, video, content generation
Apple Voice Memo Mcp
Access Apple Voice Memos on macOS. List, get audio, extract and generate transcripts.
seedance-20
Generate and direct cinematic AI videos with Seedance 2.0 (ByteDance/Dreamina/Jimeng). Covers text-to-video, image-to-video, video-to-video, and reference-to-video workflows with @Tag asset references, multi-character scenes, audio design, and post-processing. Use when making AI video, writing Seedance prompts, directing a scene, fixing generation errors, or building an AI short film, product ad, or music video.
analyze-video
Adds visual descriptions to transcripts by extracting and analyzing video frames with ffmpeg. Creates visual transcript with periodic visual descriptions of the video clip. Use when all files have audio transcripts present (transcript) but don't yet have visual transcripts created (visual_transcript).
NotebookLM AI Plugin
Supports: - Chat with Notebook AI (source-grounded Q&A with citations) - Slide Deck generation (PDF/PPTX) - Audio Overview (M4A -- deep dive, brief, critique, debate formats) - Video Overview (MP4 -- classic, whiteboard, kawaii, anime, watercolor styles) - Mind Map (HTML) - Flashcards (HTML/JSON) -
Multimodal
Multi-provider media generation — images, video, audio, and transcription via a unified interface