audio

Skills tagged with #audio

@prism-php

developing-with-prism

Guide for developing with the Prism PHP package, a Laravel package for integrating LLMs. Activate or use when working with Prism features including text generation, structured output, embeddings, image generation, audio processing, streaming, tools/function calling, or any LLM provider integration (OpenAI, Anthropic, Gemini, Mistral, Groq, XAI, DeepSeek, OpenRouter, Ollama, VoyageAI, ElevenLabs). Activate for any Prism-related development tasks.

prism-php/prism
18d ago
2.3K 0
@XeldarAlz

audio

Unity audio system — AudioMixer groups, snapshots, spatial audio, audio source pooling, compression per platform.

XeldarAlz/everything-claude-unity+3 more
15d ago
50
@poco-ai

MiniMax Multi-Modal Toolkit

Generate voice, music, video, and image content via MiniMax APIs — the unified entry for **MiniMax multimodal** use cases (audio + music + video + image). Includes voice cloning & voice design for custom voices, image generation with character reference, and FFmpeg-based media tools for audio/vide

poco-ai/poco-claw
19d ago
1.3K 0
@mcp-registry
MCP

SlideMaster

Create AI presentation videos with slides, narration, TTS audio, and MP4 export from any topic.

mcp github ai
19d ago
0
@jwulff
MCP

Whisper MCP

Local audio transcription using whisper.cpp. Transcribe with OpenAI Whisper models.

mcp github ai
jwulff/whisper-mcp
19d ago
0
@jacob-bd

nlm-cli-skill

Expert guide for the NotebookLM CLI (`nlm`) - a command-line interface for Google NotebookLM. Use this skill when users want to interact with NotebookLM programmatically, including: creating/managing notebooks, adding sources (URLs, YouTube, text, Google Drive), generating content (podcasts, reports, quizzes, flashcards, mind maps, slides, infographics, videos, data tables), conducting research, chatting with sources, or automating NotebookLM workflows. Triggers on mentions of "nlm", "notebooklm", "notebook lm", "podcast generation", "audio overview", or any NotebookLM-related automation task.

jacob-bd/notebooklm-cli
19d ago
1460
@Noizefield

audio-plugin-coder

Noizefield/audio-plugin-coder+7 more
18d ago
2160
@mcp-registry
MCP

MixMake

Transcript-based audio editing: transcribe audio, edit by word ID, export edited audio.

mcp github
19d ago
0
@DoIT-Artificial-Intelligence

youtube-to-docs

(Kitchen Sink) Process a YouTube video with all features (summary, Q&A, infographic, audio, and video).

DoIT-Artificial-Intelligence/youtube-to-docs
18d ago
340
@ChanMeng666

audio-hooks

Use whenever the user asks to install, configure, snooze, mute, test, troubleshoot, or change settings for the claude-code-audio-hooks audio notification system. Trigger phrases include "audio hooks", "audio notifications", "snooze audio", "mute claude", "claude is too loud", "test audio", "switch audio theme", "rate limit alerts", "audio webhook", "TTS", "focus flow", and the slash command /audio-hooks. Also use when diagnosing why Claude Code is silent (or noisy) for the user.

ChanMeng666/claude-code-audio-hooks
18d ago
410
@xvirobotics

Doubao TTS: Doubao Speech Synthesis

Generate high-quality speech audio from text using Volcengine's Doubao TTS API. Supports short-form (real-time) and long-form (async, up to 100K characters) synthesis.

xvirobotics/metabot+5 more
18d ago
4580
@second-state

Qwen3 ASR — Voice Transcription

Transcribe speech from audio files to text.

second-state/qwen3_asr_rs
18d ago
1880
@gpu-bridge
MCP

Io.Github.Fjnunezp75/Gpu Bridge

30 GPU-powered AI services as MCP tools. LLM, image, video, audio, embeddings & more.

mcp github ai llm
gpu-bridge/mcp-server
19d ago
0
@MRCalderon3D

2d-animation-pipeline

Define authoring, import, and state machine rules for frame-by-frame and skeletal 2D animations.

MRCalderon3D/everything-game-dev-code+42 more
18d ago
110
@msk3d0ut

higgsfield-ugc-prompt

Generate complete, detailed Higgsfield AI Marketing Studio UGC video prompts for product advertising. Use when the user wants to create a UGC video ad prompt for Higgsfield, mentions Higgsfield, wants a marketing video prompt, or provides product/shop reference images and asks for a video prompt. Generates second-by-second prompts with full audio, camera, outfit, and character descriptions in English with Turkish dialogue.

msk3d0ut/claude-skill-ugc-prompt
9d ago
50
@Orange-Sky-Software-Inc
MCP

Io.Github.Matthew B Simpson/Echosaw

Media intelligence analysis for audio, video, and images via the Echosaw MCP server.

mcp github
Orange-Sky-Software-Inc/echosaw-com+1 more
19d ago
0
@ace-step

acestep

Use ACE-Step API to generate music, edit songs, and remix music. Supports text-to-music, lyrics generation, audio continuation, and audio repainting. Use this skill when users mention generating music, creating songs, music production, remix, or audio continuation.

ace-step/ACE-Step-1.5+5 more
19d ago
7.9K 0
@nexu-io

Audio Jingle Skill

Three sub-modes. The active project's `audioKind` decides which one runs:

nexu-io/open-design+19 more
8d ago
8.5K 0
@muinyc

Audio Transcription with Whisper

Transcribe audio files locally using faster-whisper (CPU, int8 quantization). Supports all common audio formats (wav, mp3, m4a, flac, ogg, webm).

muinyc/istota+21 more
14d ago
50
@iPlug3

check-codesign

Check macOS code signature, hardened runtime, entitlements, and notarization of audio plugin bundles (.vst3, .component, .clap, .app/.appex). Use when user says "check code signing", "check codesign", "check signature", "verify signing", "check notarization", "why won't plugin load", "hardened runtime", "check entitlements", or a plugin fails to load in a signed DAW.

iPlug3/audio-plugin-dev-skills+5 more
19d ago
470
@Agents365-ai

video-podcast-maker

Use when the user provides a topic and wants an automated video podcast created. Handles research, script writing, TTS audio synthesis, Remotion video creation, and final MP4 output with background music.

Agents365-ai/video-podcast-maker
19d ago
1750
@cinience

alicloud-ai-audio-asr

Transcribe non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.

cinience/alicloud-skills+61 more
19d ago
3530
@ljagiello

ctf-forensics

Provides digital forensics and signal analysis techniques for CTF challenges. Use when analyzing disk images, memory dumps, event logs, network captures, cryptocurrency transactions, steganography, PDF analysis, Windows registry, Volatility, PCAP, Docker images, coredumps, side-channel power traces, DTMF audio spectrograms, packet timing analysis, CD audio disc images, or recovering deleted files and credentials.

ljagiello/ctf-skills+7 more
18d ago
630
@itsfabioroma

transcribee

Transcribe YouTube videos and local audio/video files with speaker diarization. Use when user asks to transcribe a YouTube URL, podcast, video, or audio file. Outputs clean speaker-labeled transcripts ready for LLM analysis.

itsfabioroma/transcribee
19d ago
1750
@mightyhuman101

seedance-prompt-en

Write effective prompts for Jimeng Seedance 2.0 multimodal AI video generation. Use when users want to create video prompts using text, images, videos, and audio inputs with the @ reference system. Covers camera movements, effects replication, video extension, editing, music beat-matching, e-commerce ads, short dramas, and educational content.

mightyhuman101/seedance2-skill
19d ago
20
@302ai

302ai-api-integration

ALWAYS use this skill when user needs ANY API functionality (AI models, image generation, video, audio, text processing, etc.). Automatically search 302.AI's 1400+ APIs and generate integration code. Use proactively whenever APIs or AI capabilities are mentioned.

302ai/302AI-API-Integration-Skill
19d ago
60
@jacob-bd

nlm-skill

Expert guide for the NotebookLM CLI (`nlm`) and MCP server - interfaces for Google NotebookLM. Use this skill when users want to interact with NotebookLM programmatically, including: creating/managing notebooks, adding sources (URLs, YouTube, text, Google Drive), generating content (podcasts, reports, quizzes, flashcards, mind maps, slides, infographics, videos, data tables), conducting research, chatting with sources, or automating NotebookLM workflows. Triggers on mentions of "nlm", "notebooklm", "notebook lm", "podcast generation", "audio overview", or any NotebookLM-related automation task.

jacob-bd/notebooklm-mcp-cli
19d ago
2.5K 0
@tenequm

audio-quality-check

Analyze audio recording quality - echo detection, loudness, speech intelligibility, SNR, spectral analysis. Use when the user wants to check a recording's quality, detect echo or duplication in audio files, measure speech clarity, compare original vs processed audio, diagnose why a recording sounds bad, or analyze audio tracks from Blackbox or any call recording app. Triggers on audio quality, recording analysis, echo detection, check recording, sound quality, analyze audio, speech quality, PESQ, STOI, loudness, SNR, audio diagnostics, recording sounds bad, echo in recording, audio duplication.

tenequm/skills+25 more
9d ago
180
@mcp-registry
MCP

Audioscrape Audio Intelligence

The audio intelligence layer. Search podcast transcripts, speakers, and entities across 250K+ shows.

mcp search
19d ago
0
@BrightWayAI
MCP

Io.Github.BrightWayAI/Video Analyzer

Analyze videos: extract frames, transcribe audio, generate storyboard breakdowns.

mcp github ai
BrightWayAI/video-analyzer
19d ago
0
@roomi-fields
MCP

NotebookLM MCP

Automate Google NotebookLM: Q&A with citations, audio, video, and content generation.

mcp github
roomi-fields/notebooklm-mcp
19d ago
0
@jwulff
MCP

Apple Voice Memo MCP

Access Apple Voice Memos on macOS. List, get audio, extract and generate transcripts.

mcp github
jwulff/apple-voice-memo-mcp
19d ago
0
@Emily2040

seedance-20

Generate and direct cinematic AI videos with Seedance 2.0 (ByteDance/Dreamina/Jimeng). Covers text-to-video, image-to-video, video-to-video, and reference-to-video workflows with @Tag asset references, multi-character scenes, audio design, and post-processing. Use when making AI video, writing Seedance prompts, directing a scene, fixing generation errors, or building an AI short film, product ad, or music video.

ai-video filmmaking bytedance seedance multimodal lip-sync
Emily2040/seedance-2.0+23 more
19d ago
4360
@barefootford

analyze-video

Adds visual descriptions to transcripts by extracting and analyzing video frames with ffmpeg. Creates visual transcript with periodic visual descriptions of the video clip. Use when all files have audio transcripts present (transcript) but don't yet have visual transcripts created (visual_transcript).

barefootford/buttercut+3 more
19d ago
1810
@proyecto26

NotebookLM AI Plugin

Supports: - Chat with Notebook AI (source-grounded Q&A with citations) - Slide Deck generation (PDF/PPTX) - Audio Overview (M4A -- deep dive, brief, critique, debate formats) - Video Overview (MP4 -- classic, whiteboard, kawaii, anime, watercolor styles) - Mind Map (HTML) - Flashcards (HTML/JSON) -

proyecto26/notebooklm-ai-plugin
18d ago
120
@rsmdt
MCP

Multimodal

Multi-provider media generation: images, video, audio, and transcription via a unified interface.

mcp github
rsmdt/multimodal-mcp
19d ago
0