Claude KVM
MCP server — control remote desktops via VNC with a native Swift daemon and Apple Vision OCR
Agent-S - Autonomous GUI Agent
Agent-S is a powerful autonomous agent that can control your computer's graphical interface to complete complex tasks. It combines vision and action understanding to interact with any GUI element.
Piranha Vision & DevTools
BJJ video analysis — YOLO pose detection, AI technique analysis, and highlight reels.
Kaito Query Service
AI LLM with Gemini, MiniMax, Replicate, OpenRouter. Vision, search, code review. USDC on Base.
Photographi Mcp
Visual Intelligence Command Center: A Local Computer Vision Engine for Photo Libraries
Io.Github.PyJudge/Pdf4vllm
PDF reader for vision LLMs. Auto-detects text corruption and switches to image mode.
XRay-Vision
AI-powered codebase analysis — call graphs, security, dead code, complexity. 150+ tools.
Io.Github.AdonaiVera/Fiftyone Mcp Server
Control FiftyOne computer vision datasets through AI assistants using 80+ operators.
AI Analysis Guide
AI/ML is the technology for extracting value from data. This skill systematically covers all aspects of AI analysis, from machine learning fundamentals, deep learning, natural language processing, and computer vision to practical model development workflows.
Image Recongnition Mcp
MCP server for AI-powered image recognition and description using OpenAI vision models.
aesthetic
Create aesthetically beautiful interfaces following proven design principles. Use when building UI/UX, analyzing designs from inspiration sites, generating design images, implementing visual hierarchy and color theory, adding micro-interactions, or creating design documentation. Integrate localized specialized skills (chrome-devtools, ImageMagick) with native vision intelligence to achieve premium aesthetic standards.
Mobile Device Mcp
AI control of Android/iOS devices. Screenshots, UI tree, AI vision, Flutter, video. 49 tools.
Mulmocast Vision
Easy and stylish presentation slide generator.
citation-check-skill
Vision-enabled verification gate with web search. Use when users want to (1) verify slides/reports/PDFs/images against authoritative online sources, (2) validate that citations actually exist and say what's claimed, (3) check charts/graphs/tables for accuracy, or (4) audit AI-generated content in doc-only mode (no external knowledge). Two modes: search mode validates against the web; doc-only mode ensures everything traces to the provided documents. Supports content in any language.
Nature Vision Mcp
Identifies biological species, returning Latin names with confidence scores.
Image Recognition Mcp
MCP server for AI-powered image recognition and description using OpenAI vision models.
architect
Designs system architecture and selects technology stack based on vision analysis. Use after vision analysis for technical decisions. Triggers on: design architecture, select tech stack, choose framework.
Snapgrab
URL to screenshot with metadata. Python MCP server. Claude Vision optimized.