Image Recognition

    An MCP server that provides image recognition 👀 capabilities using Anthropic and OpenAI vision APIs

    GitHub Stats

    • Stars: 18
    • Forks: 7
    • Release Date: 4/10/2025

    Detailed Description

    MCP Image Recognition Server

    An MCP server that provides image recognition capabilities using Anthropic and OpenAI vision APIs. Version 0.1.2.

    Features

    • Image description using Anthropic Claude Vision or OpenAI GPT-4 Vision
    • Support for multiple image formats (JPEG, PNG, GIF, WebP)
    • Configurable primary and fallback providers
    • Base64 and file-based image input support
    • Optional text extraction using Tesseract OCR
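
    The primary/fallback provider feature can be pictured with the following minimal sketch. It is an illustration of the general pattern only, not the server's actual code: the Provider type and the describe_with_fallback function are hypothetical names introduced here.

```python
from typing import Callable, Optional

# Hypothetical provider type (an assumption for this sketch):
# takes raw image bytes and returns a text description.
Provider = Callable[[bytes], str]


def describe_with_fallback(
    image: bytes,
    primary: Provider,
    fallback: Optional[Provider] = None,
) -> str:
    """Try the primary provider; if it raises, try the fallback when one is set."""
    try:
        return primary(image)
    except Exception as primary_error:
        if fallback is None:
            # No fallback configured: re-raise the original error unchanged.
            raise
        try:
            return fallback(image)
        except Exception as fallback_error:
            # Report both failures so neither is silently lost.
            raise RuntimeError(
                f"primary failed ({primary_error}); "
                f"fallback failed ({fallback_error})"
            ) from fallback_error
```

    With this shape, a missing fallback leaves the primary provider's error untouched, which keeps single-provider setups easy to debug.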

    Requirements

    • Python 3.8 or higher
    • Tesseract OCR (optional): needed only for the text extraction feature
      • Windows: Download and install from UB-Mannheim/tesseract
      • Linux: sudo apt-get install tesseract-ocr
      • macOS: brew install tesseract

    Installation

    1. Clone the repository:
    git clone https://github.com/mario-andreschak/mcp-image-recognition.git
    cd mcp-image-recognition

    2. Create and configure your environment file:
    cp .env.example .env
    # Edit .env with your API keys and preferences

    3. Build the project:
    build.bat
    

    Usage

    Running the Server

    Start the server with Python:

    python -m image_recognition_server.server
    

    Alternatively, start the server with the batch script:

    run.bat server
    

    Start the server in development mode with the MCP Inspector:

    run.bat debug
    

    Available Tools

    1. describe_image

      • Input: Base64-encoded image data and MIME type
      • Output: Detailed description of the image
    2. describe_image_from_file

      • Input: Path to an image file
      • Output: Detailed description of the image
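
    As a rough sketch of preparing input for describe_image, the helper below reads a file, Base64-encodes it, and picks a MIME type from the file extension. The argument names "image" and "mime_type" and the helper build_describe_image_args are assumptions made for illustration; check the server's tool schema for the real parameter names.

```python
import base64
from pathlib import Path

# Extensions for the formats listed under Features (JPEG, PNG, GIF, WebP).
MIME_BY_EXTENSION = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def build_describe_image_args(path: str) -> dict:
    """Return Base64 image data plus MIME type for a local image file."""
    file_path = Path(path)
    mime_type = MIME_BY_EXTENSION.get(file_path.suffix.lower())
    if mime_type is None:
        raise ValueError(f"unsupported image extension: {file_path.suffix!r}")
    encoded = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return {
        "image": encoded,        # assumed argument name
        "mime_type": mime_type,  # assumed argument name
    }
```

    When the image already sits on the machine running the server, describe_image_from_file avoids this encoding step because it takes a path directly.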

    Environment Configuration

    • ANTHROPIC_API_KEY: Your Anthropic API key.
    • OPENAI_API_KEY: Your OpenAI API key.
    • VISION_PROVIDER: Primary vision provider (anthropic or openai).
    • FALLBACK_PROVIDER: Optional fallback provider.
    • LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR).
    • ENABLE_OCR: Enable Tesseract OCR text extraction (true or false).
    • TESSERACT_CMD: Optional custom path to Tesseract executable.
    • OPENAI_MODEL: OpenAI model name (default: gpt-4o-mini). When using OpenRouter, use its model format (e.g., anthropic/claude-3.5-sonnet:beta).
    • OPENAI_BASE_URL: Optional custom base URL for the OpenAI API. Set to https://openrouter.ai/api/v1 for OpenRouter.
    • OPENAI_TIMEOUT: Optional custom timeout (in seconds) for the OpenAI API.
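
    To show how these variables fit together, here is a minimal sketch that reads and validates them from a mapping such as os.environ. It is not the server's own loader: the Settings class and load_settings function are hypothetical, and defaulting VISION_PROVIDER to anthropic is an assumption (only the gpt-4o-mini default is stated above).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_PROVIDERS = ("anthropic", "openai")


@dataclass
class Settings:
    vision_provider: str
    fallback_provider: Optional[str]
    enable_ocr: bool
    openai_model: str
    openai_base_url: Optional[str]
    openai_timeout: Optional[float]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables and fail early on inconsistent values."""
    # Assumption: "anthropic" as the default primary provider.
    provider = env.get("VISION_PROVIDER", "anthropic").lower()
    fallback = env.get("FALLBACK_PROVIDER", "").lower() or None

    for name in (provider, fallback):
        if name is None:
            continue
        if name not in VALID_PROVIDERS:
            raise ValueError(f"unknown provider: {name!r}")
        # Each configured provider needs its own API key.
        key_name = f"{name.upper()}_API_KEY"
        if not env.get(key_name):
            raise ValueError(f"{key_name} is required for the {name} provider")

    timeout = env.get("OPENAI_TIMEOUT")
    return Settings(
        vision_provider=provider,
        fallback_provider=fallback,
        enable_ocr=env.get("ENABLE_OCR", "false").lower() == "true",
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        openai_base_url=env.get("OPENAI_BASE_URL") or None,
        openai_timeout=float(timeout) if timeout else None,
    )
```

    Validating at startup surfaces a missing API key immediately instead of at the first image request.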

    Using OpenRouter

    OpenRouter allows you to access various models using the OpenAI API format. To use OpenRouter, follow these steps:

    1. Obtain an API key from OpenRouter.
    2. Set OPENAI_API_KEY in your .env file to your OpenRouter API key.
    3. Set OPENAI_BASE_URL to https://openrouter.ai/api/v1.
    4. Set OPENAI_MODEL to the desired model using the OpenRouter format (e.g., anthropic/claude-3.5-sonnet:beta).
    5. Set VISION_PROVIDER to openai.
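
    The steps above can be sketched as a small helper that produces the matching .env entries. The function names openrouter_env and to_dotenv, and the placeholder key YOUR_OPENROUTER_KEY, are hypothetical and exist only for this illustration.

```python
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def openrouter_env(
    api_key: str,
    model: str = "anthropic/claude-3.5-sonnet:beta",
) -> dict:
    """Return the four settings from the steps above as a dictionary."""
    if "/" not in model:
        # The example model name above follows a vendor/model pattern.
        raise ValueError("expected an OpenRouter-style model name (vendor/model)")
    return {
        "OPENAI_API_KEY": api_key,               # step 2
        "OPENAI_BASE_URL": OPENROUTER_BASE_URL,  # step 3
        "OPENAI_MODEL": model,                   # step 4
        "VISION_PROVIDER": "openai",             # step 5
    }


def to_dotenv(values: dict) -> str:
    """Render KEY=value lines suitable for pasting into a .env file."""
    return "\n".join(f"{key}={value}" for key, value in values.items()) + "\n"


if __name__ == "__main__":
    print(to_dotenv(openrouter_env("YOUR_OPENROUTER_KEY")), end="")
```

    Running the script prints four lines that can be copied into .env after replacing the placeholder with a real key.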

    Default Models

    • Anthropic: claude-3.5-sonnet-beta
    • OpenAI: gpt-4o-mini
    • OpenRouter: Specify the model in OPENAI_MODEL using the OpenRouter format (e.g., anthropic/claude-3.5-sonnet:beta).

    Development

    Running Tests

    Run all tests:

    run.bat test
    

    Run a specific test suite:

    run.bat test server
    run.bat test anthropic
    run.bat test openai
    

    Docker Support

    Build the Docker image:

    docker build -t mcp-image-recognition .
    

    Run the container:

    docker run -it --env-file .env mcp-image-recognition
    

    License

    MIT License - see LICENSE file for details.

    Release History

    • 0.1.2 (2025-02-20): Improved OCR error handling and added comprehensive test coverage for OCR functionality
    • 0.1.1 (2025-02-19): Added Tesseract OCR support for text extraction from images (optional feature)
    • 0.1.0 (2025-02-19): Initial release with Anthropic and OpenAI vision support
