Analyze a Screenshot in 3 Steps

Get started with Vision AI in under a minute

Step 1: Prepare Your Image

You need an image URL (publicly accessible) or a base64-encoded image.

The API accepts PNG, JPEG, WebP, and GIF formats.
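If the screenshot isn't hosted anywhere public, you can base64-encode a local file instead. A minimal shell sketch (a throwaway file stands in for a real screenshot so the snippet runs as-is; `-w0` is GNU coreutils, so on macOS use `base64 -i screenshot.png`):

```shell
# Stand-in for a real screenshot, so the snippet runs as-is
printf 'fake image bytes' > screenshot.png

# Encode as a single-line base64 string (-w0 disables line wrapping)
IMAGE_B64=$(base64 -w0 screenshot.png)
echo "$IMAGE_B64"
```

The resulting string goes in the request's image field in Step 2.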

Step 2: Send Your Request

Send a POST request to the analyze endpoint with your image and a prompt.

curl -X POST https://vision.kim8.s4s.host/api/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-3-27b-it",
    "prompt": "Describe what you see in this image.",
    "image": "https://example.com/screenshot.png",
    "temperature": 0.3,
    "max_tokens": 500
  }'

You can also use the web playground to upload and analyze without code.
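If you base64-encoded the image in Step 1, the request body stays the same except that the encoded string replaces the URL. A sketch only: whether the API wants raw base64 or a data: URI prefix is an assumption here, so check the API reference for the exact format.

```json
{
  "model": "google/gemma-3-27b-it",
  "prompt": "Describe what you see in this image.",
  "image": "<BASE64_STRING>"
}
```

Here `<BASE64_STRING>` is a placeholder for the output of your encoding step, not a literal value.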

Step 3: Read the Result

The API returns JSON with the analysis content, model used, token usage, and timing:

{
  "success": true,
  "message": "Analysis complete",
  "data": {
    "content": "The image shows a dashboard with...",
    "model": "google/gemma-3-27b-it",
    "usage_prompt": 128,
    "usage_completion": 85,
    "time_ms": 3421.5
  }
}

Extract the content: curl ... | jq -r '.data.content'
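The other fields are useful too. A short sketch that saves the sample response shown above to a file and pulls out individual fields with jq:

```shell
# Sample response from above, saved for parsing
cat > response.json <<'EOF'
{
  "success": true,
  "data": {
    "content": "The image shows a dashboard with...",
    "model": "google/gemma-3-27b-it",
    "usage_prompt": 128,
    "usage_completion": 85,
    "time_ms": 3421.5
  }
}
EOF

jq -r '.data.content' response.json                             # the analysis text
jq '.data.usage_prompt + .data.usage_completion' response.json  # total tokens: 213
jq '.data.time_ms / 1000' response.json                         # latency in seconds
```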

You're done!

You've analyzed your first screenshot. Now try multi-model comparison, OCR, or the screenshot QA workflow.

Next Steps

Multi-Model: Send to 3 models at once by using "models": [...] instead of "model" (see Examples)
OCR: Extract text from images with a specific OCR prompt (see QA Guide)
CI/CD: Automate visual QA in your pipeline (see CI Integration)
API Gateway: Use the unified gateway, no auth needed (see Gateway Docs)
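For example, the multi-model comparison only changes the request body. A sketch based on the note above; the shape of a multi-model response isn't shown in this guide, so inspect it before scripting against it:

```json
{
  "models": ["google/gemma-3-27b-it", "google/gemma-3-12b-it", "google/gemma-3-4b-it"],
  "prompt": "Describe what you see in this image.",
  "image": "https://example.com/screenshot.png"
}
```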

Choose Your Model

google/gemma-3-27b-it: Best quality, best OCR
google/gemma-3-12b-it: Balanced speed/quality
google/gemma-3-4b-it: Fastest responses
mistralai/Mistral-Small-3.2-24B-Instruct-2506: Latest Mistral vision
mistralai/Magistral-Small-2509: Newest Magistral

Quick Tips