Vision AI can analyze screenshots of your web application to detect visual issues automatically. Send a screenshot URL to the API with a tailored prompt, and receive structured feedback about layout problems, blank areas, text readability, and more.
Analyze this screenshot. Verify the page layout: 1) Is the header present with navigation? 2) Is the main content area properly centered? 3) Are sidebars in the correct position? 4) Is the footer visible at the bottom? List any issues found.
Check this screenshot for broken or blank elements: 1) Are there any empty white rectangles where content should be? 2) Any broken image icons? 3) Missing text or placeholder text still visible? 4) Any areas that look incomplete? Report each issue found.
Evaluate text readability in this screenshot: 1) Is the text legible at normal reading distance? 2) Is there sufficient contrast between text and background? 3) Are fonts consistent across the page? 4) Is text properly aligned? 5) Any text overlap or truncation? Score 1-10 and list issues.
Analyze this mobile screenshot for responsive design: 1) Does the layout adapt to narrow width? 2) Is text readable without horizontal scroll? 3) Are buttons and links large enough to tap? 4) Are images properly sized? 5) Any content cutoff or overflow?
Check this screenshot for error states: 1) Any visible error messages or stack traces? 2) Are there 404 or 500 error indicators? 3) Any JavaScript error overlays? 4) Broken modal or dialog states? 5) Form validation errors that shouldn't appear? List anything suspicious.
Add visual QA to your CI pipeline. Capture a screenshot during your test suite, then send it to Vision AI for analysis.
#!/bin/bash
SCREENSHOT_URL=$1
PROMPT="Analyze this screenshot for visual issues. Reply PASS if no issues found, or list specific issues if any are detected."
RESULT=$(curl -s -X POST https://vision.kim8.s4s.host/api/analyze \
-H "Content-Type: application/json" \
-d "{
\"model\": \"google/gemma-3-27b-it\",
\"prompt\": \"$PROMPT\",
\"image\": \"$SCREENSHOT_URL\",
\"temperature\": 0.2,
\"max_tokens\": 500
}")
echo "$RESULT"
SUCCESS=$(echo "$RESULT" | jq -r '.success')
if [ "$SUCCESS" != "true" ]; then
echo "Vision AI check failed"
exit 1
fi
CONTENT=$(echo "$RESULT" | jq -r '.data.content')
if echo "$CONTENT" | grep -qi "PASS"; then
echo "Visual QA: PASSED"
exit 0
else
echo "Visual QA: ISSUES FOUND"
echo "$CONTENT"
exit 1
fi
name: Visual QA
on: [push, pull_request]
jobs:
visual-qa:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Take screenshot
run: |
# Use your preferred screenshot tool (Puppeteer, Playwright, etc.)
npx playwright screenshot --url=http://localhost:3000 screenshot.png
- name: Upload screenshot
run: |
# Upload to a public URL, then analyze
SCREENSHOT_URL=$(curl -s -F "file=@screenshot.png" https://your-upload.service/upload | jq -r '.url')
RESULT=$(curl -s -X POST https://vision.kim8.s4s.host/api/analyze \
-H "Content-Type: application/json" \
-d "{\"model\":\"google/gemma-3-27b-it\",\"prompt\":\"Check for visual issues. Reply PASS or list issues.\",\"image\":\"$SCREENSHOT_URL\",\"temperature\":0.2,\"max_tokens\":500}")
echo "$RESULT"
SUCCESS=$(echo "$RESULT" | jq -r '.success')
if [ "$SUCCESS" != "true" ]; then exit 1; fi
if echo "$RESULT" | jq -r '.data.content' | grep -qi "PASS"; then
echo "Visual QA passed"
else
echo "Visual issues found"
exit 1
fi
Parse the API response in your scripts to take automated actions.
# Extract just the analysis content
curl -s -X POST https://vision.kim8.s4s.host/api/analyze \
-H "Content-Type: application/json" \
-d '{"model":"google/gemma-3-27b-it","prompt":"Describe","image":"https://example.com/img.png"}' \
| jq -r '.data.content'
# Check response time
curl -s ... | jq -r '.data.time_ms'
# Get token usage
curl -s ... | jq '.data.usage_prompt, .data.usage_completion'
$result = json_decode($response, true);
if ($result['success']) {
$content = $result['data']['content'];
$tokens = $result['data']['usage_prompt'] + $result['data']['usage_completion'];
$time = $result['data']['time_ms'];
echo "Content: $content\n";
echo "Tokens: $tokens, Time: ${time}ms\n";
}
temperature: 0.1-0.2 for more consistent, deterministic QA resultsgoogle/gemma-3-27b-it for highest quality analysisgoogle/gemma-3-4b-it for faster feedback in CI pipelines