- Change default text-LLM from llama3 (not installed) to gemma3:12b
- Increase LLM timeout from 120s to 300s (large models need longer)
- Add explicit multi-column layout instruction to vision prompt to
prevent skipping columns on dense CD back-cover tracklists
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Tesseract OCR fails on rotated/low-contrast CD back covers.
New vision_llm module sends images directly to qwen3-vl via
Ollama chat API, bypassing OCR entirely. Robust JSON extraction
handles thinking tags, markdown blocks, and empty responses.
CLI scan/process commands gain --vision flag.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>