dschlueter/Musiksammlung

Fork 0

Commit graph

Author	SHA1	Message	Date
dschlueter	8ecade5cdc	Add --from-text mode and improve LLM parser robustness - Add --from-text/-t option to scan and process commands for pre-formatted tracklists (e.g. from Perplexity) - Refactor llm_parser to use Chat API instead of Generate API - Reuse _extract_json() from vision_llm for robust JSON extraction - Improve SYSTEM_PROMPT with strict rules (Various Artists, no invented years, no composer info in titles, /no_think) - Remove format:"json" constraint that caused empty responses Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 02:26:58 +01:00
dschlueter	1753ab204f	Add Vision-LLM mode for direct image-to-JSON extraction Tesseract OCR fails on rotated/low-contrast CD back covers. New vision_llm module sends images directly to qwen3-vl via Ollama chat API, bypassing OCR entirely. Robust JSON extraction handles thinking tags, markdown blocks, and empty responses. CLI scan/process commands gain --vision flag. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 01:35:05 +01:00
dschlueter	3e073250ca	Add project skeleton: CLI pipeline for CD digitization Modular Python package with Typer CLI (scan/apply/process commands), Pydantic data models, OCR via Tesseract, LLM-based tracklist parsing, mutagen audio tagging, M3U playlist generation, and cover processing. Includes 8 passing tests and ruff lint config. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 01:00:12 +01:00

Author

SHA1

Message

Date

dschlueter

8ecade5cdc

Add --from-text mode and improve LLM parser robustness

- Add --from-text/-t option to scan and process commands for
  pre-formatted tracklists (e.g. from Perplexity)
- Refactor llm_parser to use Chat API instead of Generate API
- Reuse _extract_json() from vision_llm for robust JSON extraction
- Improve SYSTEM_PROMPT with strict rules (Various Artists, no
  invented years, no composer info in titles, /no_think)
- Remove format:"json" constraint that caused empty responses

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-15 02:26:58 +01:00

dschlueter

1753ab204f

Add Vision-LLM mode for direct image-to-JSON extraction

Tesseract OCR fails on rotated/low-contrast CD back covers.
New vision_llm module sends images directly to qwen3-vl via
Ollama chat API, bypassing OCR entirely. Robust JSON extraction
handles thinking tags, markdown blocks, and empty responses.
CLI scan/process commands gain --vision flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-15 01:35:05 +01:00

dschlueter

3e073250ca

Add project skeleton: CLI pipeline for CD digitization

Modular Python package with Typer CLI (scan/apply/process commands),
Pydantic data models, OCR via Tesseract, LLM-based tracklist parsing,
mutagen audio tagging, M3U playlist generation, and cover processing.
Includes 8 passing tests and ruff lint config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-15 01:00:12 +01:00

3 commits