Musiksammlung/CLAUDE.md
dschlueter 488149b8f9 Track CLAUDE.md in repository
Remove CLAUDE.md from .gitignore and add it to version control
so project instructions are shared across all Claude sessions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 14:47:42 +01:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Musiksammlung is a Python CLI tool that automates digitizing physical CD collections for use with Jellyfin. It orchestrates: CD ripping (via abcde), phone-based back cover photo upload, Vision-LLM analysis, OCR of cover/back images (via Tesseract), LLM-based tracklist extraction, file renaming/tagging, and M3U playlist generation.

Build & Development Commands

pip install -e ".[dev]"          # Install in editable mode with dev deps
pytest tests/ -v                 # Run all tests
pytest tests/test_models.py -v   # Run a single test module
ruff check src/ tests/           # Lint
musiksammlung --help             # CLI entry point

Architecture

The pipeline runs in this order: Rip → (Vision-LLM analysis in parallel) → Organize → Tag → Playlist
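The parallel Vision-LLM step can be pictured with a small sketch. It is not the project's code: rip_disc and analyse_back_cover are hypothetical placeholders standing in for the abcde rip and the Vision-LLM call, and the returned values are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def rip_disc() -> list:
    """Hypothetical placeholder for the blocking abcde rip."""
    time.sleep(0.05)
    return ["track01.flac", "track02.flac"]


def analyse_back_cover() -> dict:
    """Hypothetical placeholder for the Vision-LLM analysis."""
    time.sleep(0.05)
    return {"artist": "Example Artist", "album": "Example Album"}


def run_pipeline() -> dict:
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the Vision-LLM analysis in a background thread ...
        vision_future = pool.submit(analyse_back_cover)
        # ... while the rip blocks the foreground.
        files = rip_disc()
        # Wait for the metadata only once the rip has finished.
        metadata = vision_future.result()
    # Organize, Tag and Playlist would follow here.
    return {"files": files, "metadata": metadata}


result = run_pipeline()
```

Both steps are dominated by waiting (on the drive and on the model server), so a single background thread is enough to overlap them.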

  • models.py — Pydantic models (Album, Disc, Track) shared across all modules; Album includes optional genre field; the LLM JSON output validates directly into Album
  • cli.py — Typer CLI with the commands rip, scan (OCR+LLM→JSON), apply (JSON→files), and process (full pipeline); rip accepts --vision-model, --vision-url, --scanner-port; scan supports five modes: --barcode (EAN→MusicBrainz), --from-photo (photo→Vision-LLM→EAN→MusicBrainz), --from-text (text→LLM), --vision (image→Vision-LLM), and the default (image→OCR→LLM)
  • ocr.py — Tesseract wrapper with Pillow-based image preprocessing
  • llm_parser.py — Sends OCR text to LLM (Ollama or OpenAI-compatible), enforces JSON output, retries on parse failure
  • organizer.py — Builds source→target file mapping, handles single-disc and multi-disc layouts
  • tagger.py — Sets audio tags via mutagen (format-agnostic), optional cover embedding for FLAC/MP3
  • playlist.py — Generates M3U playlists with relative paths
  • cddb.py — GnuDB/CDDB lookup via HTTP; returns CddbResult (tracks, artist, album, year from DYEAR, genre from DGENRE)
  • musicbrainz.py — MusicBrainz lookup by EAN/barcode; returns Album model
  • ripper.py — Drives abcde via subprocess; EAN-first interactive workflow (MusicBrainz auto-rip on hit, CDDB fallback on miss); the scanner server starts at the top of every album loop for EAN barcode photo and/or back cover upload; the EAN can be typed or photographed (the Vision-LLM reads the barcode); starts the album Vision-LLM analysis in a background thread while ripping; extracts the MusicBrainz ID (MBID) from abcde temp dirs (abcde.*/mbid.N) for Cover Art Archive (CAA) cover download; prints concrete copy-paste apply commands at the end
  • vision_llm.py — Vision-LLM: parse_image() extracts album metadata from back cover photos; extract_barcode_from_image() reads EAN/barcode digits from CD sleeve photos
  • scanner_server.py — Mini HTTP server (default port 8765) for phone-based photo upload; serves both EAN barcode scanning and back cover upload; mobile-friendly upload form; QR code displayed in terminal at start of every album; ScannerServer class + print_qr() helper
  • cover.py — Resizes/converts cover images to JPEG for Jellyfin
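
As an illustration of the relative-path playlists mentioned for playlist.py, here is a minimal sketch using only the standard library. The function name build_m3u, the CD1/CD2 directory layout, and the #EXTM3U header line are assumptions for this example and may differ from what the module actually does.

```python
import os
from pathlib import Path


def build_m3u(playlist_path: Path, track_paths: list) -> str:
    """Return M3U text whose entries are relative to the playlist's folder."""
    base = playlist_path.parent
    lines = ["#EXTM3U"]  # extended M3U header (assumed, optional)
    for track in track_paths:
        rel = os.path.relpath(track, start=base)
        # Forward slashes keep the playlist portable across platforms.
        lines.append(Path(rel).as_posix())
    return "\n".join(lines) + "\n"


# Hypothetical multi-disc album layout
album_dir = Path("/music/Example_Artist/Example_Album")
tracks = [
    album_dir / "CD1" / "01_Intro.flac",
    album_dir / "CD2" / "01_Outro.flac",
]
m3u_text = build_m3u(album_dir / "Example_Album.m3u", tracks)
```

Because the entries are relative, the album folder can be moved or mounted under a different path without breaking the playlist.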

Vision-LLM Priority

Data sources are used with this priority (highest first):

  1. Vision-LLM — result of analysing the back cover photo (phone upload or CAA download)
  2. MusicBrainz — structured metadata from EAN barcode lookup
  3. CDDB/GnuDB — fallback from disc fingerprint lookup
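
One way to read this priority list is as a per-field fallback: each field takes its value from the highest-priority source that supplies one. The sketch below assumes exactly that. The source keys, the merge_metadata helper, and the sample data are hypothetical, and the project may instead pick one source wholesale.

```python
from __future__ import annotations

from typing import Any

# Highest priority first, mirroring the list above.
SOURCE_PRIORITY = ("vision_llm", "musicbrainz", "cddb")


def merge_metadata(results: dict[str, dict[str, Any] | None]) -> dict[str, Any]:
    """Fill each field from the highest-priority source that has a value."""
    merged: dict[str, Any] = {}
    for source in SOURCE_PRIORITY:
        data = results.get(source) or {}  # a missing or failed source counts as empty
        for field, value in data.items():
            # Skip empty values and fields already set by a better source.
            if value not in (None, "", []) and field not in merged:
                merged[field] = value
    return merged


merged = merge_metadata({
    "cddb": {"artist": "Exmple Artst", "year": 1999, "genre": "Rock"},
    "musicbrainz": {"artist": "Example Artist", "album": "Example Album", "year": None},
    "vision_llm": None,  # e.g. no back cover photo was uploaded
})
```

Here MusicBrainz supplies artist and album, while CDDB fills in only the year and genre that the higher-priority sources left empty.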

Conventions

  • Python 3.11+, German variable names and comments are acceptable
  • Pydantic for data models, Typer for CLI, mutagen for audio tagging
  • External tools required at runtime: tesseract, abcde
  • Firewall: port 8765 (TCP) must be open for phone scanner server (sudo ufw allow 8765/tcp)
  • The two-step workflow (rip, then apply, with a manual review of the JSON in between) is the recommended default over the one-shot process command
  • sanitize_filename (organizer, used by ripper and apply): whitelist approach — spaces→_, keeps \w and hyphens, removes brackets and all other punctuation, collapses multiple underscores
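
The sanitize_filename rules can be sketched with three substitutions. This is an illustration of the rules as stated, not the function from organizer; the name sanitize_filename_sketch and the order of the steps are assumptions.

```python
import re


def sanitize_filename_sketch(name: str) -> str:
    """Illustrative whitelist sanitizer following the stated rules."""
    name = name.replace(" ", "_")        # spaces -> underscores
    name = re.sub(r"[^\w\-]", "", name)  # keep only \w characters and hyphens
    name = re.sub(r"_+", "_", name)      # collapse runs of underscores
    return name


examples = [
    sanitize_filename_sketch("Hello World (Live) [2001]"),
    sanitize_filename_sketch("a  &  b"),
    sanitize_filename_sketch("AC/DC - Back"),
]
```

Two consequences of the whitelist are worth noting. In Python 3, \w is Unicode-aware for str patterns, so umlauts and other letters survive. Dots are removed like any other punctuation, so in this sketch a file extension would have to be split off before sanitizing and appended again afterwards.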