# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Musiksammlung** is a Python CLI tool that automates digitizing physical CD collections for use with Jellyfin. It orchestrates: CD ripping (via `abcde`), OCR of cover/back images (via Tesseract), LLM-based tracklist extraction, file renaming/tagging, and M3U playlist generation.

## Build & Development Commands

```bash
pip install -e ".[dev]"          # Install in editable mode with dev deps
pytest tests/ -v                 # Run all tests
pytest tests/test_models.py -v   # Run a single test module
ruff check src/ tests/           # Lint
musiksammlung --help             # CLI entry point
```

## Architecture

The pipeline flows: **OCR → LLM → Organize → Tag → Playlist**

- `models.py` — Pydantic models (`Album`, `Disc`, `Track`) shared across all modules; the LLM JSON output validates directly into `Album`
- `cli.py` — Typer CLI with three commands: `scan` (OCR+LLM→JSON), `apply` (JSON→files), `process` (full pipeline)
- `ocr.py` — Tesseract wrapper with Pillow-based image preprocessing
- `llm_parser.py` — Sends OCR text to LLM (Ollama or OpenAI-compatible), enforces JSON output, retries on parse failure
- `organizer.py` — Builds source→target file mapping, handles single-disc and multi-disc layouts
- `tagger.py` — Sets audio tags via mutagen (format-agnostic), optional cover embedding for FLAC/MP3
- `playlist.py` — Generates M3U playlists with relative paths
- `ripper.py` — Drives `abcde` via subprocess for CD ripping
- `cover.py` — Resizes/converts cover images to JPEG for Jellyfin

## Conventions

- Python 3.11+, German variable names and comments are acceptable
- Pydantic for data models, Typer for CLI, mutagen for audio tagging
- External tools required at runtime: `tesseract`, `abcde`
- The two-step workflow (`scan` → review JSON → `apply`) is the recommended default over the one-shot `process` command
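
## Illustrative Sketch: Relative-Path Playlists

The Architecture notes say that `playlist.py` generates M3U playlists with relative paths. The following is a minimal, standard-library-only sketch of that idea, not the repository's actual implementation. The function name `build_m3u` and the example paths are assumptions made for illustration; the real module may use different names, add `#EXTINF` metadata, or write the file directly.

```python
import os
from pathlib import Path


def build_m3u(playlist_path: Path, track_paths: list[Path]) -> str:
    """Return M3U text whose entries are relative to the playlist's folder.

    Hypothetical helper for illustration; not taken from playlist.py.
    """
    base = playlist_path.parent
    lines = ["#EXTM3U"]  # header line of the extended M3U format
    for track in track_paths:
        # os.path.relpath also handles tracks that sit outside `base`
        rel = os.path.relpath(track, start=base)
        # Forward slashes keep the playlist portable across platforms
        lines.append(Path(rel).as_posix())
    return "\n".join(lines) + "\n"


# Usage with a hypothetical multi-disc layout
example = build_m3u(
    Path("/music/Artist/Album/Album.m3u"),
    [
        Path("/music/Artist/Album/CD1/01 - Intro.flac"),
        Path("/music/Artist/Album/CD2/01 - Outro.flac"),
    ],
)
print(example)
```

Relative entries mean the album folder can be moved or mounted at a different path (for example inside a Jellyfin container) without the playlist breaking, which is presumably why the project chose them.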