Commit graph

30 commits

Author SHA1 Message Date
1ca88b0d6d Rename cover files: frontcover.jpg → front.jpg, backcover.jpg → back.jpg
Shorter, cleaner filenames consistent with Jellyfin conventions.
Updated all references in source, tests, and documentation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 09:56:12 +01:00
8c25bc65be Fix 6 bugs: shared stdin reader, CDDB multiline, type annotation, crash fixes
- ripper: replace per-call stdin daemon threads with a shared module-level
  reader (_stdin_queue + _read_line), preventing orphan threads from stealing
  stdin input after photo uploads; all 8 input() calls in interactive_rip
  now use _read_line()
- ripper: _stream_abcde return type annotation fixed (2-tuple → 3-tuple)
- ripper: disc retry rejection now breaks gracefully instead of raising
  unhandled RuntimeError that crashed the program
- ripper: int() on disc number input wrapped in try/except
- cddb: multi-line DTITLE/TTITLE values are now concatenated instead of
  only keeping the last line (per CDDB/xmcd protocol spec)
- cli: removed unreachable dead code block in apply command
- scanner_server: upload form auto-resets after 3s for repeated uploads
- tests: _scanner_patches() updated to mock _read_line alongside
  _input_or_scan (225 tests passing)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:03:06 +01:00
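Note on 8c25bc65be: the multi-line DTITLE/TTITLE fix can be pictured roughly as below. This is a minimal sketch, not the project's cddb.py; the function name `parse_xmcd_fields` and the sample record are hypothetical. It assumes that repeated keys in an xmcd record are continuation lines whose values are joined in order.

```python
def parse_xmcd_fields(lines):
    """Collect KEY=value pairs, concatenating values of repeated keys in order."""
    fields = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blanks, comment lines, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Append instead of overwrite, so earlier lines are not lost.
        fields[key] = fields.get(key, "") + value
    return fields


# Hypothetical record with a DTITLE split across two lines.
record = [
    "# xmcd",
    "DTITLE=Some Artist / A Very Long Album Title That ",
    "DTITLE=Continues on a Second Line",
    "TTITLE0=First Track",
]
fields = parse_xmcd_fields(record)
```

Overwriting `fields[key]` instead of appending would reproduce the old behaviour of keeping only the last line.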
a32b0229f5 Add --from-photo to scan, retry in MB disc loop, temp/ to .gitignore
- scan: new --from-photo <img> option extracts EAN via Vision-LLM,
  then falls through to existing MusicBrainz barcode lookup
- ripper: MB disc loop now retries the same disc on rip failure instead
  of printing "Bitte Album neu starten" ("Please restart the album");
  user decline raises RuntimeError
- .gitignore: suppress temp/ directory
- tests: 4 new tests for scan --from-photo (225 total)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 14:43:37 +01:00
55c71823d1 Add tests for extract_barcode_from_image
9 new test cases covering: plain digit response, thinking-tag stripping,
digit extraction from surrounding text, empty/no-digit response → None,
exception handling → None, correct model/URL forwarding, EAN_PROMPT usage,
and base64 image encoding in request payload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 14:34:59 +01:00
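Note on 55c71823d1: the response handling these tests describe (thinking-tag stripping, digit extraction, None on empty input) might look like the sketch below. The helper name `digits_from_response`, the `<think>` tag name, and the 8 to 13 digit length bound are assumptions for illustration; the real extract_barcode_from_image also performs the HTTP request, which is omitted here.

```python
import re


def digits_from_response(text):
    """Return the first standalone run of 8-13 digits in an LLM reply, or None."""
    # Drop <think>...</think> blocks that some models emit before the answer.
    cleaned = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Lookarounds ensure we do not cut a slice out of a longer digit run.
    match = re.search(r"(?<!\d)\d{8,13}(?!\d)", cleaned)
    return match.group(0) if match else None
```

Returning None rather than raising keeps the caller's fallback path (typing the EAN by hand) simple.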
7135e681f8 Fix sanitize_filename consistency, add scanner server tests, remove stray file
- Unify sanitize_filename (organizer) and _sanitize_name (ripper): both now use
  whitelist approach — spaces→underscore, keep \w and hyphens, remove everything
  else (brackets, punctuation, commas, dots, …). _sanitize_name removed from
  ripper.py; ripper now imports sanitize_filename from organizer directly.
- Add tests/test_scanner_server.py: 15 tests covering HTTP GET/POST handlers,
  image upload queue, 404/400 error paths, _get_local_ip fallback, print_qr
  graceful degradation without qrcode installed.
- Delete empty stray file '3' from repo root.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 14:20:03 +01:00
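Note on 7135e681f8: the whitelist rule (spaces to underscores, keep \w and hyphens, drop everything else) can be sketched as follows. The name `sanitize_filename_sketch` is hypothetical and the leading `strip()` is an added assumption; the project's own sanitize_filename may differ in detail.

```python
import re


def sanitize_filename_sketch(name):
    """Whitelist sanitizer: spaces become underscores, other non-word chars are dropped."""
    # Assumption: surrounding whitespace is trimmed before conversion.
    name = name.strip().replace(" ", "_")
    # \w covers letters (including umlauts), digits, and underscore; hyphen is kept too.
    return re.sub(r"[^\w-]", "", name)
```

A whitelist avoids having to enumerate every problematic character, which is what the earlier blacklist-style _sanitize_name had to do.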
32c84b9edb Add phone-based EAN scanning, scanner server for cover upload, Vision-LLM integration
New features:
- EAN/Barcode can now be entered by typing or by photographing the CD sleeve;
  Vision-LLM (extract_barcode_from_image) reads the barcode from the photo
- Scanner server (port 8765) starts at the beginning of every album loop,
  serving both EAN barcode scanning and back cover upload via QR code
- Vision-LLM analyses back cover in background thread while ripping;
  priority: Vision-LLM > MusicBrainz > CDDB
- _find_abcde_mbid reads MBID from abcde temp dirs for CAA cover download
  even when the CD barcode is not linked in MusicBrainz
- Concrete copy-paste apply commands shown after each album in 'Next steps'
- _sanitize_name: whitelist approach (removes brackets and punctuation)
- qrcode added as dependency for terminal QR code display

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 14:05:59 +01:00
8b449493cd EAN-first workflow in interactive_rip + GnuDB DYEAR/DGENRE parsing
EAN is now asked before the album name. On MusicBrainz hit, the ripper
enters an auto-rip flow (no album name prompt, no CDDB confirm, disc
count from MB data). On miss/empty EAN, the previous fallback flow
(album name → CDDB confirm) is preserved.

GnuDB responses now parse DYEAR and DGENRE fields into a new CddbResult
NamedTuple. Album model gains an optional genre field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 00:21:42 +01:00
de12afa67a Fix album.json landing in wrong directory
album.json was written to a separately computed album_root that could
differ from the actual disc_dir parent (e.g. when CDDB returned a
different album name). Now album.json is always saved in disc_dir.parent
where the audio files actually reside. Also adopts CDDB album name when
the user accepted the default name.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 23:34:35 +01:00
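Note on de12afa67a: deriving the album.json location from the disc directory, rather than from a separately computed album root, could look like this. The function name `album_json_path` and the example path are hypothetical.

```python
from pathlib import Path


def album_json_path(disc_dir):
    """album.json lives next to the disc folders, i.e. in disc_dir's parent."""
    return Path(disc_dir).parent / "album.json"


# Hypothetical layout: <album>/CD1 holds the audio files.
example = album_json_path("music/Some_Album/CD1")
```

Because the path is computed from where the audio files actually are, a differing CDDB album name can no longer send the JSON elsewhere.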
cf836b4528 Fix filename: omit empty artist suffix, sanitize single quotes
Regular (non-compilation) tracks had an empty artist producing
trailing '_-_.flac'. Now artist suffix is only appended when non-empty.
Also added single quote to _sanitize_name's removed characters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 23:25:26 +01:00
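Note on cf836b4528: a minimal sketch of the fixed naming rule, assuming the 01_-_Titel_-_Kuenstler.flac scheme introduced in 4e6d82a41d. The function `track_filename` and its parameters are hypothetical, and sanitizing is left out.

```python
def track_filename(number, title, artist=None, ext="flac"):
    """Build '<NN>_-_<title>[_-_<artist>].<ext>', omitting an empty artist."""
    stem = f"{number:02d}_-_{title}"
    # None and "" are both falsy, so no dangling '_-_' is appended.
    if artist:
        stem += f"_-_{artist}"
    return f"{stem}.{ext}"
```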
12bf67e977 Fix CDDB parser: only ' / ' splits artist/title, never ' - '
Classical titles like 'Sonate: I. Largo - Allegro' were incorrectly split
at the movement-separator dash, producing wrong artist/title pairs.
Now only ' / ' (CDDB compilation standard) is treated as artist-title
separator; ' - ' is always part of the title.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 22:45:38 +01:00
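Note on 12bf67e977: the separator rule can be sketched as below. `split_artist_title` is a hypothetical helper, not the project's _parse_cddb_lines; it only shows that ' / ' splits and ' - ' does not.

```python
def split_artist_title(value, default_artist=None):
    """Split on the first ' / ' only; ' - ' always stays inside the title."""
    artist, sep, title = value.partition(" / ")
    if sep:
        return artist.strip(), title.strip()
    # No slash separator: the whole value is the title.
    return default_artist, value.strip()
```

With this rule, 'Sonate: I. Largo - Allegro' stays intact as a title, while 'Some Artist / Some Song' yields an artist and a title.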
9e61b01f92 Fix M3U #EXTINF to include artist: 'Artist - Title' format
#EXTINF:0,Title was missing the artist, causing VLC and other players
to show only the track title without the performer. Standard M3U
extended format is '#EXTINF:<duration>,<Artist> - <Title>'.
Falls back to album.artist when no per-track artist is set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:37:58 +01:00
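Note on 9e61b01f92: the '#EXTINF:<duration>,<Artist> - <Title>' line with the album-artist fallback might be built like this. `extinf_line` and its parameter names are hypothetical.

```python
def extinf_line(title, track_artist, album_artist, duration=0):
    """Format an extended-M3U entry, falling back to the album artist."""
    # A missing or empty per-track artist falls back to the album artist.
    artist = track_artist or album_artist
    return f"#EXTINF:{duration},{artist} - {title}"
```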
7ced974279 Add CDDB confirmation, cd-discid hint, CD-counter increment, check command
- interactive_rip: after CDDB lookup, show album name + tracklist and ask
  'Treffer korrekt? (j/n)' ("Match correct? (y/n)") before renaming files;
  rip_disc gains rename=False option for deferred renaming
- interactive_rip: CD number prompt now shows disc_counter as default
  instead of always showing [1]
- _rip_with_abcde: when CDDB fails and cd-discid is not installed, print
  a visible hint with install command instead of silently doing nothing
- _stream_abcde: extract album name from CDDB header line (---- DTITLE ----)
  and return it as part of the result tuple
- _rename_files: early return when output directory does not exist
- check command (cli.py): already present from previous session

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:02:01 +01:00
2f80cb2693 Remove MusicBrainz retry logic — HTTP 200 means no data, not transient error
MusicBrainz always returns HTTP 200; an empty result set is definitive.
Retrying would never yield a different outcome.

- lookup_by_barcode(): retries parameter removed, random import removed
- Removed 3 retry-related tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:48:07 +01:00
09c01c9370 Fix CDDB parser for compilations and add grab-progress fallback
- _parse_cddb_lines now handles both 'Artist - Title' and 'Artist / Title'
  (slash separator used by abcde for compilation albums like Various Artists)
- _stream_abcde collects grab-progress lines (track N: Artist / Title)
  as a fallback TrackInfo source when no CDDB lines are found
- New _parse_grab_tracks() splits grab titles on ' / ' into artist+title
- 5 new tests (TestParseCddbLines.test_compilation_slash_separator,
  TestParseGrabTracks.*)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:42:03 +01:00
e75e5d7de0 feat: GnuDB fallback with retries when abcde CDDB lookup returns nothing
- New module cddb.py: direct GnuDB/FreeDB HTTP lookup using CDDB protocol,
  with same retry+random-delay logic as MusicBrainz barcode lookup
- get_discid() reads disc fingerprint via cd-discid before ripping
- If abcde returns no CDDB track data, lookup_by_discid() queries GnuDB
  directly (up to 3 retries, 2-6 s random pause between attempts)
- TrackInfo moved from ripper.py to models.py to break circular import
  (cddb.py and ripper.py both use TrackInfo)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 07:24:16 +01:00
65164d428c feat: retry MusicBrainz barcode lookup with random delay on empty result
Up to 3 retries with 2–6 s random wait between attempts, as MusicBrainz
occasionally returns no results on the first try. retries parameter is
configurable (default: 3).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 07:16:32 +01:00
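Note on 65164d428c: a retry loop with a random 2 to 6 s pause could be sketched as below. `lookup_with_retries` is a hypothetical wrapper, and reading "up to 3 retries" as one initial call plus three further attempts is an assumption. The `sleep` parameter exists only so the delay can be replaced in tests. (The retry logic was later removed again in 2f80cb2693.)

```python
import random
import time


def lookup_with_retries(lookup, retries=3, min_wait=2.0, max_wait=6.0, sleep=time.sleep):
    """Call lookup(); on an empty result, wait a random interval and try again."""
    for attempt in range(retries + 1):
        result = lookup()
        if result:
            return result
        # Only pause when another attempt will follow.
        if attempt < retries:
            sleep(random.uniform(min_wait, max_wait))
    return None
```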
7554cade50 feat: auto-rename album directory after in-place apply
After all file operations (rename, tag, cover, playlist), apply now
renames the album root directory to match album.json metadata:

- input_dir = CD1/CD2 etc.: parent directory is renamed automatically
  e.g. Kärntner_Doppelsextett/ → Du_Berührst_Mi_20_Jahre_Kärntner_Doppelsextett/
- input_dir = album root: a hint with the mv command is printed instead
  (avoids renaming an actively used path)
- Existing directory with target name: warning, no rename

Also: _sanitize_filename() in organizer.py made public (sanitize_filename),
used consistently across organizer, playlist and cli modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 06:49:17 +01:00
c0e4d2aa85 fix: show clean error message when MusicBrainz barcode lookup fails
Catch ValueError (barcode not found) and httpx.HTTPError (network error)
in _scan_to_album and print a user-friendly message with hint instead of
a raw Python traceback. Also remove unused `call` import in test_ripper.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 06:18:12 +01:00
b30aaa617d Add MusicBrainz barcode lookup (scan --barcode and interactive rip)
- New module musicbrainz.py: lookup_by_barcode() via EAN-13/UPC-12,
  two-step API (barcode search → release detail with recordings),
  respects 1 req/s rate limit with User-Agent header
- cli.py: scan command gets --barcode option as highest-priority mode
  (no images needed); _scan_to_album() dispatches to MusicBrainz first
- ripper.py: interactive_rip() prompts for optional EAN after album name;
  MusicBrainz data (incl. year) takes priority over CDDB for album.json;
  album_root.mkdir() added so JSON can be written even when MB changes dir
- tests: test_musicbrainz.py (16 tests), test_ripper.py +6 barcode tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 06:13:10 +01:00
795be8609a Opus/M4A cover embedding, cover.py tests, and OCR tests
- tagger.py: embed_cover() now supports .opus (Vorbis comment
  METADATA_BLOCK_PICTURE) and .m4a (MP4Cover); imports added
- test_tagger.py: 2 new tests for Opus/M4A; minimal audio fixtures
  as base64 constants (176 B Opus, 856 B M4A)
- test_cover.py: TestPrepareCover (5 tests) and TestCopyCovers (6 tests)
  for prepare_cover() and copy_covers()
- test_ocr.py: 13 tests for run_ocr(), _detect_and_fix_rotation()
  and ocr_images(); Tesseract mocked via subprocess.run

144 tests, 0 failures

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 04:50:13 +01:00
cfc2a2018e Tagger and CLI tests; bugfix in embed_cover for MP3 without ID3 header
- tests/test_tagger.py: 20 tests for tag_file, tag_album,
  _scale_cover_for_embed, embed_cover (FLAC + MP3), embed_album_cover
- tests/test_cli.py: 14 tests for apply (in-place, disc-mismatch,
  dry-run, playlist, multi-disc), check and scan (via mock)
- tagger.py: embed_cover for MP3 catches ID3NoHeaderError and
  creates a new ID3 tag if none is present

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 04:37:07 +01:00
70c096cde4 Lint fixes, disc validation in process, and Forgejo CI
- ruff: fixed import sorting, unused imports, line lengths
- cli.py: extracted _check_disc_counts_or_exit(); the process command
  now also checks disc counts before renaming
- .forgejo/workflows/ci.yml: ruff + pytest on push/PR against main

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 00:51:14 +01:00
88b89fbb50 LLM parser tests, check command, and cover docs
tests/test_llm_parser.py: 13 tests for _call_ollama, _call_openai_compatible
  and parse_tracklist (retry logic, Markdown block, track artist, mock)

cli: new check command shows tags and cover status of all audio files;
  ♪ marks files with an embedded cover

BEDIENUNGSANLEITUNG: new section 7 (check command), cover convention
  (frontcover.jpg/backcover.jpg, embedding, 500px) in step 3

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 00:45:49 +01:00
3fa6237f94 Cover embedding: frontcover.jpg/backcover.jpg as the standard convention
cover: find_cover() looks for frontcover.jpg/.png and backcover.jpg/.png;
  copy_covers() saves as frontcover.jpg / backcover.jpg
tagger: embed_album_cover() embeds the front cover in all tracks
cli: apply and process call embed_album_cover() after copy_covers()
tests: TestFindCover with 7 tests (jpg, png, symlink, priority, negative)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 00:35:28 +01:00
4e6d82a41d Uniform filename scheme: 01_-_Titel_-_Kuenstler.flac
organizer: align the separator before the title (was: 01_Titel_-_K., now: 01_-_Titel_-_K.)
playlist: glob pattern and fallback adapted to the new scheme
tests: assertions updated

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 00:22:54 +01:00
255496bd1b Add per-track artist to filename: 09_Titel_-_Kuenstler.flac
- Track model: add optional artist field (None = fall back to album artist)
- organizer: append _-_<artist> to each filename
- tagger: use track.artist over album.artist for the 'artist' tag
- playlist: widen glob pattern to match new _-_<artist> suffix
- tests: update assertions + add test for track-artist override

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 00:11:07 +01:00
775f274d02 Fix and expand tests: 63 tests passing, covers all core modules
Remove tests/ from .gitignore (was accidentally excluded).

- test_ripper.py: rewrite for current API (_parse_cddb_lines,
  _extract_tracks, _rename_files, _clean_input); fix default quality
- test_organizer.py: update filename assertions (spaces→underscores);
  add TestSanitizeFilename, TestCheckDiscCounts, in-place mode
- test_playlist.py: fix dummy filenames to underscore scheme;
  add multi-disc, filename sanitization, EXTINF and fallback tests

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 00:00:44 +01:00
851dbf3a46 Remove tests/ from repo, update .gitignore, improve ripper
- Remove tests/ directory from version control (added to .gitignore)
- Add .idea/ to .gitignore
- Ripper: CDDB lookup, non-interactive mode, English UI, file renaming
- Config: abcde format mapping, per-format quality options
- CLI: English help texts, new --no-cddb / --pipes / --parallel / --quality options

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 17:35:34 +01:00
1753ab204f Add Vision-LLM mode for direct image-to-JSON extraction
Tesseract OCR fails on rotated/low-contrast CD back covers.
New vision_llm module sends images directly to qwen3-vl via
Ollama chat API, bypassing OCR entirely. Robust JSON extraction
handles thinking tags, markdown blocks, and empty responses.
CLI scan/process commands gain --vision flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:35:05 +01:00
3e073250ca Add project skeleton: CLI pipeline for CD digitization
Modular Python package with Typer CLI (scan/apply/process commands),
Pydantic data models, OCR via Tesseract, LLM-based tracklist parsing,
mutagen audio tagging, M3U playlist generation, and cover processing.
Includes 8 passing tests and ruff lint config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:00:12 +01:00