Commit graph

12 commits

Author SHA1 Message Date
0ca05e91d4 Add --except PATTERN option and update documentation
- --except filters albums by directory name (glob or substring, repeatable)
- README.md: new options table entries, new cover sources, updated pipeline,
  corrected test count (33), added batch example
- BEDIENUNGSANLEITUNG.md: new options table, sections E (batch+except),
  F (--status), LASTFM_API_KEY env var, corrected test count
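
A minimal sketch of the directory-name filter described above, assuming case-insensitive matching and that a pattern counts as a hit when it matches either as a glob or as a plain substring. The helper names `is_excluded` and `filter_albums` are hypothetical and not taken from the repository.

```python
from fnmatch import fnmatch
from pathlib import Path


def is_excluded(album_dir: Path, patterns: list[str]) -> bool:
    """Return True if the directory name matches any exclusion pattern."""
    name = album_dir.name.lower()  # assumption: matching ignores case
    for pattern in patterns:
        p = pattern.lower()
        # Try the pattern as a glob first, then as a plain substring.
        if fnmatch(name, p) or p in name:
            return True
    return False


def filter_albums(album_dirs: list[Path], patterns: list[str]) -> list[Path]:
    """Keep only the album directories that no pattern excludes."""
    return [d for d in album_dirs if not is_excluded(d, patterns)]
```

With argparse, a repeatable option of this kind is usually declared with `action="append"`, which collects every occurrence into a list that can be passed as `patterns`.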

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 11:38:52 +02:00
388a9ffd08 Add --skip-complete: skip already-enriched albums in batch runs
- _album_is_complete(album_dir): checks cover presence + sampled tag quality
  (first/last/middle files); returns (bool, problems_list)
  Sampling strategy: checks the first, the last and up to 3 middle files to
  catch albums where only some tracks were tagged
- _print_status() now uses _album_is_complete() internally (DRY)
- --skip-complete flag: filters album_dirs before the main loop, prints
  how many were skipped upfront
- Typical batch command:
    python3 music_enricher.py --auto --confidence 0.1 --rename --embed-cover \
        --no-fingerprint --skip-complete ~/nvme2n1p7_home/Musik
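
The sampling strategy above (first, last and up to 3 middle files) could be sketched as an index picker. How the middle files are chosen is not stated in the commit message, so the even spacing here is an assumption, and `sample_indices` is a hypothetical name.

```python
def sample_indices(n: int, max_middle: int = 3) -> list[int]:
    """Pick the first and last index plus up to max_middle interior ones."""
    if n <= 0:
        return []
    picked = {0, n - 1}
    # Number of interior samples cannot exceed the number of interior files.
    k = min(max_middle, max(n - 2, 0))
    for i in range(1, k + 1):
        # Assumption: interior samples are spread evenly between the ends.
        picked.add(i * (n - 1) // (k + 1))
    return sorted(picked)
```

For a 10-track album this selects indices 0, 2, 4, 6 and 9, so at most five files are opened per album regardless of its length.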

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 09:05:51 +02:00
80472653b4 Add 4 new cover/tracklist sources: MB back cover, iTunes, Last.fm, Discogs tracklist
cover_handler.py:
- _download_image(): shared helper replaces duplicated download logic
- download_back_cover(): fetches back cover from MusicBrainz CAA (/back endpoint),
  saves as back.jpg; skips if already present
- _itunes_cover_url() / download_itunes_cover(): iTunes Search API (no auth),
  requests 600x600 artwork; fallback after Discogs
- _lastfm_cover_url() / download_lastfm_cover(): Last.fm album.getinfo
  (LASTFM_API_KEY env var); last cover fallback, skips placeholder images
- resolve_cover(): extended with iTunes → Last.fm fallback chain

metadata_resolver.py:
- _discogs_get_tracklist(): fetches full Discogs release via REST API,
  parses tracklist[] including heading-based disc detection
- _lastfm_tracklist(): fetches Last.fm album.getinfo tracks (LASTFM_API_KEY)
- resolve(): uses Discogs tracklist → Last.fm tracklist as fallback when
  MusicBrainz returns no tracks; LASTFM_API_KEY added to env var block

music_enricher.py:
- process_album(): calls download_back_cover() after execute_album() when MBID known

New cover priority:  local → MusicBrainz front → Discogs → iTunes → Last.fm
New tracklist priority: local → YouTube → MusicBrainz → Discogs → Last.fm → OCR
Test suite: 29 → 33 tests (all pass)
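
Both priority lists above follow the same "first source that yields a result wins" pattern. A minimal sketch of such a chain, assuming each source is a zero-argument callable returning a value or None; `resolve_first` and the stub sources in the usage note are hypothetical and stand in for the real lookups.

```python
from typing import Callable, Optional

# A source is a (name, fetch) pair; fetch returns a result or None.
Source = tuple[str, Callable[[], Optional[str]]]


def resolve_first(sources: list[Source]) -> Optional[tuple[str, str]]:
    """Try each source in priority order and return the first hit."""
    for name, fetch in sources:
        try:
            result = fetch()
        except Exception:
            # Assumption: a failing source must not abort the whole chain.
            continue
        if result:
            return name, result
    return None
```

Called with stubs such as `[("local", lambda: None), ("discogs", lambda: "folder.jpg")]`, the function returns `("discogs", "folder.jpg")`. Adding a new fallback then only means appending one more pair to the list.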

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 08:55:17 +02:00
031e595ff7 Add Discogs as cover source fallback after MusicBrainz
- _discogs_cover_url(): searches Discogs database/search API by artist+album,
  returns primary image URL; uses DISCOGS_TOKEN if set, else anonymous
- download_discogs_cover(): downloads and saves as folder.jpg (PNG→JPEG via PIL)
- resolve_cover() priority: local → MusicBrainz → Discogs
- music_enricher: pass artist/album to resolve_cover() for Discogs lookup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 08:44:04 +02:00
b54d83ecb5 Add --status flag: library health report (missing covers, bad tags)
Scans all album directories and reports:
- Albums without any cover image
- Albums where the first 3 audio files have missing/placeholder tags
  (title or artist empty, 'Unknown', 'AudioTrack')
Exits without writing anything.
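
A rough sketch of the tag check described above, assuming tags are already available as a plain dict and that 'AudioTrack' may be followed by a track number (as in "AudioTrack 01"). The name `tag_problems` and the exact placeholder pattern are assumptions, not the project's code.

```python
import re

# Assumption: placeholders are exactly "Unknown" or "AudioTrack" plus an
# optional number; matching ignores case.
_PLACEHOLDER = re.compile(r"^(unknown|audiotrack(\s*\d+)?)$", re.IGNORECASE)


def tag_problems(tags: dict[str, str]) -> list[str]:
    """Return a list of problems found in the title and artist tags."""
    problems = []
    for field in ("title", "artist"):
        value = (tags.get(field) or "").strip()
        if not value:
            problems.append(f"{field} missing")
        elif _PLACEHOLDER.match(value):
            problems.append(f"{field} is placeholder: {value!r}")
    return problems
```

Anchoring the pattern at both ends keeps legitimate names that merely start with "Unknown" from being flagged.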

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 08:42:26 +02:00
aaa32622b2 Fix 'Unknown' track artists leaking from bad ID3 tags and classical schema
- hint_extractor: filter existing tags through _is_good() so 'Unknown',
  'Unknown Artist' etc. in existing ID3 tags don't override filename-parsed
  artist names
- executor: _is_classical() now returns False when track_artist is a placeholder
  ('unknown', 'unknown artist') — prevents pop albums from getting the
  Performer-Composer-Work filename schema
- executor/music_enricher: pass albumartist to _proposed_filename() so fallback
  works when track artist is missing; fix display to use albumartist fallback too

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 08:15:46 +02:00
701b05a75d Fix Jellyfin playlist integration and tracklist matching for single-CD albums
- hint_extractor: add _normalize_vertical_tracklist() to handle bare-number/
  title/duration format (Tufaranka-style tracklists)
- hint_extractor: fix level-1 tracklist match: allow disc_num=None (single-CD)
  by assuming disc=1; previously no tracklist title was ever applied to
  single-CD tracks because the guard required disc_num to be set
- music_enricher: register module in sys.modules before exec_module() so
  @dataclass definitions in jellyfin_playlist_generator work correctly
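
The sys.modules fix above can be illustrated with the standard importlib loading recipe. When annotations are strings (for example under `from __future__ import annotations`), `dataclasses` looks up the defining module in `sys.modules` while processing the class, so the module has to be registered before its body runs. This is a sketch under that assumption; `load_module_from_path` is a hypothetical helper name.

```python
import importlib.util
import sys
from pathlib import Path


def load_module_from_path(name: str, path: Path):
    """Load a Python file as a module, registering it before execution."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register first so lookups of sys.modules[cls.__module__] succeed
    # while the module body (and its @dataclass decorators) executes.
    sys.modules[name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialised module behind on failure.
        sys.modules.pop(name, None)
        raise
    return module
```

The returned module object can then be used directly, e.g. `module.generate_playlist(...)`, without spawning a subprocess.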

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 07:58:41 +02:00
776c977573 Recursive album discovery + Jellyfin Playlist Generator integration
scanner.py: collect_album_dirs() now recursively finds album dirs
- Dirs with audio files at root → album
- Dirs with disc subdirs (CD1/CD2) and no root audio → multi-CD album
- Container dirs without audio → recurse into subdirs
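
The three discovery rules above could be sketched as a small recursive function. The audio extensions, the disc-directory pattern and the name `find_album_dirs` are assumptions for illustration; the actual collect_album_dirs() may differ.

```python
import re
from pathlib import Path

AUDIO_EXTS = {".mp3", ".flac", ".ogg", ".m4a", ".wav"}  # assumed set
DISC_DIR = re.compile(r"^(cd|disc|disk)\s*\d+$", re.IGNORECASE)  # assumed


def has_audio(d: Path) -> bool:
    """True if the directory directly contains at least one audio file."""
    return any(f.is_file() and f.suffix.lower() in AUDIO_EXTS for f in d.iterdir())


def find_album_dirs(root: Path) -> list[Path]:
    """Recursively collect album directories below (and including) root."""
    if has_audio(root):
        return [root]  # audio at root: plain album
    subdirs = sorted(p for p in root.iterdir() if p.is_dir())
    if any(DISC_DIR.match(s.name) and has_audio(s) for s in subdirs):
        return [root]  # disc subdirs, no root audio: multi-CD album
    albums: list[Path] = []
    for sub in subdirs:  # container dir: recurse into children
        albums.extend(find_album_dirs(sub))
    return albums
```

Returning the parent directory for multi-CD sets keeps CD1/CD2 together as one album rather than treating each disc as its own.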

music_enricher.py:
- After execute_album(), auto-discovers jellyfin_playlist_generator.py
  in ../Jellyfin_Playlist_Generator/ (or via --playlist-generator PATH)
- Calls generate_playlist() directly via importlib — no subprocess,
  no destructive cleanup_all_playlists, targeted to the enriched album
- New --playlist-generator CLI option for custom generator path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 07:07:55 +02:00
787803bb7b Fix file permissions after rebase (644 → 755)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 06:01:38 +02:00
40a2ef3fb6 Add OCR fallback via Ollama Vision for albums without tracklist text
hint_extractor: _ocr_back_cover() sends back/inlay images to Ollama Vision
  when no tracklist .txt/.htm/.nfo is present. Model priority:
  qwen3-vl:latest → minicpm-v:latest → deepseek-ocr:latest (configurable
  via OLLAMA_OCR_MODEL env var). Timeout 180s. OCR text is fed into the
  same _parse_tracklist() pipeline as regular text files.
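
The model priority above might be resolved roughly as follows. The commit does not say how the override interacts with the default list, so this sketch assumes that OLLAMA_OCR_MODEL, when set, takes precedence, and that the set of installed models is supplied by the caller. `pick_ocr_model` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

DEFAULT_MODELS = ["qwen3-vl:latest", "minicpm-v:latest", "deepseek-ocr:latest"]


def pick_ocr_model(
    available: set[str], env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Choose the OCR model: env override first, then the priority list."""
    override = env.get("OLLAMA_OCR_MODEL")
    if override:
        return override  # assumption: an explicit override always wins
    for model in DEFAULT_MODELS:
        if model in available:
            return model
    return None  # no usable vision model installed
```

Returning None lets the caller skip OCR quietly instead of failing the whole album.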

music_enricher: extract_hints(use_ocr=not args.no_api) — OCR is skipped
  with --no-api to allow fully offline/fast runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 05:42:03 +02:00
8bd48cf166 Include albumartist in filename; remove Claude API from LLM chain
Filename schema now: TT - AlbumArtist - TrackArtist - Title when albumartist
differs from track artist (e.g. pianist vs. composer). Identical artist → old
two-part format unchanged.
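
A sketch of the naming rule above. It assumes that the "two-part format" means `TT - Artist - Title`, that artists are compared case-insensitively, and that the album artist is used when the track artist is empty; file extension and character sanitising are left out. `proposed_stem` is a hypothetical name.

```python
from typing import Optional


def proposed_stem(
    track_no: int, title: str, track_artist: str, albumartist: Optional[str] = None
) -> str:
    """Build the filename stem (without extension) for one track."""
    ta = (track_artist or "").strip()
    aa = (albumartist or "").strip()
    parts = [f"{track_no:02d}"]
    if aa and ta and aa.casefold() != ta.casefold():
        # Artists differ (e.g. pianist vs. composer): include both.
        parts += [aa, ta]
    else:
        # Same artist, or one is missing: fall back to whichever is set.
        artist = ta or aa
        if artist:
            parts.append(artist)
    parts.append(title.strip())
    return " - ".join(parts)
```

For example, track 3 with album artist "Rubinstein" and track artist "Chopin" yields a stem of the form `03 - Rubinstein - Chopin - <title>`.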

metadata_resolver: removed Claude API fallback entirely from _claude_resolve.
Chain is now Ollama (local, free) → OpenRouter (DeepSeek V3, cheap) only.

music_enricher: updated status line and use_claude flag accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 05:42:03 +02:00
f7cf520dbe Initial implementation of Music Metadata Enricher
AI-powered per-album pipeline: scan → local hints → MusicBrainz/Discogs/Claude
resolve → cover art → interactive or auto review → tag write + rename + report.
All external dependencies optional; 17/17 unit tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 05:42:03 +02:00