# Music Metadata Enricher

AI-assisted music metadata enricher for Jellyfin libraries.

Analyzes album directories, completes tags, fetches cover art, and
optionally renames files to a uniform naming scheme. It works fully without
an API key; the Claude API can optionally be used to fill gaps in the metadata.

---

## Features

- **Local analysis**: directory name, file names, existing ID3/FLAC/M4A tags,
  tracklist text files (.txt, .htm, .html)
- **MusicBrainz lookup**: text search plus AcoustID fingerprinting (optional)
- **Discogs fallback**: used when MusicBrainz returns no match
- **Claude API**: reasoning step for unclear or contradictory data (optional)
- **Cover art**: local file → MusicBrainz Cover Art Archive → embedded in MP3/FLAC/M4A
- **Tag writing**: title, artist, album, albumartist, tracknumber, discnumber,
  date, genre, label (mutagen, ID3v2.4)
- **Renaming**: `01 - Artist - Title.ext` / `2-07 - Artist - Title.ext` (multi-CD)
- **Backup**: backup copies before every change
- **CSV report**: complete log of all changes
- **Interactive / auto mode**: with a confidence threshold

---

## Prerequisites

```bash
pip install mutagen musicbrainzngs pyacoustid discogs_client \
            anthropic Pillow requests tqdm
sudo apt install libchromaprint-tools   # fpcalc for AcoustID fingerprinting
```

| Package | Purpose | Optional? |
|---------|---------|-----------|
| `mutagen` | Read/write tags | no |
| `musicbrainzngs` | MusicBrainz API | yes |
| `pyacoustid` | AcoustID fingerprinting | yes |
| `discogs_client` | Discogs API | yes |
| `anthropic` | Claude API | yes |
| `Pillow` | Check cover image size | yes |
| `requests` | Cover art download | yes |
| `tqdm` | Progress bar | yes |
| `fpcalc` | Audio fingerprinting binary | yes |

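Because only `mutagen` is mandatory, the optional packages are usually imported defensively so that a missing one disables a feature instead of crashing the tool. The following is a minimal sketch of that pattern; the helper `optional_import` and the `HAVE_*` flags are hypothetical names for illustration and are not taken from the project source.

```python
import importlib
import shutil
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical feature switches derived from what is available locally.
musicbrainzngs = optional_import("musicbrainzngs")
tqdm = optional_import("tqdm")

HAVE_MUSICBRAINZ = musicbrainzngs is not None
HAVE_TQDM = tqdm is not None
# fpcalc is an external binary, so it is looked up on PATH rather than imported.
HAVE_FPCALC = shutil.which("fpcalc") is not None
```
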
---

## Environment variables (optional)

| Variable | Description |
|----------|-------------|
| `ANTHROPIC_API_KEY` | Claude API (reasoning for metadata gaps) |
| `ACOUSTID_API_KEY` | AcoustID fingerprinting |
| `DISCOGS_TOKEN` | Discogs API |

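Since all three variables are optional, a lookup step would typically treat an unset or empty variable as "feature disabled". A small sketch under that assumption; the function name `read_api_keys` is hypothetical, only the variable names come from the table above.

```python
import os
from typing import Dict, Mapping, Optional


def read_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, Optional[str]]:
    """Collect the optional credentials; unset or empty values become None."""
    names = ("ANTHROPIC_API_KEY", "ACOUSTID_API_KEY", "DISCOGS_TOKEN")
    return {name: (env.get(name) or None) for name in names}


# Example with an explicit mapping instead of the real environment.
keys = read_api_keys({"DISCOGS_TOKEN": "example-token"})
enabled = [name for name, value in keys.items() if value]
print(enabled)  # ['DISCOGS_TOKEN']
```
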
---

## CLI

```
music_enricher.py [options] PATH [PATH ...]
music_enricher.py --album PATH [options]
```

| Option | Description |
|--------|-------------|
| `--dry-run` | Show proposals, write nothing |
| `--auto` | Skip the interactive review step |
| `--confidence FLOAT` | Minimum confidence for `--auto` (default: 0.85) |
| `--rename` | Rename files according to the naming scheme |
| `--embed-cover` | Embed cover art in the audio file |
| `--backup PATH` | Backup directory, filled before any change |
| `--report PATH` | CSV report of the changes |
| `--no-fingerprint` | Skip AcoustID fingerprinting |
| `--no-api` | No external API calls |
| `--no-cover` | No cover art download |
| `--album PATH` | Process a single album |
| `--no-tqdm` | Disable the progress display |

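For orientation, the option table maps naturally onto `argparse`. The sketch below is not the project's actual parser: the function name `build_parser`, the destination names, and the help strings are assumptions; only the flags and the 0.85 default come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Assumed reconstruction of the CLI from the documented options."""
    parser = argparse.ArgumentParser(prog="music_enricher.py")
    parser.add_argument("paths", nargs="*", metavar="PATH",
                        help="library directories to process")
    parser.add_argument("--album", metavar="PATH", help="process a single album")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--auto", action="store_true")
    parser.add_argument("--confidence", type=float, default=0.85)
    parser.add_argument("--rename", action="store_true")
    parser.add_argument("--embed-cover", action="store_true")
    parser.add_argument("--backup", metavar="PATH")
    parser.add_argument("--report", metavar="PATH")
    parser.add_argument("--no-fingerprint", action="store_true")
    parser.add_argument("--no-api", action="store_true")
    parser.add_argument("--no-cover", action="store_true")
    parser.add_argument("--no-tqdm", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["--auto", "--confidence", "0.90", "--embed-cover", "/music"]
)
print(args.auto, args.confidence, args.paths)  # True 0.9 ['/music']
```
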
---

## Usage

```bash
# Single album: dry run, no API key needed
python3 music_enricher.py --dry-run --no-api \
    --album ~/Musik/Abba_-_Greatest_Hits

# With MusicBrainz lookup
python3 music_enricher.py --dry-run \
    --album ~/Musik/Bach_Organ_-_Peter_Hurford

# Full run: tags + embedded cover + renaming, with backup and report
python3 music_enricher.py --embed-cover --rename \
    --backup /tmp/musik_backup \
    --report report.csv \
    ~/Musik

# Auto mode: apply only proposals with ≥ 90% confidence
python3 music_enricher.py --auto --confidence 0.90 \
    --embed-cover --backup /tmp/backup \
    ~/Musik

# With the Claude API (set ANTHROPIC_API_KEY)
export ANTHROPIC_API_KEY=sk-ant-...
python3 music_enricher.py --dry-run --album ~/Musik/UnbekanntesAlbum
```

---

## Processing pipeline

```
Album directory
      │
      ▼
1. AlbumScanner: classify file types
      │
      ▼
2. HintExtractor: local, no API
      ├─ Directory name  → artist, album, year
      ├─ File names      → track number, artist, title
      ├─ ID3/FLAC tags   → existing values
      └─ Tracklist .txt  → parse track lists
      │
      ▼
3. MetadataResolver
      ├─ AcoustID        → MusicBrainz via fingerprint
      ├─ Text search     → MusicBrainz API
      ├─ Discogs         → fallback
      └─ Claude API      → reasoning step (optional)
      │
      ▼
4. CoverHandler: local → MusicBrainz → embed
      │
      ▼
5. ReviewStep: interactive, or --auto with confidence threshold
      │
      ▼
6. Executor: backup → tags → cover → rename → report
```

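The directory-name step of the HintExtractor can be pictured as a small parser. This is a sketch only: the function name `parse_album_dir`, the `Artist_-_Album` separator convention (inferred from the example paths in this README), and the handling of a trailing `(YYYY)` are assumptions, not the project's actual rules.

```python
import re
from typing import Dict, Optional

# Optional trailing year such as "(1992)" or "[1992]".
_YEAR = re.compile(r"[\(\[](\d{4})[\)\]]\s*$")


def parse_album_dir(name: str) -> Dict[str, Optional[str]]:
    """Split 'Artist_-_Album (Year)' into hints; missing parts are None."""
    text = name.replace("_", " ").strip()
    year = None
    match = _YEAR.search(text)
    if match:
        year = match.group(1)
        text = text[:match.start()].strip()
    artist, sep, album = text.partition(" - ")
    if not sep:  # no separator: treat the whole name as the album title
        artist, album = None, text
    return {"artist": artist, "album": album or None, "year": year}


print(parse_album_dir("Abba_-_Greatest_Hits"))
# {'artist': 'Abba', 'album': 'Greatest Hits', 'year': None}
```
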
---

## Confidence model

| Source | Bonus |
|--------|-------|
| AcoustID match ≥ 90% | +0.20 |
| MusicBrainz via fingerprint | +0.25 |
| MusicBrainz text match | +0.30 × (score / 100) |
| Discogs | +0.15 |
| Claude reasoning | +0.10 |
| Local hints (folder/file name) | +0.05 |

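The table lists the per-source bonuses but does not spell out how they combine. As an illustration, the sketch below assumes they are summed and capped at 1.0; the function name, the parameter names, the cap, and the rounding are assumptions made for this example.

```python
from typing import Optional


def confidence(
    acoustid_score: Optional[float] = None,   # AcoustID match quality, 0.0-1.0
    mb_via_fingerprint: bool = False,
    mb_text_score: Optional[int] = None,      # MusicBrainz search score, 0-100
    discogs: bool = False,
    claude: bool = False,
    local_hints: bool = False,
) -> float:
    """Sum the per-source bonuses from the table, capped at 1.0 (assumed)."""
    total = 0.0
    if acoustid_score is not None and acoustid_score >= 0.90:
        total += 0.20
    if mb_via_fingerprint:
        total += 0.25
    if mb_text_score is not None:
        total += 0.30 * (mb_text_score / 100)
    if discogs:
        total += 0.15
    if claude:
        total += 0.10
    if local_hints:
        total += 0.05
    return round(min(total, 1.0), 2)


# Fingerprint match plus a perfect text match and local hints:
score = confidence(acoustid_score=0.95, mb_via_fingerprint=True,
                   mb_text_score=100, local_hints=True)
print(score)          # 0.8
print(score >= 0.85)  # False: below the default --auto threshold
```
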
---

## Filename scheme (with `--rename`)

```
01 - ABBA - Dancing Queen.mp3
2-07 - Bach - Toccata And Fugue In D Minor BWV 565.flac
```

For single-disc albums the disc number is omitted. For multi-CD albums (subfolders `CD1/`, `CD2/`, etc.)
the disc number is detected automatically.

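The naming scheme can be expressed as a short formatting function. The sketch below is illustrative: the names `target_filename` and `disc_from_folder`, the replacement of unsafe characters with `_`, and the exact `CD<n>` folder pattern are assumptions, not the project's confirmed behavior.

```python
import re
from typing import Optional

# Characters that are not allowed in file names on common file systems.
_UNSAFE = re.compile(r'[<>:"/\\|?*]')


def target_filename(track: int, artist: str, title: str, ext: str,
                    disc: Optional[int] = None) -> str:
    """Build '01 - Artist - Title.ext', or '2-07 - ...' when a disc is given."""
    prefix = f"{track:02d}" if disc is None else f"{disc}-{track:02d}"
    stem = _UNSAFE.sub("_", f"{prefix} - {artist} - {title}")
    return f"{stem}.{ext.lstrip('.')}"


def disc_from_folder(folder_name: str) -> Optional[int]:
    """Read the disc number from a subfolder name such as 'CD1' or 'CD2'."""
    match = re.fullmatch(r"CD\s*(\d+)", folder_name, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


print(target_filename(1, "ABBA", "Dancing Queen", "mp3"))
# 01 - ABBA - Dancing Queen.mp3
print(target_filename(7, "Bach", "Toccata And Fugue In D Minor BWV 565", "flac",
                      disc=disc_from_folder("CD2")))
# 2-07 - Bach - Toccata And Fugue In D Minor BWV 565.flac
```
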
---

## Project structure

```
Music_Metadata_Enricher/
├── music_enricher.py        Main CLI, pipeline orchestration
├── models.py                Dataclasses: AlbumScan, AlbumHints, TrackProposal, …
├── scanner.py               File system scanner, type classification
├── hint_extractor.py        File name/tag/tracklist evaluation
├── metadata_resolver.py     MusicBrainz + Discogs + Claude API
├── cover_handler.py         Cover art: search, download, embedding
├── executor.py              Backup, tag writing, renaming, CSV report
└── test_suite_enricher.py   17 unit/integration tests
```

---

## Tests

```bash
python3 test_suite_enricher.py
# 📊 17/17 Tests erfolgreich   (i.e. 17/17 tests passed)
```

---

## Supported formats

| Format | Tags | Cover embedding |
|--------|------|-----------------|
| MP3 | ID3v2.4 (EasyID3) | APIC frame |
| FLAC | Vorbis comments | METADATA_BLOCK_PICTURE |
| M4A/AAC | MP4 tags | covr atom |
| Other | mutagen generic | – |