dschlueter/Musiksammlung

Author	SHA1	Message	Date
dschlueter	984d8acc88	Fix album_dir path after in-place directory rename _rename_album_dir_inplace now returns the updated input_dir path (e.g. .../Golden_Oldies_Vol_11/CD1 instead of .../Album1/CD1). apply uses the return value so the final 'Fertig!' message and any subsequent operations reference the correct, post-rename path. Also fix CDDB header album name extraction (regex search between ---- markers instead of stripping leading/trailing dashes). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 10:44:09 +01:00
dschlueter	0c0311b00f	Fix CDDB album name extraction from header line The header line can have a prefix before the dashes, e.g.: "#1 (Musicbrainz): ---- Artist / Album ----" Use regex search for content between ---- markers instead of stripping leading/trailing dashes from the full line. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 10:32:35 +01:00
dschlueter	f902e50018	Remove idea/ directory from version control Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 10:12:34 +01:00
dschlueter	7ced974279	Add CDDB confirmation, cd-discid hint, CD-counter increment, check command - interactive_rip: after CDDB lookup, show album name + tracklist and ask 'Treffer korrekt? (j/n)' before renaming files; rip_disc gains rename=False option for deferred renaming - interactive_rip: CD number prompt now shows disc_counter as default instead of always showing [1] - _rip_with_abcde: when CDDB fails and cd-discid is not installed, print a visible hint with install command instead of silently doing nothing - _stream_abcde: extract album name from CDDB header line (---- DTITLE ----) and return it as part of the result tuple - _rename_files: early return when output directory does not exist - check command (cli.py): already present from previous session Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 10:02:01 +01:00
dschlueter	2f80cb2693	Remove MusicBrainz retry logic — HTTP 200 means no data, not transient error MusicBrainz always returns HTTP 200; an empty result set is definitive. Retrying would never yield a different outcome. - lookup_by_barcode(): retries parameter removed, random import removed - Removed 3 retry-related tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 09:48:07 +01:00
dschlueter	09c01c9370	Fix CDDB parser for compilations and add grab-progress fallback - _parse_cddb_lines now handles both 'Artist - Title' and 'Artist / Title' (slash separator used by abcde for compilation albums like Various Artists) - _stream_abcde collects grab-progress lines (track N: Artist / Title) as a fallback TrackInfo source when no CDDB lines are found - New _parse_grab_tracks() splits grab titles on ' / ' into artist+title - 5 new tests (TestParseCddbLines.test_compilation_slash_separator, TestParseGrabTracks.*) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 09:42:03 +01:00
dschlueter	e75e5d7de0	feat: GnuDB fallback with retries when abcde CDDB lookup returns nothing - New module cddb.py: direct GnuDB/FreeDB HTTP lookup using CDDB protocol, with same retry+random-delay logic as MusicBrainz barcode lookup - get_discid() reads disc fingerprint via cd-discid before ripping - If abcde returns no CDDB track data, lookup_by_discid() queries GnuDB directly (up to 3 retries, 2-6 s random pause between attempts) - TrackInfo moved from ripper.py to models.py to break circular import (cddb.py and ripper.py both use TrackInfo) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 07:24:16 +01:00
dschlueter	65164d428c	feat: retry MusicBrainz barcode lookup with random delay on empty result Up to 3 retries with 2–6 s random wait between attempts, as MusicBrainz occasionally returns no results on the first try. retries parameter is configurable (default: 3). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 07:16:32 +01:00
dschlueter	468fac6d2b	docs: document auto-rename of album directory in apply --in-place Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 06:53:24 +01:00
dschlueter	7554cade50	feat: auto-rename album directory after in-place apply After all file operations (rename, tag, cover, playlist), apply now renames the album root directory to match album.json metadata: - input_dir = CD1/CD2 etc.: parent directory is renamed automatically e.g. Kärntner_Doppelsextett/ → Du_Berührst_Mi_20_Jahre_Kärntner_Doppelsextett/ - input_dir = album root: a hint with the mv command is printed instead (avoids renaming an actively used path) - Existing directory with target name: warning, no rename Also: _sanitize_filename() in organizer.py made public (sanitize_filename), used consistently across organizer, playlist and cli modules. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 06:49:17 +01:00
dschlueter	c0e4d2aa85	fix: show clean error message when MusicBrainz barcode lookup fails Catch ValueError (barcode not found) and httpx.HTTPError (network error) in _scan_to_album and print a user-friendly message with hint instead of a raw Python traceback. Also remove unused `call` import in test_ripper.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 06:18:12 +01:00
dschlueter	b70127e838	docs: document MusicBrainz barcode lookup in README and Bedienungsanleitung - README: Schnellstart shows --barcode as fastest option - Bedienungsanleitung: - Workflow diagram updated (EAN path, Varianten A-D) - Interactive rip example shows EAN prompt with MusicBrainz output - New Variante D: scan --barcode (no image, no OCR, no local LLM) - Variante C: corrected default model to qwen3-vl:235b-cloud - Tipps: barcode as first/fastest option, updated CDDB fallback hints Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 06:16:18 +01:00
dschlueter	b30aaa617d	Add MusicBrainz barcode lookup (scan --barcode and interactive rip) - New module musicbrainz.py: lookup_by_barcode() via EAN-13/UPC-12, two-step API (barcode search → release detail with recordings), respects 1 req/s rate limit with User-Agent header - cli.py: scan command gets --barcode option as highest-priority mode (no images needed); _scan_to_album() dispatches to MusicBrainz first - ripper.py: interactive_rip() prompts for optional EAN after album name; MusicBrainz data (incl. year) takes priority over CDDB for album.json; album_root.mkdir() added so JSON can be written even when MB changes dir - tests: test_musicbrainz.py (16 tests), test_ripper.py +6 barcode tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 06:13:10 +01:00
dschlueter	6aba30c0e5	Set default vision model to qwen3-vl:235b-cloud Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 05:56:44 +01:00
dschlueter	db47aa4456	Fix: Album.album accepts null values from the LLM When the LLM cannot identify an album title (e.g. only the ensemble name appears on the back cover), it returns "album": null. Instead of aborting with a ValidationError, null is now converted to "". The user can fill in the empty title manually in album.json. Changed: - Album.album: str = "" (instead of str without a default) - field_validator mode="before", None → "" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 05:49:37 +01:00
dschlueter	795be8609a	Opus/M4A cover embedding, cover.py tests and OCR tests - tagger.py: embed_cover() now supports .opus (Vorbis comment METADATA_BLOCK_PICTURE) and .m4a (MP4Cover); imports added - test_tagger.py: 2 new tests for Opus/M4A; minimal audio fixtures as base64 constants (176 B Opus, 856 B M4A) - test_cover.py: TestPrepareCover (5 tests) and TestCopyCovers (6 tests) for prepare_cover() and copy_covers() - test_ocr.py: 13 tests for run_ocr(), _detect_and_fix_rotation() and ocr_images(); Tesseract mocked via subprocess.run 144 tests, 0 failures Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 04:50:13 +01:00
dschlueter	cfc2a2018e	Tagger and CLI tests; bugfix in embed_cover for MP3 without ID3 header - tests/test_tagger.py: 20 tests for tag_file, tag_album, _scale_cover_for_embed, embed_cover (FLAC + MP3), embed_album_cover - tests/test_cli.py: 14 tests for apply (in-place, disc-mismatch, dry-run, playlist, multi-disc), check and scan (via mock) - tagger.py: embed_cover for MP3 catches ID3NoHeaderError and creates a new ID3 tag if none exists Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 04:37:07 +01:00
dschlueter	70c096cde4	Lint fixes, disc validation for process, and Forgejo CI - ruff: import sorting, unused imports, line lengths fixed - cli.py: _check_disc_counts_or_exit() extracted; the process command now also checks disc counts before renaming - .forgejo/workflows/ci.yml: ruff + pytest on push/PR against main Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 00:51:14 +01:00
dschlueter	88b89fbb50	LLM parser tests, check command and cover docs tests/test_llm_parser.py: 13 tests for _call_ollama, _call_openai_compatible and parse_tracklist (retry logic, Markdown block, track artist, mock) cli: new check command shows tags and cover status of all audio files; ♪ marks files with an embedded cover BEDIENUNGSANLEITUNG: new section 7 (check command), cover convention (frontcover.jpg/backcover.jpg, embedding, 500px) in step 3 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-18 00:45:49 +01:00
dschlueter	256be0ae33	Cover embedding: reduce resolution to 500px before embedding New helper function _scale_cover_for_embed() scales the cover image with Pillow to max. 500px (EMBED_COVER_MAX_SIZE) and encodes it in memory as JPEG quality=85. embed_cover() no longer reads the raw bytes of the original file but uses the scaled image. Result: embedded covers of ~40–100 KB instead of the 200–500 KB of the 1200px original, still sharp on HiDPI displays. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-18 00:40:11 +01:00
dschlueter	3fa6237f94	Cover embedding: frontcover.jpg/backcover.jpg as the standard convention cover: find_cover() looks for frontcover.jpg/.png and backcover.jpg/.png; copy_covers() saves as frontcover.jpg / backcover.jpg tagger: embed_album_cover() embeds the front cover into all tracks cli: apply and process call embed_album_cover() after copy_covers() tests: TestFindCover with 7 tests (jpg, png, symlink, priority, negative) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-18 00:35:28 +01:00
dschlueter	c9152cf19f	Update Bedienungsanleitung: new filename scheme and CDDB→album.json - Workflow diagram: CDDB saves album.json automatically - Rip result: correct scheme 01_-_Titel_-_Kuenstler.flac - apply results: filenames adjusted - album.json: optional track artist field explained - Filename scheme section: complete description Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-18 00:26:09 +01:00
dschlueter	4e6d82a41d	Unified filename scheme: 01_-_Titel_-_Kuenstler.flac organizer: align separator before title (was: 01_Titel_-_K., now: 01_-_Titel_-_K.) playlist: glob pattern and fallback adapted to the new scheme tests: assertions updated Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-18 00:22:54 +01:00
dschlueter	bafea5f335	CDDB→album.json + LLM prompts with track artist ripper: after a successful CDDB rip, save album.json in the album directory (artist, title, all discs with track artists); this closes the workflow gap between rip and apply. llm_parser, vision_llm: prompts explain the optional track artist field; the LLM sets it only when the track performer differs from the album artist. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-18 00:16:44 +01:00
dschlueter	255496bd1b	Add per-track artist to filename: 09_Titel_-_Kuenstler.flac - Track model: add optional artist field (None = fall back to album artist) - organizer: append _-_<artist> to each filename - tagger: use track.artist over album.artist for the 'artist' tag - playlist: widen glob pattern to match new _-_<artist> suffix - tests: update assertions + add test for track-artist override Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-18 00:11:07 +01:00
dschlueter	775f274d02	Fix and expand tests: 63 tests passing, covers all core modules Remove tests/ from .gitignore (was accidentally excluded). - test_ripper.py: rewrite for current API (_parse_cddb_lines, _extract_tracks, _rename_files, _clean_input); fix default quality - test_organizer.py: update filename assertions (spaces→underscores); add TestSanitizeFilename, TestCheckDiscCounts, in-place mode - test_playlist.py: fix dummy filenames to underscore scheme; add multi-disc, filename sanitization, EXTINF and fallback tests Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-18 00:00:44 +01:00
dschlueter	734bc80b79	Update Bedienungsanleitung: in-place mode, underscore scheme, disc validation Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 23:52:17 +01:00
dschlueter	070a0573ae	Add --in-place mode to apply: rename and tag without moving files When no output_dir is given (or --in-place is set), files are renamed and tagged directly in the source directory instead of being moved into a separate Jellyfin library hierarchy. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 23:41:29 +01:00
dschlueter	67b8653b3c	Fix tagger and playlist to use underscore filename pattern After the organizer was updated to use underscores in filenames, the tagger (glob pattern "01 ") and playlist generator (pattern "01 title.") still used spaces and failed to find any files. Updated both to use "01_" / "01_title." patterns. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 23:34:16 +01:00
dschlueter	8a39f7c41e	Sanitize filenames: replace spaces/punctuation with underscores Replace all non-word characters (spaces, punctuation, brackets, etc.) with underscores in track titles, album names and artist names. Collapse consecutive underscores to one, strip leading/trailing ones. Umlauts (ä, ö, ü, ß) and digits are preserved. Also use underscore instead of space between track number and title. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 23:25:28 +01:00
dschlueter	f4e49a3df6	Add disc count validation before apply Check audio file count vs JSON track count per disc before processing. Aborts with a clear error showing which discs have discrepancies and whether tracks are missing from the JSON or audio files are missing from the directory. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 23:08:24 +01:00
dschlueter	b599c9eb8a	Fix default model, increase timeout, improve multi-column prompt - Change default text-LLM from llama3 (not installed) to gemma3:12b - Increase LLM timeout from 120s to 300s (large models need longer) - Add explicit multi-column layout instruction to vision prompt to prevent skipping columns on dense CD back-cover tracklists Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 22:56:02 +01:00
dschlueter	8765e991b0	Fix broken abcde command: build output_fmt before cmd construction cmd[-2] was overwriting the -a action value instead of the -o format value when -P was appended last. Now output_fmt is computed upfront and the cmd list is built cleanly without post-hoc index manipulation. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 21:10:11 +01:00
dschlueter	430775adf8	Add live progress, progress bar and CDDB logging to ripper - Replace capture_output=True with Popen+live streaming (_stream_abcde) - Show track counter: "Track 3/14 Title..." - Show cdparanoia progress bar: [████████░░░░░░░░░░░░░░░░░░░░░░] 45.2% 12.3 MB - Show CDDB album header and track list as they appear - Show tagging progress: "Tagging 14/14" - Print abcde command for full transparency - Collect CDDB track lines while streaming for later parsing - Log warnings when CDDB returns no data - Print full renamed file list after ripping Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 20:25:30 +01:00
dschlueter	d0d64da12e	Change default quality from medium to high Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 20:17:58 +01:00
dschlueter	f2d3684956	Fix file extraction: don't use abcde move, extract from temp dir ourselves abcde's move action + OUTPUTFORMAT config failed because shell variables like ${TRACKNUM} are evaluated immediately when the config is sourced. Instead: skip move, search abcde's internal temp dir (abcde.XXXX/trackNN.flac) and move files flat into output_dir ourselves. - Replace _get_audio_files/_write_abcde_config with _extract_tracks() - _rename_files() now matches track01.flac pattern (abcde naming) - Fallback rename to 01.flac etc. when no CDDB data available Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 20:15:50 +01:00
dschlueter	096be97ba8	Fix three ripper bugs: input cleanup, file move, recursive search - Add _clean_input(): strips ANSI escape codes (^[[D from arrow keys), control characters and surrounding quotes from user input - Add _write_abcde_config(): writes temp abcde config with OUTPUTDIR and flat OUTPUTFORMAT so files land in the right directory - Add 'move' action to abcde so encoded files are actually placed in OUTPUTDIR instead of staying in abcde's internal temp directory - Change _get_audio_files() to use rglob() (recursive) so files in abcde subdirectories are found - Improve error messages: include abcde output on failure Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 19:58:59 +01:00
dschlueter	3b6c37a32d	Add work-in-progress warning to README and BEDIENUNGSANLEITUNG Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 17:47:41 +01:00
dschlueter	92af4eeb9c	Add BEDIENUNGSANLEITUNG.md and update README.md Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 17:39:24 +01:00
dschlueter	851dbf3a46	Remove tests/ from repo, update .gitignore, improve ripper - Remove tests/ directory from version control (added to .gitignore) - Add .idea/ to .gitignore - Ripper: CDDB lookup, non-interactive mode, English UI, file renaming - Config: abcde format mapping, per-format quality options - CLI: English help texts, new --no-cddb / --pipes / --parallel / --quality options Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-17 17:35:34 +01:00
dschlueter	8ecade5cdc	Add --from-text mode and improve LLM parser robustness - Add --from-text/-t option to scan and process commands for pre-formatted tracklists (e.g. from Perplexity) - Refactor llm_parser to use Chat API instead of Generate API - Reuse _extract_json() from vision_llm for robust JSON extraction - Improve SYSTEM_PROMPT with strict rules (Various Artists, no invented years, no composer info in titles, /no_think) - Remove format:"json" constraint that caused empty responses Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 02:26:58 +01:00
dschlueter	3d91614e66	Add testdata/ to .gitignore Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 01:35:35 +01:00
dschlueter	1753ab204f	Add Vision-LLM mode for direct image-to-JSON extraction Tesseract OCR fails on rotated/low-contrast CD back covers. New vision_llm module sends images directly to qwen3-vl via Ollama chat API, bypassing OCR entirely. Robust JSON extraction handles thinking tags, markdown blocks, and empty responses. CLI scan/process commands gain --vision flag. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 01:35:05 +01:00
dschlueter	686c4317d1	Remove CLAUDE.md from version control File is now in .gitignore and kept only locally. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 01:02:49 +01:00
dschlueter	3e073250ca	Add project skeleton: CLI pipeline for CD digitization Modular Python package with Typer CLI (scan/apply/process commands), Pydantic data models, OCR via Tesseract, LLM-based tracklist parsing, mutagen audio tagging, M3U playlist generation, and cover processing. Includes 8 passing tests and ruff lint config. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 01:00:12 +01:00
dschlueter	225f6b3dbf	initial commit	2026-02-15 01:00:12 +01:00
dschlueter	a55bd8eabb	initial commit	2026-02-15 01:00:12 +01:00
dschlueter	c7d9a3f0dc	initial commit	2026-02-15 01:00:12 +01:00
Dieter Schlüter	036678cd07	Initial commit	2026-02-15 00:53:30 +01:00

49 commits
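
The filename sanitization rules described in commit 8a39f7c41e (non-word characters become underscores, runs of underscores collapse to one, leading and trailing underscores are stripped, umlauts and digits survive) can be sketched roughly as follows. The function name `sanitize_filename` appears in the log, but the body below is only an assumption based on that commit message, not the repository's actual code.

```python
import re


def sanitize_filename(name: str) -> str:
    """Sketch of the rules from the commit message (assumed implementation)."""
    # \W is Unicode-aware for str patterns in Python 3, so letters such as
    # ä, ö, ü, ß and digits count as word characters and are preserved.
    cleaned = re.sub(r"\W", "_", name)
    # Collapse consecutive underscores into a single one.
    cleaned = re.sub(r"_+", "_", cleaned)
    # Strip leading and trailing underscores.
    return cleaned.strip("_")


if __name__ == "__main__":
    print(sanitize_filename("Du berührst mi (20 Jahre)"))
```

Relying on `\W` keeps the rule short and handles non-ASCII letters without an explicit allow-list; a real implementation might additionally guard against empty results.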
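
Commit 0c0311b00f describes extracting the CDDB header content by searching between the `----` markers rather than stripping dashes from the whole line. A minimal sketch of that idea is shown below; the helper name `extract_cddb_header_title` and the exact pattern are hypothetical, and only the sample header line is taken from the commit message.

```python
import re
from typing import Optional

# Hypothetical pattern: text enclosed by two runs of four dashes.
HEADER_RE = re.compile(r"-{4}\s*(.+?)\s*-{4}")


def extract_cddb_header_title(line: str) -> Optional[str]:
    """Return the text between the ---- markers, or None if there are none."""
    match = HEADER_RE.search(line)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Sample header with a prefix before the dashes, as quoted in the log.
    print(extract_cddb_header_title("#1 (Musicbrainz): ---- Artist / Album ----"))
```

Because the pattern is searched for rather than anchored at the start of the line, a prefix such as `#1 (Musicbrainz):` is ignored. Splitting the result into artist and album is left out of this sketch.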