Commit graph

71 commits

Author SHA1 Message Date
67b8653b3c Fix tagger and playlist to use underscore filename pattern
After the organizer was updated to use underscores in filenames,
the tagger (glob pattern "01 *") and playlist generator (pattern
"01 title.*") still used spaces and failed to find any files.
Updated both to use "01_*" / "01_title.*" patterns.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 23:34:16 +01:00
8a39f7c41e Sanitize filenames: replace spaces/punctuation with underscores
Replace all non-word characters (spaces, punctuation, brackets, etc.)
with underscores in track titles, album names and artist names.
Collapse consecutive underscores to one, strip leading/trailing ones.
Umlauts (ä, ö, ü, ß) and digits are preserved.

Also use underscore instead of space between track number and title.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 23:25:28 +01:00
f4e49a3df6 Add disc count validation before apply
Check audio file count vs JSON track count per disc before processing.
Aborts with a clear error showing which discs have discrepancies and
whether tracks are missing from the JSON or audio files are missing
from the directory.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 23:08:24 +01:00
b599c9eb8a Fix default model, increase timeout, improve multi-column prompt
- Change default text-LLM from llama3 (not installed) to gemma3:12b
- Increase LLM timeout from 120s to 300s (large models need longer)
- Add explicit multi-column layout instruction to vision prompt to
  prevent skipping columns on dense CD back-cover tracklists

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 22:56:02 +01:00
8765e991b0 Fix broken abcde command: build output_fmt before cmd construction
cmd[-2] was overwriting the -a action value instead of the -o format
value when -P was appended last. Now output_fmt is computed upfront and
the cmd list is built cleanly without post-hoc index manipulation.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 21:10:11 +01:00
430775adf8 Add live progress, progress bar and CDDB logging to ripper
- Replace capture_output=True with Popen+live streaming (_stream_abcde)
- Show track counter:  "Track 3/14  Title..."
- Show cdparanoia progress bar: [████████░░░░░░░░░░░░░░░░░░░░░░]  45.2%  12.3 MB
- Show CDDB album header and track list as they appear
- Show tagging progress: "Tagging 14/14"
- Print abcde command for full transparency
- Collect CDDB track lines while streaming for later parsing
- Log warnings when CDDB returns no data
- Print full renamed file list after ripping

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 20:25:30 +01:00
d0d64da12e Change default quality from medium to high
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 20:17:58 +01:00
f2d3684956 Fix file extraction: don't use abcde move, extract from temp dir ourselves
abcde's move action + OUTPUTFORMAT config failed because shell variables
like ${TRACKNUM} are evaluated immediately when the config is sourced.
Instead: skip move, search abcde's internal temp dir (abcde.XXXX/trackNN.flac)
and move files flat into output_dir ourselves.

- Replace _get_audio_files/_write_abcde_config with _extract_tracks()
- _rename_files() now matches track01.flac pattern (abcde naming)
- Fallback rename to 01.flac etc. when no CDDB data available

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 20:15:50 +01:00
096be97ba8 Fix three ripper bugs: input cleanup, file move, recursive search
- Add _clean_input(): strips ANSI escape codes (^[[D from arrow keys),
  control characters and surrounding quotes from user input
- Add _write_abcde_config(): writes temp abcde config with OUTPUTDIR
  and flat OUTPUTFORMAT so files land in the right directory
- Add 'move' action to abcde so encoded files are actually placed in
  OUTPUTDIR instead of staying in abcde's internal temp directory
- Change _get_audio_files() to use rglob() (recursive) so files in
  abcde subdirectories are found
- Improve error messages: include abcde output on failure

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 19:58:59 +01:00
3b6c37a32d Add work-in-progress warning to README and BEDIENUNGSANLEITUNG
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 17:47:41 +01:00
92af4eeb9c Add BEDIENUNGSANLEITUNG.md and update README.md
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 17:39:24 +01:00
851dbf3a46 Remove tests/ from repo, update .gitignore, improve ripper
- Remove tests/ directory from version control (added to .gitignore)
- Add .idea/ to .gitignore
- Ripper: CDDB lookup, non-interactive mode, English UI, file renaming
- Config: abcde format mapping, per-format quality options
- CLI: English help texts, new --no-cddb / --pipes / --parallel / --quality options

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 17:35:34 +01:00
8ecade5cdc Add --from-text mode and improve LLM parser robustness
- Add --from-text/-t option to scan and process commands for
  pre-formatted tracklists (e.g. from Perplexity)
- Refactor llm_parser to use Chat API instead of Generate API
- Reuse _extract_json() from vision_llm for robust JSON extraction
- Improve SYSTEM_PROMPT with strict rules (Various Artists, no
  invented years, no composer info in titles, /no_think)
- Remove format:"json" constraint that caused empty responses

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:26:58 +01:00
3d91614e66 Add testdata/ to .gitignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:35:35 +01:00
1753ab204f Add Vision-LLM mode for direct image-to-JSON extraction
Tesseract OCR fails on rotated/low-contrast CD back covers.
New vision_llm module sends images directly to qwen3-vl via
Ollama chat API, bypassing OCR entirely. Robust JSON extraction
handles thinking tags, markdown blocks, and empty responses.
CLI scan/process commands gain --vision flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:35:05 +01:00
686c4317d1 Remove CLAUDE.md from version control
File is now in .gitignore and kept only locally.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:02:49 +01:00
3e073250ca Add project skeleton: CLI pipeline for CD digitization
Modular Python package with Typer CLI (scan/apply/process commands),
Pydantic data models, OCR via Tesseract, LLM-based tracklist parsing,
mutagen audio tagging, M3U playlist generation, and cover processing.
Includes 8 passing tests and ruff lint config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:00:12 +01:00
225f6b3dbf initial commit 2026-02-15 01:00:12 +01:00
a55bd8eabb initial commit 2026-02-15 01:00:12 +01:00
c7d9a3f0dc initial commit 2026-02-15 01:00:12 +01:00
036678cd07 Initial commit 2026-02-15 00:53:30 +01:00