feat: /plan, /cancel, /continue, /discard + Context 262144 + KV-Cache q4_0
- New commands: /plan (planning mode, PLAN.md only), /cancel (abort the loop), /continue (resume after an interruption), /discard (discard PLAN.md)
- contextWindow in models.json and on the llama.cpp servers: 131072 → 262144
- KV cache: q8_0 → q4_0 (less VRAM, fits the 262k context on 2× 3090)
- parallel: 2 → 1 for the coder (more stable with a large context)
- Optimize status with an ASCII progress bar + blocker preview
- cancelRequested flag is checked after each loop step

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent b19c189e2e
commit 4a31535b76
4 changed files with 126 additions and 26 deletions
@@ -54,7 +54,7 @@
 "name": "Qwen3.6 27B Coder (llama.cpp :8001)",
 "reasoning": true,
 "input": ["text"],
-"contextWindow": 131072,
+"contextWindow": 262144,
 "maxTokens": 16384,
 "cost": {
 "input": 0,
@@ -82,7 +82,7 @@
 "name": "Qwen3.6 27B Judge (llama.cpp :8002)",
 "reasoning": true,
 "input": ["text"],
-"contextWindow": 131072,
+"contextWindow": 262144,
 "maxTokens": 8192,
 "cost": {
 "input": 0,
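To see why the change from q8_0 to q4_0 goes together with the doubled contextWindow, here is a rough back-of-the-envelope sketch. It assumes that the attention KV cache grows linearly with context length and that K and V use the same cache type. The block sizes are those of the ggml q8_0 and q4_0 formats. The helper names are hypothetical, and the calculation ignores the change of parallel from 2 to 1 and any model state that does not scale with context.

```python
# ggml stores quantized values in blocks of 32:
#   q8_0: 2-byte fp16 scale + 32 one-byte values   = 34 bytes per block
#   q4_0: 2-byte fp16 scale + 16 bytes of 4-bit values = 18 bytes per block
BLOCK_VALUES = 32
BLOCK_BYTES = {"q8_0": 34, "q4_0": 18}


def bytes_per_value(cache_type: str) -> float:
    """Average storage per cached value for a given cache type."""
    return BLOCK_BYTES[cache_type] / BLOCK_VALUES


def relative_kv_size(old_ctx: int, old_type: str, new_ctx: int, new_type: str) -> float:
    """KV cache size after the change, relative to before (1.0 = unchanged)."""
    old = old_ctx * bytes_per_value(old_type)
    new = new_ctx * bytes_per_value(new_type)
    return new / old


ratio = relative_kv_size(131072, "q8_0", 262144, "q4_0")
print(f"{ratio:.3f}")  # 1.059
```

Under these assumptions the 262144-token cache in q4_0 takes about 6% more memory than the 131072-token cache in q8_0, so the doubled context costs almost no extra VRAM for the KV cache.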