Description
I ran the same script as before with v0.8.0, now with v0.9.0 (installed as usual via pip install).
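To double-check which version is actually active in the venv (assuming the package exposes `__version__`, as most packages do):

```python
# Quick sanity check of the installed version.
# Assumption: auto_round exposes __version__ like most packages.
import auto_round

print(auto_round.__version__)
```

With v0.9.0 the same run now fails: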
```
2025-11-15 09:02:01 WARNING __init__.py L22: AutoScheme is currently supported only on Linux.
2025-11-15 09:02:05 WARNING __init__.py L22: AutoScheme is currently supported only on Linux.
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "c:\Users\xxx\Documents\python\autoround\venv\Scripts\auto_round_mllm.exe\__main__.py", line 2, in <module>
ImportError: cannot import name 'run_mllm' from 'auto_round.__main__' (C:\Users\xxx\Documents\python\autoround\venv\Lib\site-packages\auto_round\__main__.py)
```

Process finished with exit code 1.
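To narrow it down, one could list what the installed CLI module actually exports (a diagnostic sketch; I'm assuming the `auto_round_mllm` console script imports `run_mllm` from `auto_round.__main__`, as the traceback path suggests):

```python
# Diagnostic sketch: show the public names in auto_round's CLI module to
# check whether 'run_mllm' still exists in v0.9.0.
# Assumption: the auto_round_mllm entry point imports from auto_round.__main__.
import importlib

cli = importlib.import_module("auto_round.__main__")
print(sorted(n for n in dir(cli) if not n.startswith("_")))
```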
auto_round.exe seems to work ...
Is it possible to show me parameters for gguf:q2_k_s on ~14-32B models, for my system (16 GB VRAM + 64 GB RAM)?
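For reference, this is roughly what I would try myself; the model path is a placeholder and the values are pure guesses on my side, please correct them:

```python
# My guess for a ~14-32B model on 16 GB VRAM + 64 GB RAM.
# All values are assumptions, not verified; the model path is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_path = "e:/models/some-14b-model"  # placeholder
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

autoround = AutoRound(
    model,
    tokenizer,
    bits=2,
    group_size=16,
    sym=False,
    batch_size=1,             # smaller batch to fit in 16 GB VRAM?
    nsamples=256,             # fewer samples to cut runtime and memory?
    low_gpu_mem_usage=True,   # lean on the 64 GB system RAM?
    enable_alg_ext=True,
)
autoround.quantize_and_save("tmp_q2ks", format="gguf:q2_k_s")
```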
Of course I can wait 2 hours; is it possible to get more output while it's calculating ...?
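Or would raising the log level already be enough? A minimal sketch, assuming auto_round logs through Python's standard logging module (the WARNING lines above suggest it does):

```python
# Sketch: raise verbosity before quantizing.
# Assumption: auto_round uses the standard logging module, as the
# WARNING lines in the output suggest.
import logging

logging.basicConfig(level=logging.DEBUG)
logging.getLogger("auto_round").setLevel(logging.DEBUG)
```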
THX
For Q2 I set enable_alg_ext=True, right? This is the script I'm running:
```python
# --- Import required modules ---
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound
# from auto_round_mllm import Autoround
from pathlib import Path

import torch

# --- Base directory: location of this script ---
script_dir = Path(__file__).parent

# --- Define the model directory (absolute path) ---
# Note: because the right-hand side is an absolute path, it replaces
# script_dir entirely when joined with pathlib's "/" operator.
model_dir = script_dir / "e:/NextCoder-7B/safetensors"

# --- Check that the directory exists ---
if not model_dir.exists():
    print(f"❌ The path '{model_dir}' does not exist.")
elif not model_dir.is_dir():
    print(f"⚠️ The path '{model_dir}' is not a directory.")
else:
    print(f"📁 Contents of the directory: {model_dir}\n")
    files = [f for f in model_dir.iterdir() if f.is_file()]
    if not files:
        print("ℹ️ No files found in the directory.")
    else:
        for file in files:
            print(f" - {file.name}")

# --- Load the model ---
model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_dir)

# --- Initialize AutoRound ---
bits, group_size, sym = 2, 16, False
autoround = AutoRound(
    model,
    tokenizer,
    bits=bits,
    group_size=group_size,
    sym=sym,
    batch_size=4,
    nsamples=512,
    iters=0,
    low_gpu_mem_usage=False,
    disable_opt_rtn=True,
    enable_alg_ext=True,
)

# --- Output directory: same folder as this script ---
output_dir = script_dir / "tmp_auto_next"

# Supported formats: "auto_round" (default), "auto_gptq", "auto_awq",
# "llm_compressor", "gguf:q4_k_m", "gguf:q2_k_s", etc.

# --- Quantize and save ---
autoround.quantize_and_save(output_dir, format="gguf:q2_k_s")
```