
Converting GGML->GGUF: ValueError: Only GGJTv3 supported #2990

Closed
@jboero

Description


Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

My GGML-converted models should be easy to convert to GGUF.
I know the conversion tools aren't guaranteed, but I'd like to file this in case anybody else has a workaround or a more version-flexible option. I would love to see every GGML/GGJT version supported if possible. Instead, my GGML files converted earlier are apparently not supported for conversion to GGUF.

Is there any tool to show the standard version details of a model file? Happy to contribute one if there isn't.
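
For what it's worth, the sketch below is the kind of thing I have in mind: it reads the first eight bytes of a model file and reports the container magic and version. The magic byte values are my reading of the GGML/GGMF/GGJT/GGUF magics used by llama.cpp, and the script itself is hypothetical, not an existing tool.

import struct
import sys

# Hypothetical helper (not part of llama.cpp): report the container format and
# version of a model file from its 8-byte header. The keys are the magics as
# they appear on disk (little-endian byte order).
MAGICS = {
    b'lmgg': ('GGML', False),   # 0x67676d6c, unversioned
    b'fmgg': ('GGMF', True),    # 0x67676d66, uint32 version follows
    b'tjgg': ('GGJT', True),    # 0x67676a74, uint32 version follows
    b'GGUF': ('GGUF', True),    # 0x46554747, uint32 version follows
}

def describe(path):
    with open(path, 'rb') as f:
        header = f.read(8)
    magic = header[:4]
    if magic not in MAGICS:
        return f'{path}: unknown magic {magic!r}'
    name, versioned = MAGICS[magic]
    if not versioned:
        return f'{path}: {name} (unversioned)'
    (version,) = struct.unpack('<I', header[4:8])
    return f'{path}: {name} v{version}'

if __name__ == '__main__':
    for arg in sys.argv[1:]:
        print(describe(arg))

Running it against the failing file should show whether it is unversioned GGML, GGMF, or GGJT v1/v2 rather than the GGJT v3 the converter expects.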

Current Behavior

python3 ./convert-llama-ggmlv3-to-gguf.py -i llama-2-70b/ggml-model-f32.bin -o test.gguf
=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===

* Scanning GGML input file
Traceback (most recent call last):
  File "[PATH]/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 353, in <module>
    main()
  File "[PATH]/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 335, in main
    offset = model.load(data, 0)
             ^^^^^^^^^^^^^^^^^^^
  File "[PATH]/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 125, in load
    offset += self.validate_header(data, offset)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[PATH]/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 121, in validate_header
    raise ValueError('Only GGJTv3 supported')
ValueError: Only GGJTv3 supported
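
For context, my understanding of the check that raises here is that it accepts only the final GGML-family container: the GGJT magic followed by version 3. A minimal stand-alone sketch of an equivalent check (the function name is mine; the magic bytes are an assumption based on the llama.cpp sources):

import struct

# Hypothetical equivalent of the header validation that raises above: accept
# only files that start with the GGJT magic ('ggjt', stored on disk
# little-endian as b'tjgg') followed by a uint32 version equal to 3.
def is_ggjt_v3(header: bytes) -> bool:
    magic, version = struct.unpack_from('<4sI', header)
    return magic == b'tjgg' and version == 3

Anything written by older convert commits (unversioned GGML, GGMF, or GGJT v1/v2) fails this check, which matches the ValueError seen above.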

Environment and Context

Working with models

  • Physical (or virtual) hardware you are using, e.g. for Linux:
    Physical Fedora 38; probably irrelevant given that this is Python.

$ lscpu

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  56
  On-line CPU(s) list:   0-55
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz
    CPU family:          6
    Model:               79
    Thread(s) per core:  2
    Core(s) per socket:  14
    Socket(s):           2
    Stepping:            1
    CPU(s) scaling MHz:  40%
    CPU max MHz:         3200.0000
    CPU min MHz:         1200.0000
    BogoMIPS:            3990.92
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts vnmi md_clear flush_l1d
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   896 KiB (28 instances)
  L1i:                   896 KiB (28 instances)
  L2:                    7 MiB (28 instances)
  L3:                    70 MiB (2 instances)
NUMA:                    
  NUMA node(s):          2
  NUMA node0 CPU(s):     0-13,28-41
  NUMA node1 CPU(s):     14-27,42-55
  • Operating System, e.g. for Linux:

$ uname -a
Linux z840 6.4.12-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Aug 23 17:46:49 UTC 2023 x86_64 GNU/Linux

  • SDK version, e.g. for Linux:
$ python3 --version
Python 3.11.4
$ make --version
$ g++ --version

Failure Information (for bugs)

Same command and traceback as shown under Current Behavior above.

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

  1. Convert any of the PTH models to GGML using an earlier (unversioned) commit of the convert script.
  2. Convert the resulting GGML file to GGUF with the command given above (a quick header check before this step is sketched after this list).
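
A pre-flight check between the two steps (an assumption on my part, not taken from the converter source): inspect the header of the step 1 output before running step 2, since files written by older convert commits carry unversioned GGML or GGMF/GGJT v1/v2 headers, which is exactly what the v3-only converter rejects.

import struct

# Hypothetical pre-flight check: confirm the intermediate file from step 1
# carries the GGJT v3 header that convert-llama-ggmlv3-to-gguf.py expects.
# The path is the one from this report.
with open('llama-2-70b/ggml-model-f32.bin', 'rb') as f:
    magic, version_field = struct.unpack('<4sI', f.read(8))
ok = (magic, version_field) == (b'tjgg', 3)
print(f'magic={magic!r} version_field={version_field} -> '
      f'{"OK for the v3 converter" if ok else "will be rejected"}')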
