Bitsandbytes error while running train.py #76

@Ruiwen505


Hello,
I was trying to run the Python script with the following command:
python3 fsdp_qlora/train.py --world_size 2 --model_name cache/nvidia/Llama-3_3-Nemotron-49b --gradient_accumulation_steps 4 --batch_size 8 --context_length 512 --precision bf16 --train_type qlora --reentrant_checkpointing true --use_gradient_checkpointing true --use_cpu_offload false --use_activation_cpu_offload false --log_to wandb --dataset data/output_M_Cap_fulltext.json --save_model True --output_dir outputs/nemotron-49b-finetuned

However, I got this error:

Could not load bitsandbytes native library: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_cuda128.so)
Traceback (most recent call last):
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/cextension.py", line 85, in <module>
    lib = get_native_library()
          ^^^^^^^^^^^^^^^^^^^^
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/cextension.py", line 72, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/all/Anaconda3/2024.02-1/lib/python3.11/ctypes/__init__.py", line 454, in LoadLibrary
    return self._dlltype(name)
           ^^^^^^^^^^^^^^^^^^^
  File "/apps/all/Anaconda3/2024.02-1/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_cuda128.so)

CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues
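
The first error above says that the prebuilt libbitsandbytes_cuda128.so requires GLIBC_2.34, which the system /lib64/libc.so.6 does not provide. A minimal sketch for checking which glibc version the interpreter reports, using only the standard library. The helper names are illustrative, and the 2.34 threshold is simply copied from the error message:

```python
import platform

# Minimum glibc version, taken from the error message above.
REQUIRED_GLIBC = (2, 34)


def parse_version(text):
    """Turn a dotted version such as '2.28' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_satisfies(required=REQUIRED_GLIBC):
    """Return (ok, version_string).

    ok is None when the C library is not glibc or its version is unknown.
    """
    name, version = platform.libc_ver()
    if name != "glibc" or not version:
        return None, version
    return parse_version(version) >= required, version


if __name__ == "__main__":
    ok, version = glibc_satisfies()
    print(f"glibc version: {version or 'unknown'}; satisfies 2.34: {ok}")
```

If this reports a version below 2.34, the prebuilt binary cannot be loaded on this system as it stands, regardless of LD_LIBRARY_PATH settings for CUDA.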

Traceback (most recent call last):
  File "/mnt/proj2/dd-24-61/chatbot/axolotl/fsdp_qlora/train.py", line 86, in <module>
    from transformers.models.llama.modeling_llama import (
ImportError: cannot import name 'LLAMA_ATTENTION_CLASSES' from 'transformers.models.llama.modeling_llama' (/home/it4i-chang505/.local/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py)
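
This ImportError is a separate problem from the bitsandbytes one: the installed transformers does not export LLAMA_ATTENTION_CLASSES, which train.py imports at line 86. That most likely points to a version mismatch between the script and the installed transformers package. A small sketch to confirm it without running the whole script; has_symbol is a hypothetical helper, not part of any library:

```python
import importlib


def has_symbol(module_name, symbol):
    """Return True if `module_name` imports cleanly and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module missing, or one of its own imports failed.
        return False
    return hasattr(module, symbol)


if __name__ == "__main__":
    module_name = "transformers.models.llama.modeling_llama"
    found = has_symbol(module_name, "LLAMA_ATTENTION_CLASSES")
    print("LLAMA_ATTENTION_CLASSES available:", found)
    try:
        import transformers
        print("transformers version:", transformers.__version__)
    except ImportError:
        print("transformers is not installed")
```

The printed version can then be compared against whatever version the fsdp_qlora repository was written for.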

When I run python -m bitsandbytes in the terminal, I get:

Could not load bitsandbytes native library: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_cuda128.so)
Traceback (most recent call last):
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/cextension.py", line 85, in <module>
    lib = get_native_library()
          ^^^^^^^^^^^^^^^^^^^^
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/cextension.py", line 72, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/proj2/dd-24-61/conda/conda_env/myenv/lib/python3.11/ctypes/__init__.py", line 454, in LoadLibrary
    return self._dlltype(name)
           ^^^^^^^^^^^^^^^^^^^
  File "/mnt/proj2/dd-24-61/conda/conda_env/myenv/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_cuda128.so)

CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
CUDA specs: CUDASpecs(highest_compute_capability=(8, 0), cuda_version_string='128', cuda_version_tuple=(12, 8))
PyTorch settings found: CUDA_VERSION=128, Highest Compute Capability: (8, 0).
To manually override the PyTorch CUDA version please see: https://github.com/TimDettmers/bitsandbytes/blob/main/docs/source/nonpytorchcuda.mdx
The directory listed in your path is found to be non-existent: 1;/opt/clmgr/sbin
The directory listed in your path is found to be non-existent: 2;/opt/clmgr/bin
The directory listed in your path is found to be non-existent: 2;/opt/sgi/sbin
The directory listed in your path is found to be non-existent: 2;/opt/sgi/bin
The directory listed in your path is found to be non-existent: 2;/apps/all/Anaconda3/2024.02-1/bin
The directory listed in your path is found to be non-existent: 1;/apps/all/Anaconda3/2024.02-1/condabin
The directory listed in your path is found to be non-existent: 1;/usr/share/Modules/bin
The directory listed in your path is found to be non-existent: 1;/usr/local/bin
The directory listed in your path is found to be non-existent: 4;/usr/bin
The directory listed in your path is found to be non-existent: 1;/usr/local/sbin
The directory listed in your path is found to be non-existent: 1;/usr/sbin
The directory listed in your path is found to be non-existent: 1;/opt/c3/bin
The directory listed in your path is found to be non-existent: 1;/sbin
The directory listed in your path is found to be non-existent: 2;/bin
The directory listed in your path is found to be non-existent: 2;/opt/slurm/bin
The directory listed in your path is found to be non-existent: 2;/opt/icommands
The directory listed in your path is found to be non-existent: 1;/mnt/proj2/dd-24-61/ollama/bin
The directory listed in your path is found to be non-existent: 1;/home/it4i-chang505/bin
The directory listed in your path is found to be non-existent: -DCMAKE_AR=/mnt/proj2/dd-24-61/conda/conda_env/myenv/bin/x86_64-conda-linux-gnu-ar -DCMAKE_CXX_COMPILER_AR=/mnt/proj2/dd-24-61/conda/conda_env/myenv/bin/x86_64-conda-linux-gnu-gcc-ar -DCMAKE_C_COMPILER_AR=/mnt/proj2/dd-24-61/conda/conda_env/myenv/bin/x86_64-conda-linux-gnu-gcc-ar -DCMAKE_RANLIB=/mnt/proj2/dd-24-61/conda/conda_env/myenv/bin/x86_64-conda-linux-gnu-ranlib -DCMAKE_CXX_COMPILER_RANLIB=/mnt/proj2/dd-24-61/conda/conda_env/myenv/bin/x86_64-conda-linux-gnu-gcc-ranlib -DCMAKE_C_COMPILER_RANLIB=/mnt/proj2/dd-24-61/conda/conda_env/myenv/bin/x86_64-conda-linux-gnu-gcc-ranlib -DCMAKE_LINKER=/mnt/proj2/dd-24-61/conda/conda_env/myenv/bin/x86_64-conda-linux-gnu-ld -DCMAKE_STRIP=/mnt/proj2/dd-24-61/conda/conda_env/myenv/bin/x86_64-conda-linux-gnu-strip -DCMAKE_BUILD_TYPE=Release
The directory listed in your path is found to be non-existent: -D_DEBUG -D_FORTIFY_SOURCE=2 -Og -isystem /mnt/proj2/dd-24-61/conda/conda_env/myenv/include
The directory listed in your path is found to be non-existent: -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fvar-tracking-assignments -ffunction-sections -pipe -isystem /mnt/proj2/dd-24-61/conda/conda_env/myenv/include
The directory listed in your path is found to be non-existent: -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /mnt/proj2/dd-24-61/conda/conda_env/myenv/include -I/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/include -L/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/lib -L/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/lib/stubs
The directory listed in your path is found to be non-existent: -fvisibility-inlines-hidden -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /mnt/proj2/dd-24-61/conda/conda_env/myenv/include -I/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/include -L/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/lib -L/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/lib/stubs
The directory listed in your path is found to be non-existent: 1;/apps/modules/base
The directory listed in your path is found to be non-existent: 1;/apps/modules/bio
The directory listed in your path is found to be non-existent: 1;/apps/modules/cae
The directory listed in your path is found to be non-existent: 1;/apps/modules/chem
The directory listed in your path is found to be non-existent: 1;/apps/modules/compiler
The directory listed in your path is found to be non-existent: 1;/apps/modules/data
The directory listed in your path is found to be non-existent: 1;/apps/modules/debugger
The directory listed in your path is found to be non-existent: 1;/apps/modules/devel
The directory listed in your path is found to be non-existent: 1;/apps/modules/geo
The directory listed in your path is found to be non-existent: 1;/apps/modules/lang
The directory listed in your path is found to be non-existent: 1;/apps/modules/lib
The directory listed in your path is found to be non-existent: 1;/apps/modules/math
The directory listed in your path is found to be non-existent: 1;/apps/modules/mpi
The directory listed in your path is found to be non-existent: 1;/apps/modules/numlib
The directory listed in your path is found to be non-existent: 1;/apps/modules/perf
The directory listed in your path is found to be non-existent: 1;/apps/modules/phys
The directory listed in your path is found to be non-existent: 1;/apps/modules/system
The directory listed in your path is found to be non-existent: 1;/apps/modules/toolchain
The directory listed in your path is found to be non-existent: 1;/apps/modules/tools
The directory listed in your path is found to be non-existent: 1;/apps/modules/vis
The directory listed in your path is found to be non-existent: 1;/opt/cray/pe/craype-targets/1.4.0/modulefiles
The directory listed in your path is found to be non-existent: 1;/apps/all/intel-compilers/2023.2.1/modulefiles
The directory listed in your path is found to be non-existent: 1;/opt/cray/pe/modulefiles
The directory listed in your path is found to be non-existent: 1;/apps/all/Lmod/8.7.37/modulefiles/Linux
The directory listed in your path is found to be non-existent: 1;/apps/all/Lmod/8.7.37/modulefiles/Core
The directory listed in your path is found to be non-existent: 1;/apps/all/Lmod/8.7.37/lmod/lmod/modulefiles/Core
The directory listed in your path is found to be non-existent: /usr/libexec/openssh/gnome-ssh-askpass
The directory listed in your path is found to be non-existent: //10.48.3.2
The directory listed in your path is found to be non-existent: -ccbin=/mnt/proj2/dd-24-61/conda/conda_env/myenv/bin/x86_64-conda-linux-gnu-c++
The directory listed in your path is found to be non-existent: XALT/3.0.2
The directory listed in your path is found to be non-existent: -fvisibility-inlines-hidden -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fvar-tracking-assignments -ffunction-sections -pipe -isystem /mnt/proj2/dd-24-61/conda/conda_env/myenv/include
The directory listed in your path is found to be non-existent: /opt/clmgr/man
The directory listed in your path is found to be non-existent: /opt/clmgr/man
The directory listed in your path is found to be non-existent: -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /mnt/proj2/dd-24-61/conda/conda_env/myenv/include -I/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/include -L/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/lib -L/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/lib/stubs
The directory listed in your path is found to be non-existent: /apps/all/Lmod/8.7.37/modulefiles/Linux
The directory listed in your path is found to be non-existent: /apps/all/Lmod/8.7.37/modulefiles/Core
The directory listed in your path is found to be non-existent: -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,/mnt/proj2/dd-24-61/conda/conda_env/myenv/lib -Wl,-rpath-link,/mnt/proj2/dd-24-61/conda/conda_env/myenv/lib -L/mnt/proj2/dd-24-61/conda/conda_env/myenv/lib -L/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/lib -L/mnt/proj2/dd-24-61/conda/conda_env/myenv/targets/x86_64-linux/lib/stubs
The directory listed in your path is found to be non-existent: /apps/all/Lmod/8.7.37/modulefiles
The directory listed in your path is found to be non-existent: (/mnt/proj2/dd-24-61/conda/conda_env/myenv) [\u@\h.karolina \W]$
The directory listed in your path is found to be non-existent: //debuginfod.centos.org/
The directory listed in your path is found to be non-existent: /apps/all/XALT/3.0.2/etc
The directory listed in your path is found to be non-existent: /mnt/proj2/dd-24-61/conda/conda_env/myenv/etc/xml/catalog file
CUDA SETUP: WARNING! CUDA runtime files not found in any environmental path.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Checking that the library is importable and CUDA is callable...
Couldn't load the bitsandbytes library, likely due to missing binaries.
Please ensure bitsandbytes is properly installed.

For source installations, compile the binaries with cmake -DCOMPUTE_BACKEND=cuda -S ..
See the documentation for more details if needed.

Trying a simple check anyway, but this will likely fail...
Traceback (most recent call last):
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/diagnostics/main.py", line 66, in main
    sanity_check()
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/diagnostics/main.py", line 40, in sanity_check
    adam.step()
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/torch/optim/optimizer.py", line 485, in wrapper
    out = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/optim/optimizer.py", line 291, in step
    self.update_step(group, p, gindex, pindex)
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/optim/optimizer.py", line 521, in update_step
    F.optimizer_update_32bit(
  File "/home/it4i-chang505/.local/lib/python3.11/site-packages/bitsandbytes/functional.py", line 1572, in optimizer_update_32bit
    optim_func = str2optimizer32bit[optimizer_name][0]
                 ^^^^^^^^^^^^^^^^^^
NameError: name 'str2optimizer32bit' is not defined
Above we output some debug information.
Please provide this info when creating an issue via https://github.com/TimDettmers/bitsandbytes/issues/new/choose
WARNING: Please be sure to sanitize sensitive info from the output before posting it.
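
One detail in the tracebacks above: bitsandbytes, torch and transformers are all loaded from ~/.local/lib/python3.11/site-packages, while the interpreter itself comes from the Anaconda base install in the first run and from the conda environment myenv in the second. A sketch for listing where each package resolves from, in case the per-user site-packages directory is shadowing the environment. The helper names are illustrative, and the package list is an assumption based on the paths in the logs:

```python
import importlib.util
import site
import sys


def package_origin(name):
    """Return the file a top-level package would be loaded from, or None.

    find_spec only locates the package; it does not import it.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


def from_user_site(path):
    """True if `path` lies under the per-user site-packages directory."""
    return path is not None and path.startswith(site.getusersitepackages())


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("user site enabled:", site.ENABLE_USER_SITE)
    for name in ("bitsandbytes", "torch", "transformers"):
        origin = package_origin(name)
        print(f"{name}: {origin} (user site: {from_user_site(origin)})")
```

Running this with the same python that launches train.py shows whether the packages come from the environment or from ~/.local.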

Could someone please take a look? Thank you in advance.
