Llama Factory fine-tuning error #807

Comments
Some libraries probably failed to install. For now we recommend transformers==4.45.0, which runs fine-tuning and inference reliably. [Important] Install the latest llamafactory and the corresponding libraries as follows.
I installed the versions of transformers and huggingface_hub that you mentioned, but I still get the same error when running.
Please post your pip list output.
Pull the latest code and run this line; some audio dependencies were not installed: pip install -e ".[torch,metrics,deepspeed,minicpm_v]"
@mkygogo Is it running now?
No. I pulled the latest llamafactory code and the problem is the same. My original install followed the pip install -e ".[torch,metrics,deepspeed,minicpm_v]" command from your docs, and redoing everything from scratch still fails. There was a warning during installation saying minicpm-v is not supported. Maybe I need to wait for a llamafactory update.
Many people have already fine-tuned successfully. Your pip list does not include torchaudio, which means the installation did not succeed; you may have nested environments. Try installing these libraries with python -m pip install:
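The nested-environment problem mentioned above (pip installing into a different interpreter than the one running llamafactory-cli) can be diagnosed from within Python. This is a minimal sketch, not part of LLaMA-Factory; `check_install` is a hypothetical helper:

```python
import importlib.util
import sys

def check_install(pkg: str) -> bool:
    """Return True if `pkg` is importable by the *current* interpreter."""
    return importlib.util.find_spec(pkg) is not None

# `python -m pip install <pkg>` forces pip to target this exact interpreter,
# avoiding the case where a bare `pip` belongs to another environment.
print("interpreter:", sys.executable)
print("torchaudio importable:", check_install("torchaudio"))
```

If `torchaudio importable: False` is printed even after installing, pip and the running interpreter almost certainly belong to different environments.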
Following the fine-tuning method in the README, fine-tuning fails with:
Setting num_proc from 16 back to 1 for the train split to disable multiprocessing as it only contains one shard.
Generating train split: 6 examples [00:00, 644.04 examples/s]
num_proc must be <= 6. Reducing num_proc to 6 for dataset of size 6.
Converting format of dataset (num_proc=6): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 44.81 examples/s]
num_proc must be <= 6. Reducing num_proc to 6 for dataset of size 6.
Running tokenizer on dataset (num_proc=6): 0%| | 0/6 [00:01<?, ? examples/s]
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/root/miniconda3/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 678, in _write_generator_to_queue
for i, result in enumerate(func(**kwargs)):
File "/root/miniconda3/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3476, in _map_single
batch = apply_function_on_filtered_inputs(
File "/root/miniconda3/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3338, in apply_function_on_filtered_inputs
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
File "/root/autodl-tmp/llamafactory/LLaMA-Factory-main/src/llamafactory/data/processors/supervised.py", line 107, in preprocess_supervised_dataset
input_ids, labels = _encode_supervised_example(
File "/root/autodl-tmp/llamafactory/LLaMA-Factory-main/src/llamafactory/data/processors/supervised.py", line 48, in _encode_supervised_example
messages = template.mm_plugin.process_messages(prompt + response, images, videos, processor)
File "/root/autodl-tmp/llamafactory/LLaMA-Factory-main/src/llamafactory/data/mm_plugin.py", line 433, in process_messages
image_processor: "BaseImageProcessor" = getattr(processor, "image_processor")
AttributeError: 'NoneType' object has no attribute 'image_processor'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/root/miniconda3/bin/llamafactory-cli", line 8, in <module>
sys.exit(main())
File "/root/autodl-tmp/llamafactory/LLaMA-Factory-main/src/llamafactory/cli.py", line 112, in main
run_exp()
File "/root/autodl-tmp/llamafactory/LLaMA-Factory-main/src/llamafactory/train/tuner.py", line 92, in run_exp
_training_function(config={"args": args, "callbacks": callbacks})
File "/root/autodl-tmp/llamafactory/LLaMA-Factory-main/src/llamafactory/train/tuner.py", line 66, in _training_function
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/root/autodl-tmp/llamafactory/LLaMA-Factory-main/src/llamafactory/train/sft/workflow.py", line 51, in run_sft
dataset_module = get_dataset(template, model_args, data_args, training_args, stage="sft", **tokenizer_module)
File "/root/autodl-tmp/llamafactory/LLaMA-Factory-main/src/llamafactory/data/loader.py", line 269, in get_dataset
dataset = _get_preprocessed_dataset(
File "/root/autodl-tmp/llamafactory/LLaMA-Factory-main/src/llamafactory/data/loader.py", line 204, in _get_preprocessed_dataset
dataset = dataset.map(
File "/root/miniconda3/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 560, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3165, in map
for rank, done, content in iflatmap_unordered(
File "/root/miniconda3/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 718, in iflatmap_unordered
[async_result.get(timeout=0.05) for async_result in async_results]
File "/root/miniconda3/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 718, in <listcomp>
[async_result.get(timeout=0.05) for async_result in async_results]
File "/root/miniconda3/lib/python3.10/site-packages/multiprocess/pool.py", line 774, in get
raise self._value
AttributeError: 'NoneType' object has no attribute 'image_processor'
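The root cause in the traceback is that `mm_plugin.py` receives `processor=None` (the multimodal processor never loaded because its dependencies were missing), and `getattr` on `None` raises. A minimal reproduction, plus an illustrative guard (this `get_image_processor` helper is hypothetical, not LLaMA-Factory's actual fix):

```python
def get_image_processor(processor):
    """Fetch the image processor, failing with a clearer message when the
    multimodal processor never loaded (hypothetical defensive guard)."""
    if processor is None:
        raise ValueError(
            "processor is None -- multimodal dependencies (e.g. torchaudio) "
            "are likely missing; reinstall with the minicpm_v extras."
        )
    return getattr(processor, "image_processor")

# Reproducing the original error:
try:
    getattr(None, "image_processor")
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'image_processor'
```

This is why the fix discussed above is an installation fix rather than a code fix: once the audio/vision dependencies import cleanly, the processor is constructed and the attribute exists.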