This document lists the features on LMFlow's roadmap. We welcome discussion of, and contributions to, specific features in the related Issues/PRs. 🤗
Main Features
- `chatbot.py` upgrade (upgrade Conversation_template #917)

Usability

- Inference method auto-downgrading (vllm > ds, etc.), and make the `vllm` package optional (see the first sketch after this list)
- Merge similar model methods into `hf_model_mixin`
- Set `torch_dtype='bfloat16'` when `bf16` is specified, etc. (`bf16` is in `FinetunerArguments` but `torch_dtype` is in `ModelArguments`, so this cannot be handled in `__post_init__()`; see the second sketch after this list)
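Below is a minimal sketch of what auto-downgrading with an optional `vllm` dependency could look like: probe for the package at import time and fall back through the chain. The helper name and backend labels are illustrative assumptions, not LMFlow's actual API.

```python
import importlib.util

def choose_inference_backend(preferred: str = "vllm") -> str:
    """Pick the best available backend, downgrading gracefully.

    Preference order (an assumption for this sketch): vllm -> deepspeed
    -> plain HuggingFace. Probing with find_spec keeps `vllm` an
    optional dependency instead of a hard import.
    """
    if preferred == "vllm" and importlib.util.find_spec("vllm") is not None:
        return "vllm"
    if importlib.util.find_spec("deepspeed") is not None:
        return "deepspeed"  # downgrade: vllm unavailable
    return "huggingface"  # final fallback, always present
```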
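Because `bf16` and `torch_dtype` live in different dataclasses, neither `__post_init__()` can see both fields; one option is a post-parse sync step. The sketch below uses the field and class names from the item above, but the hook itself (`sync_dtype`) is an assumption, not LMFlow's actual code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelArguments:
    torch_dtype: Optional[str] = None  # e.g. 'float16', 'bfloat16'

@dataclass
class FinetunerArguments:
    bf16: bool = False

def sync_dtype(model_args: ModelArguments,
               finetuner_args: FinetunerArguments) -> None:
    """Cross-dataclass consistency check, run after both dataclasses are
    parsed (e.g. after HfArgumentParser.parse_args_into_dataclasses()).
    Neither dataclass can see the other inside __post_init__(), so the
    rule has to live in a post-parse hook like this hypothetical one."""
    if finetuner_args.bf16 and model_args.torch_dtype is None:
        model_args.torch_dtype = "bfloat16"
    elif finetuner_args.bf16 and model_args.torch_dtype != "bfloat16":
        raise ValueError(
            f"bf16=True conflicts with torch_dtype={model_args.torch_dtype!r}"
        )
```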
Bug fixes

- `model.generate()` with dsz3 ([BUG] The text cannot be generated successfully during the Raft step #861)
- `merge_lora`: merging LoRA weights given an absolute path
- `load_dataset` long data fix ([Bug Fix] update load_dataset to support long data #878)
- `create_copied_dataclass` compatibility when Python version >= 3.10 (`kw_only` issue; see the sketch after this list) ([BUG] TypeError: Field.__init__() missing 1 required positional argument: 'kw_only' #903, [usability] deps streamlining #905)
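For context on the `kw_only` failure: Python 3.10 added a required `kw_only` parameter to `dataclasses.Field.__init__`, so any helper that rebuilds fields directly breaks across versions. A hedged sketch of a version-guarded field copy follows; the helper is illustrative, not the actual `create_copied_dataclass` implementation.

```python
import sys
from dataclasses import Field, field

def copy_field(f: Field) -> Field:
    """Rebuild a dataclass field, guarding the 3.10+ `kw_only` argument.

    Constructing Field() positionally breaks across versions because
    Python 3.10 added a required `kw_only` parameter to Field.__init__
    (the TypeError reported in #903); going through the public field()
    factory with a version check sidesteps it.
    """
    kwargs = dict(
        default=f.default,
        default_factory=f.default_factory,
        init=f.init,
        repr=f.repr,
        hash=f.hash,
        compare=f.compare,
        metadata=f.metadata,
    )
    if sys.version_info >= (3, 10):
        kwargs["kw_only"] = f.kw_only  # attribute only exists on 3.10+
    return field(**kwargs)
```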
Issues left over from history

- `use_accelerator` -> `use_accelerate` typo fix (with the Accelerate support PR)
- `model_args.use_lora` leads to truncation of the sequence, mentioned in [Feature] reward model inferencer and dpov2 aligner #867

Documentation

- Note on multiple-instance inference (see the sketch below): in vLLM inference, the number of attention heads must be divisible by the vLLM tensor parallel size. For a model with 14 attention heads, the workable options for tp are 1 and 2 (7 causes another division issue, though I forget which). With 8 GPUs, fully utilizing the devices therefore requires multiple vLLM instances (tp=1 -> 8 instances, tp=2 -> 4 instances). The same applies to reward model inference and any other inference pipeline.
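A rough sketch of the multi-instance arithmetic and launch pattern described above. The GPU partitioning via `CUDA_VISIBLE_DEVICES` is the standard approach; the `inference.py` script and its flags are placeholders, not LMFlow's CLI.

```python
import os
import subprocess

NUM_GPUS = 8
NUM_ATTN_HEADS = 14  # the 14-head model from the note above

# tp must divide both the head count and the GPU count; per the note,
# tp=7 trips a further division issue, so candidates are capped at 2
# here (an assumption for this sketch).
valid_tp = [tp for tp in (1, 2)
            if NUM_ATTN_HEADS % tp == 0 and NUM_GPUS % tp == 0]
tp = max(valid_tp)
num_instances = NUM_GPUS // tp  # tp=1 -> 8 instances, tp=2 -> 4 instances

procs = []
for i in range(num_instances):
    env = os.environ.copy()
    # Give each instance its own disjoint slice of GPUs.
    gpu_ids = range(i * tp, (i + 1) * tp)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(map(str, gpu_ids))
    # Placeholder command; shard assignment and flags are hypothetical.
    procs.append(subprocess.Popen(
        ["python", "inference.py",
         "--tensor-parallel-size", str(tp),
         "--shard-id", str(i)],
        env=env,
    ))

for p in procs:
    p.wait()
```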