For now, I'm using a single-node configuration. The details are listed below:
2xx CPU cores (I think this is irrelevant :))
800 GB memory
8 × A100 GPUs
I run DeepSpeed inference with two commands (please ignore the -t and -p arguments):
deepspeed --num_gpus 4 test_deepspeed.py -t 256 -p test1.txt
deepspeed --num_gpus 8 test_deepspeed.py -t 256 -p test1.txt
In test_deepspeed.py, the relevant code looks like this:
import torch
import deepspeed
from transformers import AutoModelForCausalLM

def load_model(path_175b, num_gpus):
    model = AutoModelForCausalLM.from_pretrained(path_175b, torch_dtype=torch.float16)
    model = deepspeed.init_inference(
        model=model,                                              # Transformers model
        tensor_parallel={'enabled': True, 'tp_size': num_gpus},   # tensor-parallel degree (matches --num_gpus)
        dtype=torch.float16,                                      # weight dtype (fp16)
        replace_method="auto",
        replace_with_kernel_inject=True,                          # boolean, not the string "True"
    )
    return model
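For context, the engine is then driven from the launcher-spawned processes roughly like the sketch below; the prompt, tokenizer call, and generation settings are only illustrative (not my exact script), and path_175b stands in for the checkpoint path:

import os
import torch
from transformers import AutoTokenizer

path_175b = "..."                                    # placeholder: actual checkpoint path not shown here
local_rank = int(os.getenv("LOCAL_RANK", "0"))       # set by the deepspeed launcher for each process
world_size = int(os.getenv("WORLD_SIZE", "1"))       # number of processes spawned by the launcher

tokenizer = AutoTokenizer.from_pretrained(path_175b) # assumes a tokenizer is saved with the checkpoint
model = load_model(path_175b, num_gpus=world_size)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(f"cuda:{local_rank}")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32)  # generate() is forwarded to the wrapped model
print(tokenizer.decode(outputs[0], skip_special_tokens=True))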
DeepSpeed then tried to run my script with 4 or 8 processes (based on num_gpus), and eventually stopped abnormally due to lack of memory (each process consumed more than 200 GB).
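Here is my rough back-of-the-envelope estimate of why host memory runs out, assuming the model really has 175B parameters stored in fp16 (these numbers are estimates, not measurements):

# Rough estimate only; assumes 175e9 parameters at 2 bytes each (fp16).
params = 175e9
weights_gb = params * 2 / 1e9
print(f"full fp16 weights: ~{weights_gb:.0f} GB")                 # ~350 GB
print(f"per-GPU shard with tp_size=8: ~{weights_gb / 8:.0f} GB")  # ~44 GB, fits an 80 GB A100
# With the plain from_pretrained call above, every launcher process first builds the
# full model in host RAM, so 4 or 8 processes x ~350 GB far exceeds 800 GB.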
So I want to know: what machine configuration should I use? And for this script, what inference configuration should I use?
(Sorry for my poor written English.)