torch & accelerate version #17

Open
Hope7Happiness opened this issue Jan 9, 2025 · 0 comments

Hi, thank you for your great work!

I was running the code with torch 2.0.0 and accelerate 0.12.0, and encountered the following two errors:

  1. When running the first command in bash_scripts/run.sh:
accelerate launch --num_processes 1 train_flow_latent.py --exp celeb256_f8_adm \
     --dataset celeba_256 --datadir ../cnf_flow/data/celeba/celeba-lmdb \
     --batch_size 112 --num_epoch 500 \
     --image_size 256 --f 8 --num_in_channels 4 --num_out_channels 4 \
     --nf 256 --ch_mult 1 2 2 2 --attn_resolution 16 8 --num_res_blocks 2 \
     --lr 2e-5 --scale_factor 0.18215 \
     --save_content --save_content_every 10 \
     --use_origin_adm

it seems that the --f 8 argument makes the option ambiguous (my guess at the cause is sketched below, after the traceback of the second error), giving an error like this:

accelerate <command> [<args>] launch: error: ambiguous option: --f could match --fsdp_offload_params, --fsdp_min_num_params, --fsdp_sharding_strategy, --fsdp_auto_wrap_policy, --fsdp_transformer_layer_cls_to_wrap, --fsdp_backward_prefetch_policy, --fsdp_state_dict_type, --fp16
  2. When adding the option --use_ema, there is another error:
Traceback (most recent call last):
  File "/srv/username/LFM/train_flow_latent.py", line 342, in <module>
    train(args)
  File "/srv/username/LFM/train_flow_latent.py", line 93, in train
    data_loader, model, optimizer, scheduler = accelerator.prepare(data_loader, model, optimizer, scheduler)
  File "/srv/username/miniconda3/envs/LFMorig/lib/python3.10/site-packages/accelerate/accelerator.py", line 621, in prepare
    result = tuple(self._prepare_one(obj, first_pass=True) for obj in args)
  File "/srv/username/miniconda3/envs/LFMorig/lib/python3.10/site-packages/accelerate/accelerator.py", line 621, in <genexpr>
    result = tuple(self._prepare_one(obj, first_pass=True) for obj in args)
  File "/srv/username/miniconda3/envs/LFMorig/lib/python3.10/site-packages/accelerate/accelerator.py", line 520, in _prepare_one
    optimizer = self.prepare_optimizer(obj)
  File "/srv/username/miniconda3/envs/LFMorig/lib/python3.10/site-packages/accelerate/accelerator.py", line 854, in prepare_optimizer
    optimizer = AcceleratedOptimizer(optimizer, device_placement=self.device_placement, scaler=self.scaler)
  File "/srv/username/miniconda3/envs/LFMorig/lib/python3.10/site-packages/accelerate/optimizer.py", line 70, in __init__
    self.optimizer.load_state_dict(state_dict)
  File "/srv/username/LFM/EMA.py", line 66, in load_state_dict
    super(EMA, self).load_state_dict(state_dict)
  File "/srv/username/miniconda3/envs/LFMorig/lib/python3.10/site-packages/torch/optim/optimizer.py", line 433, in load_state_dict
    self.__setstate__({'state': state, 'param_groups': param_groups})
  File "/srv/username/miniconda3/envs/LFMorig/lib/python3.10/site-packages/torch/optim/optimizer.py", line 214, in __setstate__
    self.defaults.setdefault('differentiable', False)
AttributeError: 'EMA' object has no attribute 'defaults'
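
The last frame suggests torch 2.0's Optimizer.__setstate__ now reads self.defaults, which my EMA instance apparently never got. I have not dug into EMA.py, so this is only a guess, but if the class subclasses torch.optim.Optimizer while merely wrapping an inner optimizer (a common EMA-wrapper pattern), something like the last line below might be what is missing (a hypothetical sketch, not the repo's actual code):

from torch.optim import Optimizer

class EMAWrapper(Optimizer):  # hypothetical stand-in for the EMA class in EMA.py
    def __init__(self, inner_optimizer, ema_decay=0.9999):
        # Deliberately not calling Optimizer.__init__, mirroring the usual
        # EMA-wrapper pattern: just alias the inner optimizer's bookkeeping.
        self.ema_decay = ema_decay
        self.optimizer = inner_optimizer
        self.state = inner_optimizer.state
        self.param_groups = inner_optimizer.param_groups
        # Without the next line, torch 2.0's Optimizer.__setstate__ fails with
        # "AttributeError: ... has no attribute 'defaults'", since it calls
        # self.defaults.setdefault('differentiable', False).
        self.defaults = inner_optimizer.defaults

Pinning an older torch, where __setstate__ does not touch defaults, would presumably avoid this as well, hence my question about the versions.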
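
For the first error, my guess (again an assumption, I have not checked accelerate 0.12.0's launcher code) is that it comes from argparse's default prefix matching: --f looks like an abbreviation of several of the launcher's own --fsdp_*/--fp16 flags, so accelerate launch rejects it before it ever reaches train_flow_latent.py. A tiny standalone reproduction of that argparse behaviour (made-up flag subset, just for illustration):

import argparse

# allow_abbrev is True by default, so "--f" is treated as a possible
# abbreviation of every registered option starting with "f".
parser = argparse.ArgumentParser()
parser.add_argument("--fp16", action="store_true")
parser.add_argument("--fsdp_offload_params", action="store_true")
parser.add_argument("--fsdp_min_num_params", type=int, default=0)

# Fails with: "error: ambiguous option: --f could match
# --fp16, --fsdp_offload_params, --fsdp_min_num_params"
parser.parse_args(["--f", "8"])

If newer accelerate versions route the script arguments differently, that could explain why this does not show up in your setup.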

Could you please share the torch and accelerate versions you used for running the scripts? Thank you!
