does cogvideox-fun supports t2v finetuning? #109

Open
fenghe12 opened this issue Jan 9, 2025 · 1 comment

Comments

fenghe12 commented Jan 9, 2025

I get the following error if I simply change the training mode from "inpaint" to "normal":
RuntimeError: Given groups=1, weight of size [1920, 33, 2, 2], expected input[2, 16, 80, 80] to have 33 channels, but got 16 channels instead
Traceback (most recent call last):
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/code/CogVideoX-Fun/scripts/train.py", line 1706, in <module>
    main()
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/code/CogVideoX-Fun/scripts/train.py", line 1559, in main
    noise_pred = transformer3d(
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1523, in forward
    else self._run_ddp_forward(*inputs, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1359, in _run_ddp_forward
    return self.module(*inputs, **kwargs) # type: ignore[index]
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/accelerate/utils/operations.py", line 820, in forward
    return model_forward(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/accelerate/utils/operations.py", line 808, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/code/CogVideoX-Fun/cogvideox/models/transformer3d.py", line 474, in forward
    hidden_states = self.patch_embed(encoder_hidden_states, hidden_states)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/code/CogVideoX-Fun/cogvideox/models/transformer3d.py", line 67, in forward
    image_embeds = self.proj(image_embeds)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/mnt/pfs-mc0p4k/tts/team/digital_avatar_group/fenghe/conda_envs/easyanimate/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [1920, 33, 2, 2], expected input[6, 16, 48, 48] to have 33 channels, but got
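For anyone decoding the message: PyTorch lays out a `Conv2d` weight as `[out_channels, in_channels / groups, kH, kW]` and its input as `[batch, channels, height, width]`, so the error says the patch-embedding projection was built for 33 input channels while the tensor it received has 16. A minimal plain-Python sketch of that comparison, using the shapes from the message (the helper name `conv2d_channel_mismatch` is hypothetical and not part of the repository):

```python
def conv2d_channel_mismatch(weight_shape, input_shape, groups=1):
    """Return (expected, actual) input channels if they differ, else None.

    weight_shape: [out_channels, in_channels // groups, kH, kW]
    input_shape:  [batch, channels, height, width]
    """
    expected = weight_shape[1] * groups  # channels the layer was built for
    actual = input_shape[1]              # channels the tensor actually has
    if expected == actual:
        return None
    return expected, actual


# Shapes copied from the first RuntimeError line in this issue.
mismatch = conv2d_channel_mismatch([1920, 33, 2, 2], [2, 16, 80, 80])
print(mismatch)  # (33, 16)
```

The same check on a weight of shape `[1920, 16, 2, 2]` would return `None`, which is what a projection matching the 16-channel input would look like.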

fenghe12 commented Jan 9, 2025

It seems the model is still using the config from the "inpaint" training mode.
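One way to test that suspicion is to compare the input channel count stored in the loaded transformer config with what the chosen training mode feeds the model. The sketch below is plain Python and rests on assumptions not confirmed in this thread: that the config exposes an `in_channels` entry, and that the 33 channels of the inpaint checkpoint break down as 16 noisy latent channels + 1 mask channel + 16 masked-video latent channels. The function names are hypothetical.

```python
LATENT_CHANNELS = 16  # channel count of the input tensor reported in the error


def expected_in_channels(train_mode, latent_channels=LATENT_CHANNELS):
    """Channels the transformer input would have for a given training mode."""
    if train_mode == "inpaint":
        # ASSUMPTION: latents + 1 mask channel + masked-video latents.
        return latent_channels + 1 + latent_channels
    if train_mode == "normal":
        # Plain text-to-video: only the noisy latents are passed in.
        return latent_channels
    raise ValueError(f"unknown train_mode: {train_mode!r}")


def check_config(config, train_mode):
    """Return a warning string if the config and mode disagree, else None."""
    wanted = expected_in_channels(train_mode)
    found = config["in_channels"]  # ASSUMPTION: key name in the model config
    if found == wanted:
        return None
    return (
        f"config has in_channels={found}, "
        f"but train_mode={train_mode!r} feeds {wanted} channels"
    )


# A config with 33 input channels combined with "normal" mode
# reproduces the mismatch seen in the traceback.
print(check_config({"in_channels": 33}, "normal"))
```

If this check fires, switching the mode flag alone is not enough: the weights being loaded would also need a patch embedding built for 16 input channels. Whether the repository provides such a checkpoint is not answered in this thread.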

fenghe12 changed the title from "dose cogvideox-fun supports t2v finetuning?" to "does cogvideox-fun supports t2v finetuning?" on Jan 13, 2025