
When I use the GPU to train the model, I get the error 'CUDA out of memory...' #7

Open
RobinLu1209 opened this issue Nov 10, 2019 · 8 comments

Comments

@RobinLu1209

Is there anyone else who has this problem?

@Cestbo

Cestbo commented Nov 27, 2019

me too

@RobinLu1209
Author

> me too

I have solved this problem. I remember there is some code written like `to(device='cpu')`.
You can change that code to 'cuda' and try again. Maybe that will solve the problem.
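
For reference, a minimal sketch of the kind of device-placement code this comment seems to describe, assuming a PyTorch model; `model` and `batch` are hypothetical names, not taken from this repository:

```python
import torch
import torch.nn as nn

# Select the GPU when it is available; the comment above suggests replacing a
# hard-coded to(device='cpu') call with the CUDA device.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(16, 4).to(device)         # hypothetical model, moved to the device
batch = torch.randn(8, 16, device=device)   # hypothetical input tensor on the same device
output = model(batch)
```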

@Cestbo

Cestbo commented Nov 27, 2019

> Is there anyone else who has this problem?
>
> me too
>
> I have solved this problem. I remember there is some code written like `to(device='cpu')`. You can change that code to 'cuda' and try again. Maybe that will solve the problem.

Thank you, I will try. I always thought my GPU was just too weak.

@Noahprog

Nah, the problem is in the validation method! In training, 'batch_size' is used to prevent memory overload, but during validation the whole validation set is passed in at once. That is what blows up the memory.

@vsfh

vsfh commented Jul 9, 2020

> Nah, the problem is in the validation method! In training, 'batch_size' is used to prevent memory overload, but during validation the whole validation set is passed in at once. That is what blows up the memory.

Do you know how to fix it? Thanks.

@Noahprog

Noahprog commented Jul 9, 2020

Yes, but I don’t have time to do it for you. The solution is to rewrite the validation to also work with batches.

The training is already done this way, so it is a good example to follow.
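
For illustration, a minimal sketch of batched validation in PyTorch, along the lines of what is suggested above; `model`, `val_dataset`, and `criterion` are hypothetical names, and the actual training loop in this repository may differ:

```python
import torch
from torch.utils.data import DataLoader

def validate(model, val_dataset, criterion, device, batch_size=32):
    """Run validation in mini-batches so GPU memory stays bounded."""
    loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
    model.eval()
    total_loss, n_samples = 0.0, 0
    with torch.no_grad():  # no gradients are needed during validation
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, targets)
            total_loss += loss.item() * inputs.size(0)
            n_samples += inputs.size(0)
    return total_loss / max(n_samples, 1)
```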

@LiaoLW

LiaoLW commented Sep 6, 2020

> Nah, the problem is in the validation method! In training, 'batch_size' is used to prevent memory overload, but during validation the whole validation set is passed in at once. That is what blows up the memory.
>
> Do you know how to fix it? Thanks.

I changed the split line to about 0.9/0.95, and then my problem was solved.
[screenshot of the modified split line]

@Coolgiserz

I changed the dataset split to 0.8/0.1/0.1 and that solved the problem.
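
For illustration, a minimal sketch of changing the train/validation/test split ratios with `torch.utils.data.random_split`; the fractions and names are examples, not the repository's actual split code. Note that shrinking the validation split only works around the issue, whereas batching the validation (as suggested above) removes it:

```python
from torch.utils.data import random_split

def split_dataset(dataset, train_frac=0.8, val_frac=0.1):
    """Split a dataset into train/val/test subsets by fraction."""
    n = len(dataset)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    n_test = n - n_train - n_val  # the remainder goes to the test set
    return random_split(dataset, [n_train, n_val, n_test])
```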
