Is there such a thing as too much training data? #9377
Answered by ljvmiranda921
DarkSoliditi asked this question in Help: Model Advice
-
I'm working on an address model and have thousands of examples. I was going to label 1000 of them. Is that too much? Is there a general minimum amount of test data you should have?
Answered by ljvmiranda921 · Oct 6, 2021
-
Hi @DarkSoliditi, there's no fixed rule on the number of samples you need for training. However, keep the following in mind:

My suggestion is to try your 1000 examples first, check your data distribution (whether it leans towards a particular label or not), apply the usual cross-validation techniques, and then decide whether you still need more samples.
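In case it helps, here's a minimal sketch of the "check your data distribution" step, assuming the annotations are kept in the classic spaCy `(text, {"entities": [...]})` tuple format; the example text and label names (`STREET`, `CITY`, `POSTCODE`) are made up for illustration:

```python
from collections import Counter

# Hypothetical training data in the usual spaCy tuple format.
TRAIN_DATA = [
    ("10 Main St, Springfield IL 62704",
     {"entities": [(0, 10, "STREET"), (12, 23, "CITY"), (27, 32, "POSTCODE")]}),
    # ... more labelled examples ...
]

def label_distribution(examples):
    """Count how often each entity label appears across the dataset."""
    counts = Counter()
    for _, annotations in examples:
        for _start, _end, label in annotations.get("entities", []):
            counts[label] += 1
    return counts

print(label_distribution(TRAIN_DATA))
# A heavily skewed Counter (e.g. 900 STREET spans vs. 20 POSTCODE spans)
# suggests you may need more examples for the under-represented labels.
```

If the counts come out roughly balanced and your cross-validation scores have plateaued, adding more labelled data is less likely to help than if one label dominates.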
Answer selected by DarkSoliditi