Using FDS/LDS with a custom model and data #3
Comments
Hi @YyzHarry, I have a regression dataset (a CSV file containing all numeric values). Please let me know whether https://github.com/YyzHarry/imbalanced-regression/tree/main/imdb-wiki-dir will work for my requirements. Thanks!
Hi @ttsesm Yes - that could serve as an example!
Hi @snigdhasen Yes, I believe that is a complete codebase; you might only need to modify the data loading part (and maybe the network you choose to use).
@YyzHarry I found some time, so I went through the paper and your blog post as well as the links you pointed me to, but I still do not get how to apply the LDS/FDS distribution smoothing in practice. I would appreciate it if you could give a step-by-step guide on how this is done; I think this would be helpful for others as well. For example, in my case I have a dataset of point clouds where for each point I have a set of feature vectors, e.g.:
Now I want to regress the values of column 4, but these values are imbalanced and can vary over the range 0-10000. For the sample above, for example, I have split my values into groups. Now the question is how to apply LDS/FDS based on the values in column 4. Is this done before you load the data in the data loader, or afterwards, while you are applying the training/testing? Thanks. P.S. I also attach an example of a point cloud with the corresponding complete feature vectors, in case it is useful.
Hi @YyzHarry, any feedback regarding my question above, and possibly a step-by-step guide on how to apply LDS/FDS?
@ttsesm Sorry for the late reply!
This is done after you load the data. For LDS, you basically first get the histogram of the labels, as you show here, and then apply smoothing to estimate an "effective" density. After this, LDS is typically used with loss re-weighting --- you have a weight for each sample to balance the loss. In our case, the implementation of the aforementioned steps can be found here. For FDS, it is done during training --- it is just a module like BatchNorm, inserted into your neural network (see example here). After each training epoch, you update the running statistics and the smoothed statistics (example here). FDS does not depend on how your label is distributed (it does not need the histogram for computation), but you need to first define the number of bins (see the initialization of the FDS module here). Hope these help. Let me know if you have further questions.
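As a rough illustration of the FDS workflow described above, here is a simplified per-bin feature-calibration module in PyTorch. The class name `FeatureSmoother`, the hyper-parameters, and the use of SciPy's `gaussian_filter1d` are illustrative choices for this sketch, not the repo's exact FDS API:

```python
import torch
import torch.nn as nn
from scipy.ndimage import gaussian_filter1d


class FeatureSmoother(nn.Module):
    """Simplified FDS-style calibration: per-label-bin feature statistics,
    smoothed across neighbouring bins with a Gaussian kernel."""

    def __init__(self, feat_dim, n_bins=100, sigma=2.0, momentum=0.9):
        super().__init__()
        self.n_bins, self.sigma, self.momentum = n_bins, sigma, momentum
        self.register_buffer("running_mean", torch.zeros(n_bins, feat_dim))
        self.register_buffer("running_var", torch.ones(n_bins, feat_dim))
        self.register_buffer("smooth_mean", torch.zeros(n_bins, feat_dim))
        self.register_buffer("smooth_var", torch.ones(n_bins, feat_dim))

    @torch.no_grad()
    def update_stats(self, feats, bin_idx):
        # Called once per epoch with (detached) features and label bins
        # collected over that epoch.
        for b in range(self.n_bins):
            sel = feats[bin_idx == b]
            if len(sel) > 1:
                m = self.momentum
                self.running_mean[b] = m * self.running_mean[b] + (1 - m) * sel.mean(0)
                self.running_var[b] = m * self.running_var[b] + (1 - m) * sel.var(0)
        # Smooth the per-bin statistics across the label axis.
        self.smooth_mean.copy_(torch.from_numpy(
            gaussian_filter1d(self.running_mean.cpu().numpy(), sigma=self.sigma, axis=0)))
        self.smooth_var.copy_(torch.from_numpy(
            gaussian_filter1d(self.running_var.cpu().numpy(), sigma=self.sigma, axis=0)))

    def forward(self, feats, bin_idx):
        # Calibrate only during training (in practice, typically after a few warm-up epochs).
        if not self.training:
            return feats
        eps = 1e-6
        mean, var = self.running_mean[bin_idx], self.running_var[bin_idx]
        s_mean, s_var = self.smooth_mean[bin_idx], self.smooth_var[bin_idx]
        # Whiten with the bin's running stats, then re-colour with the smoothed stats.
        return (feats - mean) / (var + eps).sqrt() * (s_var + eps).sqrt() + s_mean
```

Usage would roughly be `feats = smoother(encoder(x), bin_idx)` in the forward pass, with `smoother.update_stats(...)` called at the end of each epoch on the features and label bins gathered during that epoch.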
@YyzHarry thanks a lot for the feedback, it was indeed helpful. So, as I understand it, with LDS you create a weight for each target (label) value, which you then use to balance the loss in a way like the following (this is also what I got from the pseudocode in the supplementary material of the paper; below I use L1Loss as an example):
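A minimal sketch of such LDS loss re-weighting with an L1 loss, assuming per-sample weights have already been derived from the smoothed label density (e.g. as in the toy example further below):

```python
import torch
import torch.nn.functional as F

def weighted_l1_loss(pred, target, weights):
    # Per-sample L1 loss, scaled by the LDS weight of each sample's label bin.
    loss = F.l1_loss(pred, target, reduction='none')
    return (loss * weights).mean()

# tiny demo with dummy tensors
pred, target = torch.randn(8), torch.randn(8)
weights = torch.ones(8)                      # would come from the LDS effective density
print(weighted_l1_loss(pred, target, weights))
```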
I played a bit with the LDS, based on the link that you provided, and created the following runnable toy example to obtain the weights:
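A runnable toy example along these lines, using a Gaussian kernel over a 0-10000 label range split into 100 equal-width bins (all of these choices are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

labels = np.random.exponential(scale=1500, size=5000)   # imbalanced targets, roughly in [0, 10000]
n_bins, lo, hi = 100, 0.0, 10000.0

# 1) empirical label density per bin
bin_idx = np.clip(((labels - lo) / (hi - lo) * n_bins).astype(int), 0, n_bins - 1)
emp_density = np.bincount(bin_idx, minlength=n_bins).astype(float)

# 2) LDS: smooth the empirical density with a symmetric (here Gaussian) kernel
eff_density = gaussian_filter1d(emp_density, sigma=2)

# 3) per-sample weights ~ inverse effective density, rescaled to mean 1
weights = 1.0 / np.maximum(eff_density[bin_idx], 1e-8)
weights = weights * len(weights) / weights.sum()
```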
which seems to work fine. I have a couple of questions though, to which I couldn't find the answer (or I might have overlooked it):
Actually, we use
These just provide more choices. In Appendix E.1 of our paper, we studied some choices of kernel types. Overall, they should give similar results, but some might be better in certain tasks.
The number is just based on the label distribution for the particular age dataset. Since the number of samples with age larger than 120 is very small, we can just aggregate them and assign the same weight. The reason is, as you said, that when applying re-weighting we do not want the weight to be too high and cause optimization issues.
Your understanding is correct. This is related to the above questions.
Yes, your understanding is correct. As for your case, it also depends on the minimum resolution you care about (i.e., the bin size). For age, the minimum resolution we care about is 1 year, so there are 100 bins if we consider ages from 0-99. If the minimum resolution that matters for you is 10, you could use 1000 bins (10000 / 10) accordingly. A smaller number of bins makes the statistics estimation more accurate, as more samples are considered in each bin. Again, the choice should depend on the task you are tackling.
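As a small illustration of the bin-size arithmetic above (the 0-10000 range and the resolution of 10 are taken from the earlier example):

```python
# map a continuous label to a bin index, given a chosen minimum resolution
label_min, label_max, resolution = 0.0, 10000.0, 10.0
n_bins = int((label_max - label_min) / resolution)   # 1000 bins

def bin_of(label):
    return min(int((label - label_min) / resolution), n_bins - 1)

print(n_bins, bin_of(0), bin_of(9999.9))   # 1000 0 999
```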
Hi @YyzHarry, thanks for the feedback and your time. I will try to play a bit with the different settings and I will let you know if I have any further questions. |
Hi @YyzHarry, thanks for your GitHub link.
Hi @snigdhasen It seems the loss is gradually decreasing (though very slowly). I guess the value in the parentheses is the average value of the MSE/L1 loss.
@YyzHarry Thanks. Yes, that's the average loss. But the MSE is too high, around 0.99. Can you suggest any customization to reduce the loss here? The L1 loss is OK, around 0.39.
Hi @YyzHarry,
Hi @zhaosongyi - this is an interesting point. In our work, we use a symmetric kernel since we assume the distance with respect to an anchor point in the target space should not depend on the sign (e.g., for age estimation, a 10-year-old and a 14-year-old have the same distance to a 12-year-old). Another implicit benefit of symmetric kernels is that they are theoretically guaranteed to make the distribution "smoother" (it has a lower Lipschitz constant). Going back to your case, when you apply a log transformation to the target labels (and if we still assume the distance for the original processing-time labels does not depend on the sign), I guess you might want to try an asymmetric kernel. A simple implementation with a Gaussian kernel could be a combination of two half Gaussians with different \sigma, where you have a larger \sigma for the left half and a smaller one for the right half.
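A minimal sketch of such an asymmetric kernel, built from two half Gaussians glued at the anchor point (the kernel size and the two \sigma values are illustrative):

```python
import numpy as np

def asymmetric_gaussian_kernel(ks=9, sigma_left=3.0, sigma_right=1.0):
    assert ks % 2 == 1, "use an odd kernel size so the anchor sits at the centre"
    half = ks // 2
    x = np.arange(-half, half + 1, dtype=float)
    # wider half-Gaussian on the left of the anchor, narrower on the right
    kernel = np.where(x <= 0,
                      np.exp(-0.5 * (x / sigma_left) ** 2),
                      np.exp(-0.5 * (x / sigma_right) ** 2))
    return kernel / kernel.sum()   # normalise so the smoothed density keeps its mass

# e.g. convolve the label histogram with this window instead of a symmetric one:
# eff_density = np.convolve(emp_density, asymmetric_gaussian_kernel(), mode='same')
```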
@YyzHarry Hi, I applied only LDS to my dataset, but I am not seeing any improvement in training or validation. Do I need to apply both FDS and LDS on a Boston-like dataset? @ttsesm if this method worked for you, please ping me at [email protected]
Hi @YyzHarry I have a question about using only LDS on my dataset; however, errors are always reported when I run it. I would like to know whether there are specific format requirements for the input data. I'm dealing with spatiotemporal data with longitude and latitude; I don't know if that will work?
I'm trying to run the train.py in the nyud2-dir directory you provided, and I'm getting negative weight values, which causes the final calculated loss to be negative as well. I would also like to ask what the meaning of TRAIN_BUCKET_NUM is. How is this value calculated?
Hi @YyzHarry,
I am trying to adapt the example from https://github.com/YyzHarry/imbalanced-regression/tree/main/agedb-dir to my custom model and data. Thus, I would like to ask whether this would be feasible and, if yes, whether there are any examples showing explicitly how to do that.
Thanks.