question about training #5
Comments
A1: Note that 1^50 = 1, while 0.9^50 ≈ 0.0052. The printf call outputs the result after a sample has gone through only one tree, so the per-tree figures compound over the whole cascade.
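The arithmetic in A1 can be checked with a short sketch. It assumes, purely for illustration, that every tree passes the same fraction of samples independently; the function name cascadeRate is hypothetical and does not come from the repository.

```cpp
#include <cassert>
#include <cmath>

// Fraction of samples that survives a cascade of `num_trees` trees when each
// tree passes a fraction `per_tree_rate` of what reaches it.
// Independence between trees is an assumption made for this illustration.
double cascadeRate(double per_tree_rate, int num_trees) {
    assert(per_tree_rate >= 0.0 && per_tree_rate <= 1.0 && num_trees >= 0);
    return std::pow(per_tree_rate, num_trees);
}
```

With a per-tree recall of 1 and a per-tree false positive rate of 0.9, cascadeRate(1.0, 50) stays at 1 while cascadeRate(0.9, 50) drops to about 0.0052, which is the point of the reply above.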
Another point I wonder about: why do the last two fields of fea[6] use two random landmarks? That looks like global learning under a global shape constraint. I think this loses the meaning of "local learning"; I suggest using only random offset pairs near a single landmark.
In the paper, it says that
I found a possible problem in the "perform hard sample" step and wonder whether the process is right; it needs your check. A negative sample whose score is higher than RandomForest_[0].rfs_[0][0] is kept as a hard negative sample. This may not be right. Can you check it? Thanks.
if (current_fi[n]<rfs_[i][j].threshold && ground_truth_faces[n]==-1), then the negative sample is removed. That is what you described. RandomForest_[0].rfs_[0][0] cannot exclude negative samples with a high score. Maybe I have not understood your question clearly.
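As a minimal sketch of the removal rule quoted above: a sample is dropped only when it is a negative (label -1) and its current score falls below the tree's threshold. The function name and the flat vectors are assumptions for illustration, not the repository's data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the indices of the samples that survive one tree.
// score[i] plays the role of current_fi[n], label[i] of ground_truth_faces[n].
// Only negatives (label == -1) scoring below the threshold are removed.
std::vector<std::size_t> keepAfterTree(const std::vector<double>& score,
                                       const std::vector<int>& label,
                                       double threshold) {
    assert(score.size() == label.size());
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < score.size(); ++i) {
        const bool rejected_negative = (label[i] == -1) && (score[i] < threshold);
        if (!rejected_negative) kept.push_back(i);
    }
    return kept;
}
```

Negatives whose score stays at or above the threshold remain in the set; these are the "hard" negatives the thread is discussing.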
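Let me explain this in Chinese instead. if (current_fi[n]<rfs_[i][j].threshold && ground_truth_faces[n]==-1) removes negative samples that fail the check. However, when a new random sample is generated to replace such a negative, the decision logic seems wrong: as the new random negative is passed through RandomForest_[s].rfs_[r][t] one tree at a time, it is saved as a replacement negative as soon as if (tmp_fi>=RandomForest_[s].rfs_[r][t].threshold) holds. In other words, with n weak classifiers trained so far, the new random sample is checked against them one by one, and you stop the first time it satisfies tmp_fi>=RandomForest_[s].rfs_[r][t].threshold and use it to replace the current negative sample, instead of testing whether it passes all n weak classifiers. I am not sure whether I have this right.
if (current_fi[n]>=RandomForest_[s].rfs_[r][t].threshold), the negative samples that were not excluded earlier remain.
"You stop the first time the sample satisfies tmp_fi>=RandomForest_[s].rfs_[r][t].threshold and use it to replace the current negative sample, instead of testing whether it passes all n weak classifiers"
"You stop the first time the sample satisfies tmp_fi>=RandomForest_[s].rfs_[r][t].threshold and use it to replace the current negative sample, instead of testing whether it passes all n weak classifiers" This is what your code does. I think it should check whether the random sample passes all n weak classifiers, and keep only the samples that do as replacements.
Yes, you are right. I will fix this shortly.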
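The difference between the two acceptance rules discussed above can be sketched as follows. This is an illustration under assumptions: the score is taken to accumulate across weak classifiers (as tmp_fi suggests), each classifier carries its own threshold, and the function names and flat vectors are hypothetical rather than taken from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// tree_score[k] : score contribution of weak classifier k for one candidate
// threshold[k]  : rejection threshold attached to weak classifier k
// Both vectors are illustrative inputs, not structures from the repository.

// Early accept, as described in the report: the candidate is taken as soon
// as the accumulated score meets one threshold.
bool acceptsOnFirstPass(const std::vector<double>& tree_score,
                        const std::vector<double>& threshold) {
    assert(tree_score.size() == threshold.size());
    double fi = 0.0;
    for (std::size_t k = 0; k < tree_score.size(); ++k) {
        fi += tree_score[k];
        if (fi >= threshold[k]) return true;
    }
    return false;
}

// Proposed check: the candidate is kept only if the accumulated score stays
// at or above every threshold. On success *final_score holds the total.
bool passesAll(const std::vector<double>& tree_score,
               const std::vector<double>& threshold,
               double* final_score) {
    assert(tree_score.size() == threshold.size());
    double fi = 0.0;
    for (std::size_t k = 0; k < tree_score.size(); ++k) {
        fi += tree_score[k];
        if (fi < threshold[k]) return false;  // rejected by classifier k
    }
    *final_score = fi;
    return true;
}
```

A candidate with contributions {0.5, -1.0, 0.2} against thresholds of 0 is accepted by the early rule after the first classifier but rejected by the second classifier under passesAll, so it would not count as a hard negative.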
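I have modified the code. Please check whether it is correct, thanks.
The main changes are in two places: (1) if (tmp_fi<RandomForest_[s].rfs_[r][t].threshold); (2) if (tmp_isface) {current_fi[n]=tmp_fi; current_weight[n]=exp(0.0-ground_truth_faces[n]*current_fi[n]); break;}. I tried this change; it seems to run short of negative samples very easily, and training becomes very slow. It may be necessary to iterate over the data to collect negative samples.
That is to be expected; finding negative samples is harder than before.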
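The weight update in change (2) above, exp(0.0-ground_truth_faces[n]*current_fi[n]), can be isolated in a small sketch. It assumes labels are +1 for faces and -1 for non-faces (only -1 appears explicitly in the thread); the function name sampleWeight is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Exponential sample weight exp(-y * f), mirroring the expression quoted in
// the thread. `label` is assumed to be +1 (face) or -1 (non-face).
double sampleWeight(int label, double score) {
    assert(label == 1 || label == -1);
    return std::exp(0.0 - label * score);
}
```

Under this form a freshly mined negative with a high score receives a large weight, since for label -1 the weight grows as exp(score).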
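OK, I will look into the other bug you mentioned.
to kensun0,
I ran a test with 10,000 positive samples and 240,000 negative samples; at stage=0 with 40 trees, the timings before and after the change are as follows. Here is a source of negative samples, containing roughly 8,000 non-face images, which I expanded 30-fold; see line 683 in LBFRegressor.cpp. The recall does not have to be set to 1, but then how do you guarantee the overall recall after 5,000 trees?
for (int k=0;k<10;++k): your original code uses a factor of 10, right? I followed your factor of 10: with 16,000 negative samples, expanding 10-fold gives 160,000. But it hangs; training gets stuck after a few minutes. Why does yours not hang?
I am not sure. First determine whether it is slow or actually stuck. If it is just slow, there is not much to say.
If it is stuck, check in particular whether it is stuck here: getRandomBox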
to kensun0,
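Looks like you have almost found the problem. Keep going ( ⊙ o ⊙ )!
The three nested for loops used when replenishing negative samples (the loops controlled by s, r, and t) seem to have a problem. Shouldn't the candidate first be checked against (cur_stage-1)*num_landmarks*num_trees trees to see whether its score passes (updating the shape after each stage), then against (cur_landmark_id-1)*num_trees trees, and finally against cur_tree_id trees?
Yes. Thank you.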
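The checking order proposed above can be sketched by enumerating the trees trained so far. This is an illustration only: it uses 0-based indices, treats cur_tree as the number of trees already trained for the current landmark, and the names TreeId and treesTrainedSoFar are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct TreeId { int stage; int landmark; int tree; };

// Lists every tree trained so far, in training order:
//   1. all trees of every completed stage,
//   2. all trees of the completed landmarks in the current stage,
//   3. the first `cur_tree` trees of the current landmark.
// Indices are 0-based here, which is an assumption of this sketch.
std::vector<TreeId> treesTrainedSoFar(int cur_stage, int cur_landmark, int cur_tree,
                                      int num_landmarks, int num_trees) {
    assert(cur_landmark < num_landmarks && cur_tree <= num_trees);
    std::vector<TreeId> order;
    for (int s = 0; s <= cur_stage; ++s) {
        const int last_landmark = (s < cur_stage) ? num_landmarks - 1 : cur_landmark;
        for (int r = 0; r <= last_landmark; ++r) {
            const bool is_current = (s == cur_stage && r == cur_landmark);
            const int tree_count = is_current ? cur_tree : num_trees;
            for (int t = 0; t < tree_count; ++t) order.push_back({s, r, t});
        }
        // After each completed stage the caller would also update the
        // candidate's shape before moving on, as noted in the thread.
    }
    return order;
}
```

A replacement negative would then be scored against these trees in this order and rejected at the first threshold it fails.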
Isn't the negative sample's shape updated here? Or did I misunderstand?
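This is a conversion between the box-relative coordinate system and the image coordinate system; it does not change the shape.
I changed how negative samples are generated, including the shape regression after each stage.
There are not enough negative samples. I think the only option is to obtain enough negatives from an external source and load them into memory.
With this approach, the speed of finding negative samples seems to be a problem. Still puzzling over it~
The main cause was that I previously drew 10 times as many windows on each negative sample, which used up negatives too quickly... The recent change no longer adds windows, and training speed is better, but I do not have enough negatives on hand, so the trained model has become slower at rejecting negatives.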
When I run the training program, the recall rate is always 1 and the false positive rate is always 0.9x. Question 1: is the false positive rate too high, even in later stages, so that the model has little ability to reject negative windows? Question 2: reading your code, I see that the threshold for splitting a tree node is set to the value of a random sample. Is this right? Question 3: the numbers of positive and negative samples are quite unbalanced, so why not perturb the mean shape? I hope for your reply, thanks!