
Could ASUKA be open-sourced? #1

Open
lawsonxwl opened this issue Apr 7, 2024 · 4 comments

Comments

@lawsonxwl

The dataset has already been open-sourced, so the algorithm should follow soon, right?

@Yikai-Wang
Owner

Thank you for your interest in our work. Typically, releasing the code and model requires an administrative process after our paper is officially published. Regrettably, our paper is not yet published, and I'll be graduating soon. So, although I'd like to release them, I can't promise if or when it will happen.

@lawsonxwl
Author

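Thanks for the patient reply. I previously tried to reproduce the paper, but I only managed to reproduce the VAE, not the MAE + SD part. I'd like to ask: why does p in the paper decay from 100% to 10%?
When p = 0.1, if the random mask happens to cover a foreground object, there is a 90% chance that the MAE input is masked as well. The background predicted by the MAE is then used as the prior while the foreground of the original image is the ground truth, which causes the misalignment described in Section 3.2 of the paper. Could we instead simply set p = 1, so that the MAE is never masked and only its reconstruction result is used as the prior? That way, however the mask falls, the MAE prior and the ground truth always correspond, and SD can learn the mapping from the MAE prior to the ground truth.

At inference time, we would use the MAE's clean prediction as the prior. Since the prior is clean, the generated image should in theory also be clean (no random objects). Would this work?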

@Yikai-Wang
Owner

This is a great question. When using MAE in ASUKA, we should consider at least the following gaps:

  1. There is a noticeable difference in behavior between the MAE reconstruction result (when all patches are known) and the MAE generation result (when only the unmasked patches are known).
  2. While the MAE result provides some guidance, it's not perfect. We shouldn't rely too heavily on it for guiding the SD, as doing so can lead to blurry generation results similar to the MAE result. It's somewhat challenging to control the guidance so that it focuses solely on the semantic region and not on the texture region.

Based on our empirical findings, consistently using p = 1 leads to inferior performance, so we opted not to include it in our paper. However, it's worth noting that our approach isn't necessarily the best way to utilize MAE. Exploring more effective ways to incorporate MAE guidance could be a promising avenue for future research.
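
For readers following the thread, here is a minimal sketch of the schedule under discussion. It is not taken from the ASUKA code, which is unreleased. It assumes p is the probability that the MAE sees the full image (reconstruction mode), as the question above reads it, and it assumes a linear decay, since the thread only states that p goes from 100% to 10%. The names `unmasked_prob` and `pick_mae_mode` are hypothetical.

```python
import random


def unmasked_prob(step: int, total_steps: int,
                  p_start: float = 1.0, p_end: float = 0.1) -> float:
    """Probability p that the MAE sees the full image at this training step.

    ASSUMPTION: linear decay. The thread only says p decays from 100% to 10%.
    """
    if total_steps <= 1:
        return p_end
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return p_start + (p_end - p_start) * frac


def pick_mae_mode(p: float, rng: random.Random) -> str:
    """Pick how the MAE prior is produced for one training sample.

    'reconstruct': all patches are visible to the MAE (chosen with probability p).
    'generate':    the masked patches are hidden from the MAE (probability 1 - p).
    """
    return "reconstruct" if rng.random() < p else "generate"


if __name__ == "__main__":
    rng = random.Random(0)
    total = 10_000
    for step in (0, 5_000, 9_999):
        p = unmasked_prob(step, total)
        print(step, round(p, 3), pick_mae_mode(p, rng))
```

Under these assumptions, fixing `p_start = p_end = 1.0` corresponds to the "always reconstruct" variant proposed in the question, which the reply above reports as performing worse in practice.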

@chl916185

I hope it gets open-sourced in the end. Very nice work!
@Yikai-Wang

3 participants