
Feature Importance #282

Open
xnuohz opened this issue Dec 9, 2023 · 5 comments

@xnuohz
Contributor

xnuohz commented Dec 9, 2023

Feature

Support feature importance in tabular data scenarios.

  1. Understand which features are beneficial for prediction, which helps with developing new features.
  2. Feature selection: remove features that do not help prediction.

Ideas

  1. GBDTs already expose APIs for computing feature importance, so this is easy to add.
  2. NNs
    • Permutation. After shuffling a given feature, observe the change in the metric; the larger the change, the more important the feature. Simple.
    • SHAP. More complex.
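The permutation idea above can be sketched with the standard library alone. The data and the stand-in "model" below are made up purely for illustration, not pytorch-frame code:

```python
import random

random.seed(0)

# Toy dataset: feature 0 determines the label, feature 1 is noise.
X = [[random.random(), random.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

def model(row):
    # Stand-in for a trained model: thresholds feature 0, ignores feature 1.
    return 1 if row[0] > 0.5 else 0

def accuracy(data, labels):
    return sum(model(r) == t for r, t in zip(data, labels)) / len(labels)

baseline = accuracy(X, y)

def permutation_importance(feature_idx):
    # Shuffle one column and measure how much the metric drops.
    col = [row[feature_idx] for row in X]
    random.shuffle(col)
    X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:]
              for row, v in zip(X, col)]
    return baseline - accuracy(X_perm, y)

importances = [permutation_importance(j) for j in range(2)]
```

Shuffling feature 0 destroys most of the signal, so its importance is large; shuffling feature 1 changes nothing, so its importance is zero.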
@yiweny
Contributor

yiweny commented Dec 9, 2023

Mutual Information Sort is already added here.
For feature sorting in NNs, I recommend you take a look at the ExcelFormer example.
If you are interested in adding any feature-related functionality, you can add it to the transform module.
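For intuition, what a mutual-information-based sort computes can be sketched in plain Python. The feature/label sequences below are toy data for illustration, not the pytorch-frame implementation:

```python
from collections import Counter
from math import log

def mutual_information(xs, ys):
    """I(X; Y) in nats for two discrete sequences of equal length."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

# Binary labels, one perfectly predictive feature, one constant feature.
labels      = [0, 1, 0, 1, 0, 1, 0, 1]
informative = list(labels)       # copies the label exactly
constant    = [0] * len(labels)  # carries no information

scores = {"informative": mutual_information(informative, labels),
          "constant": mutual_information(constant, labels)}
# Sorting features by score descending gives the mutual-information order.
ranked = sorted(scores, key=scores.get, reverse=True)
```

The constant feature scores 0, while a feature identical to a uniform binary label scores log 2 nats (the label entropy), so it ranks first.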

@xnuohz
Contributor Author

xnuohz commented Dec 10, 2023

Thanks. As you mentioned, mutual information sorting and ExcelFormer improve performance through transformation capabilities. However, what I want to discuss is how much different features contribute to the final prediction. For example, user behavioral features are important in recommender systems, so their feature importance should be high. pytorch-frame is good to use: it lets me quickly obtain benchmark results on real-world datasets and decide whether NNs or GBDTs work better. I'm unsure whether functionality for evaluating feature importance is worth integrating into pytorch-frame as a module.

@zechengz
Member

zechengz commented Dec 11, 2023

I think you can give Captum (https://captum.ai/) a try.
cc @weihua916: can we also integrate this in PyT?

@xnuohz
Contributor Author

xnuohz commented Dec 11, 2023

Yes, Captum implements many interpretability methods; Feature Permutation and SHAP are among them.

@February24-Lee
Contributor

Is there any update or roadmap related to it? 👀


No branches or pull requests

6 participants