Complexity-Driven Feature Construction

Feature engineering is a critical but time-consuming task in machine learning. In particular, in cases where raw features can be transformed and combined into new features, the search space is exponentially large. Existing feature selection methods try to identify the best representations. However, the selected feature representations are often very complex, hard to understand, and might suffer from overfitting. Therefore, we propose a system that leverages feature set complexity to prune the huge feature search space. Preliminary experiments show that our system generates representations that are less complex, yield higher classification accuracy, and generalize better to unseen data than current state-of-the-art feature selection and construction methods.
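
The idea of using complexity to prune the search space can be illustrated with a small, self-contained sketch. This is not the system's actual algorithm. It assumes a hypothetical complexity measure (number of nodes in the expression tree), a hypothetical operator set (`+` and `*`), and absolute Pearson correlation as a stand-in for a real model-based score. The names `Feature`, `construct_features`, `max_complexity`, and `min_gain` are all illustrative.

```python
import itertools
import math
from typing import List, NamedTuple


class Feature(NamedTuple):
    name: str
    complexity: int      # assumed measure: nodes in the expression tree
    values: List[float]
    score: float


# Hypothetical operator set used to combine two features.
OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def score(values, target):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(values)
    mv, mt = sum(values) / n, sum(target) / n
    cov = sum((v - mv) * (t - mt) for v, t in zip(values, target))
    var_v = sum((v - mv) ** 2 for v in values)
    var_t = sum((t - mt) ** 2 for t in target)
    if var_v == 0 or var_t == 0:
        return 0.0
    return abs(cov) / math.sqrt(var_v * var_t)


def construct_features(raw, target, max_complexity=3, min_gain=0.01):
    """Combine features pairwise, pruning by complexity and by score gain."""
    kept = [Feature(n, 1, list(v), score(v, target)) for n, v in raw.items()]
    seen = {f.name for f in kept}
    changed = True
    while changed:
        changed = False
        for a, b in itertools.combinations(list(kept), 2):
            complexity = a.complexity + b.complexity + 1
            if complexity > max_complexity:
                continue  # prune: candidate would be too complex
            for symbol, op in OPS.items():
                name = f"({a.name} {symbol} {b.name})"
                if name in seen:
                    continue
                seen.add(name)
                values = [op(x, y) for x, y in zip(a.values, b.values)]
                s = score(values, target)
                # Keep a candidate only if it clearly beats both parents.
                if s >= max(a.score, b.score) + min_gain:
                    kept.append(Feature(name, complexity, values, s))
                    changed = True
    # Best score first; ties are broken in favour of lower complexity.
    return sorted(kept, key=lambda f: (-f.score, f.complexity))
```

Calling `construct_features({"x1": [...], "x2": [...]}, target)` returns the raw features plus any constructed feature that stayed within the complexity bound and improved on its parents. Raising `max_complexity` widens the search at the cost of more candidates to evaluate.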

Using our system

To run the experiments, first set the paths in a configuration file named after your machine. Examples can be found here: ~/new_project/fastsklearnfeature/configuration/resources
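
As a quick check that a configuration file exists for the current machine, something like the following sketch may help. It assumes that "the name of your machine" means the hostname reported by `socket.gethostname()` and that the file is named after it, with or without an extension. The helper `find_machine_config` is hypothetical and not part of the project, and the project's own lookup logic may differ.

```python
import socket
from pathlib import Path


def find_machine_config(resources_dir, hostname=None):
    """Return the config file named after this machine, or None.

    Assumption: the file name (or its stem) equals the hostname.
    """
    hostname = hostname or socket.gethostname()
    resources = Path(resources_dir).expanduser()
    if not resources.is_dir():
        return None
    matches = sorted(
        p for p in resources.iterdir()
        if p.is_file() and (p.name == hostname or p.stem == hostname)
    )
    return matches[0] if matches else None


if __name__ == "__main__":
    path = find_machine_config(
        "~/new_project/fastsklearnfeature/configuration/resources"
    )
    print(path if path else "No configuration file found for this machine.")
```

If nothing is found, copying one of the example files in that directory and renaming it after your machine is a reasonable starting point.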

We provide a small Jupyter notebook as an example: Example Notebook

Setup

cd new_project/
python3.7 -m pip install .

Experiments

We have already applied our system to the following datasets: Blood Transfusion Service Center, Banknote Authentication, Ecoli, Statlog (Heart), German Credit, and House Prices.

esmailoghli/Complexity-Driven-Feature-Construction
