We are actively accepting code contributions to the AutoGluon project. If you are interested in contributing to AutoGluon, please read the Contributing Guide to get started.

<<<<<<< HEAD

AutoML for Text, Image, and Tabular Data

AutoGluon automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications. With just a few lines of code, you can train and deploy high-accuracy machine learning and deep learning models on text, image, and tabular data.

Example

# First install package from terminal:
# python3 -m pip install -U pip
# python3 -m pip install -U setuptools wheel
# python3 -m pip install -U "mxnet<2.0.0"
# python3 -m pip install autogluon  # autogluon==0.2.0

from autogluon.tabular import TabularDataset, TabularPredictor
train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
test_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')
predictor = TabularPredictor(label='class').fit(train_data, time_limit=120)  # Fit models for 120s
leaderboard = predictor.leaderboard(test_data)

AutoGluon Task	Quickstart	API
TabularPredictor
TextPredictor
ImagePredictor
ObjectDetector

News

Announcement for previous users: The AutoGluon codebase has been modularized into namespace packages, which means you now only need those dependencies relevant to your prediction task of interest! For example, you can now work with tabular data without having to install dependencies required for AutoGluon's computer vision tasks (and vice versa). Unfortunately this improvement required a minor API change (eg. instead of from autogluon import TabularPrediction, you should now do: from autogluon.tabular import TabularPredictor), for all versions newer than v0.0.15. Documentation/tutorials under the old API may still be viewed for version 0.0.15 which is the last released version under the old API.

Resources

See the AutoGluon Website for documentation and instructions on:

Installing AutoGluon
Learning with tabular data
- Tips to maximize accuracy (if benchmarking, make sure to run fit() with argument presets='best_quality').
Learning with text data
Learning with image data
More advanced topics such as Neural Architecture Search

Scientific Publications

AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data (Arxiv, 2020)

Articles

AutoGluon for tabular data: 3 lines of code to achieve top 1% in Kaggle competitions (AWS Open Source Blog, Mar 2020)
Accurate image classification in 3 lines of code with AutoGluon (Medium, Feb 2020)
AutoGluon overview & example applications (Towards Data Science, Dec 2019)

Hands-on Tutorials

Train/Deploy AutoGluon in the Cloud

Citing AutoGluon

If you use AutoGluon in a scientific publication, please cite the following paper:

Erickson, Nick, et al. "AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data." arXiv preprint arXiv:2003.06505 (2020).

BibTeX entry:

@article{agtabular,
  title={AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data},
  author={Erickson, Nick and Mueller, Jonas and Shirkov, Alexander and Zhang, Hang and Larroy, Pedro and Li, Mu and Smola, Alexander},
  journal={arXiv preprint arXiv:2003.06505},
  year={2020}
}

AutoGluon for Hyperparameter and Neural Architecture Search (HNAS)

AutoGluon also provides state-of-the-art tools for neural hyperparameter and architecture search, such as for example ASHA, Hyperband, Bayesian Optimization and BOHB. To get started, checkout the following resources

Also have a look at our paper "Model-based Asynchronous Hyperparameter and Neural Architecture Search" arXiv preprint arXiv:2003.10865 (2020).

@article{abohb,
  title={Model-based Asynchronous Hyperparameter and Neural Architecture Search},
  author={Klein, Aaron and Tiao, Louis and Lienart, Thibaut and Archambeau, Cedric and Seeger, Matthias},
  journal={arXiv preprint arXiv:2003.10865},
  year={2020}
}

AutoGluon for Constrained Hyperparameter Optimization

AutoGluon includes an algorithm for constrained hyperparameter optimization. Check out our paper applying it to optimize model performance under fairness constraints: "Fair Bayesian Optimization", AIES (2021).

@article{fairbo,
  title={Fair Bayesian Optimization},
  author={Perrone, Valerio and Donini, Michele and Zafar, Bilal Muhammad and Schmucker, Robin and Kenthapadi, Krishnaram and Archambeau, Cédric},
  journal={AIES},
  year={2021}
}

License

This library is licensed under the Apache 2.0 License.

Contributing to AutoGluon

We are actively accepting code contributions to the AutoGluon project. If you are interested in contributing to AutoGluon, please read the Contributing Guide to get started.

DeepInsight

This repository contains the original MatLab code for DeepInsight as described in the paper DeepInsight: A methodology to transform a non-image data to an image for convolution neural network architecture.

pyDeepInsight

This package provides a python version of the image transformation procedure of DeepInsight. This is not guaranteed to give the same results as the published MatLab code and should be considered experimental.

Installation

python3 -m pip -q install git+git://github.com/alok-ai-lab/DeepInsight.git#egg=DeepInsight

Usage

The following is a walkthrough of standard usage of the ImageTransformer class

from pyDeepInsight import ImageTransformer, LogScaler
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np

from matplotlib import pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns

Load example TCGA data

expr_file = r"./examples/data/tcga.rnaseq_fpkm_uq.example.txt.gz"
expr = pd.read_csv(expr_file, sep="\t")
y = expr['project'].values
X = expr.iloc[:, 1:].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=23, stratify=y)
X_train.shape

(480, 5000)

Normalize data to values between 0 and 1. The following normalization procedure is described in the DeepInsight paper supplementary information as norm-2.

ln = LogScaler()
X_train_norm = ln.fit_transform(X_train)
X_test_norm = ln.transform(X_test)

Initialize image transformer. There are three built-in feature extraction options, 'tsne', 'pca', and 'kpca' to align with the original MatLab implementation.

it = ImageTransformer(feature_extractor='tsne', 
                      pixels=50, random_state=1701, 
                      n_jobs=-1)

Alternatively, any class instance with method .fit_transform() that returns a 2-dimensional array of extracted features can also be provided to the ImageTransformer class. This allows for customization of the feature extraction procedure.

from sklearn.manifold import TSNE

tsne = TSNE(n_components=2, perplexity=30, metric='cosine',
            random_state=1701, n_jobs=-1)

it = ImageTransformer(feature_extractor=tsne, pixels=50)

Train image transformer on training data. Setting plot=True results in at a plot showing the reduced features (blue points), convex full (red), and minimum bounding rectagle (green) prior to rotation.

plt.figure(figsize=(5, 5))
_ = it.fit(X_train_norm, plot=True)

The feature density matrix can be extracted from the trained transformer in order to view overall feature overlap.

fdm = it.feature_density_matrix()
fdm[fdm == 0] = np.nan

plt.figure(figsize=(10, 7))

ax = sns.heatmap(fdm, cmap="viridis", linewidths=0.01, 
                 linecolor="lightgrey", square=True)
ax.xaxis.set_major_locator(ticker.MultipleLocator(5))
ax.yaxis.set_major_locator(ticker.MultipleLocator(5))
for _, spine in ax.spines.items():
    spine.set_visible(True)
_ = plt.title("Genes per pixel")

It is possible to update the pixel size without retraining.

px_sizes = [25, (25, 50), 50, 100]

fig, ax = plt.subplots(1, len(px_sizes), figsize=(25, 7))
for ix, px in enumerate(px_sizes):
    it.pixels = px
    fdm = it.feature_density_matrix()
    fdm[fdm == 0] = np.nan
    cax = sns.heatmap(fdm, cmap="viridis", linewidth=0.01, 
                      linecolor="lightgrey", square=True, 
                      ax=ax[ix], cbar=False)
    cax.set_title('Dim {} x {}'.format(*it.pixels))
    for _, spine in cax.spines.items():
        spine.set_visible(True)
    cax.xaxis.set_major_locator(ticker.MultipleLocator(5))
    cax.yaxis.set_major_locator(ticker.MultipleLocator(5))
plt.tight_layout()    
    
it.pixels = 50

The trained transformer can then be used to transform sample data to image matricies.

X_train_img = it.transform(X_train_norm)

Fit and transform can be done in a single step.

X_train_img = it.fit_transform(X_train_norm)

Plotting the image matrices first four samples of the training set.

fig, ax = plt.subplots(1, 4, figsize=(25, 7))
for i in range(0,4):
    ax[i].imshow(X_train_img[i])
    ax[i].title.set_text("Train[{}] - class '{}'".format(i, y_train[i]))
plt.tight_layout()

Transforming the testing data is done the same as transforming the training data.

X_test_img = it.transform(X_test_norm)

fig, ax = plt.subplots(1, 4, figsize=(25, 7))
for i in range(0,4):
    ax[i].imshow(X_test_img[i])
    ax[i].title.set_text("Test[{}] - class '{}'".format(i, y_test[i]))
plt.tight_layout()

The image matrices can then be used as input for the CNN model.

DeepInsight/master

Name		Name	Last commit message	Last commit date
Latest commit History 777 Commits
.github		.github
README_files		README_files
autogluon		autogluon
core		core
docs		docs
examples		examples
extra		extra
features		features
forecasting		forecasting
matlab		matlab
mxnet		mxnet
pyDeepInsight		pyDeepInsight
tabular		tabular
tabular_to_image		tabular_to_image
text		text
vision		vision
.gitignore		.gitignore
.gitmodules		.gitmodules
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
Jenkinsfile		Jenkinsfile
LICENSE		LICENSE
NOTICE		NOTICE
README.ipynb		README.ipynb
README.md		README.md
VERSION		VERSION
full_install.sh		full_install.sh
requirements.txt		requirements.txt
setup.cfg		setup.cfg
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AutoML for Text, Image, and Tabular Data

Example

News

Resources

Scientific Publications

Articles

Hands-on Tutorials

Train/Deploy AutoGluon in the Cloud

Citing AutoGluon

AutoGluon for Hyperparameter and Neural Architecture Search (HNAS)

AutoGluon for Constrained Hyperparameter Optimization

License

Contributing to AutoGluon

We are actively accepting code contributions to the AutoGluon project. If you are interested in contributing to AutoGluon, please read the Contributing Guide to get started.

DeepInsight

pyDeepInsight

Installation

Usage

About

Releases

Packages

Languages

License

sarah127/autogluon

Folders and files

Latest commit

History

Repository files navigation

AutoML for Text, Image, and Tabular Data

Example

News

Resources

Scientific Publications

Articles

Hands-on Tutorials

Train/Deploy AutoGluon in the Cloud

Citing AutoGluon

AutoGluon for Hyperparameter and Neural Architecture Search (HNAS)

AutoGluon for Constrained Hyperparameter Optimization

License

Contributing to AutoGluon

We are actively accepting code contributions to the AutoGluon project. If you are interested in contributing to AutoGluon, please read the Contributing Guide to get started.

DeepInsight

pyDeepInsight

Installation

Usage

About

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages