add AE1SVM
yzhao062 committed Jul 2, 2024
1 parent a1d97df commit fbce11b
Showing 8 changed files with 80 additions and 30 deletions.
3 changes: 2 additions & 1 deletion CHANGES.txt
@@ -191,4 +191,5 @@ v<2.0.0>, <05/21/2024> -- Moving from TF to Torch -- implement dl base with more
 v<2.0.1>, <06/16/2024> -- Moving from TF to Torch -- reimplement DeepSVDD.
 v<2.0.1>, <06/17/2024> -- Moving from TF to Torch -- reimplement dl_base.
 v<2.0.1>, <06/21/2024> -- Moving from TF to Torch -- reimplement MO_GAAL.
-v<2.0.1>, <06/21/2024> -- Moving from TF to Torch -- reimplement AE and VAE.
+v<2.0.1>, <06/21/2024> -- Moving from TF to Torch -- reimplement AE and VAE.
+v<2.0.2>, <07/01/2024> -- Add AE1SVM.
3 changes: 3 additions & 0 deletions README.rst
@@ -408,6 +408,7 @@ Neural Networks MO_GAAL Multiple-Objective Generative Adversari
 Neural Networks DeepSVDD Deep One-Class Classification 2018 [#Ruff2018Deep]_
 Neural Networks AnoGAN Anomaly Detection with Generative Adversarial Networks 2017 [#Schlegl2017Unsupervised]_
 Neural Networks ALAD Adversarially learned anomaly detection 2018 [#Zenati2018Adversarially]_
+Neural Networks AE1SVM Autoencoder-based One-class Support Vector Machine 2019 [#Nguyen2019scalable]_
 Graph-based R-Graph Outlier detection by R-graph 2017 [#You2017Provable]_
 Graph-based LUNAR LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks 2022 [#Goodge2022Lunar]_
 =================== ================== ====================================================================================================== ===== ========================================
@@ -628,6 +629,8 @@ Reference
 .. [#Liu2019Generative] Liu, Y., Li, Z., Zhou, C., Jiang, Y., Sun, J., Wang, M. and He, X., 2019. Generative adversarial active learning for unsupervised outlier detection. *IEEE Transactions on Knowledge and Data Engineering*.
+.. [#Nguyen2019scalable] Nguyen, M.N. and Vien, N.A., 2019. Scalable and interpretable one-class svms with deep learning and random fourier features. In *Machine Learning and Knowledge Discovery in Databases: European Conference*, ECML PKDD, 2018.
 .. [#Papadimitriou2003LOCI] Papadimitriou, S., Kitagawa, H., Gibbons, P.B. and Faloutsos, C., 2003, March. LOCI: Fast outlier detection using the local correlation integral. In *ICDE '03*, pp. 315-326. IEEE.
 .. [#Pevny2016Loda] Pevný, T., 2016. Loda: Lightweight on-line detector of anomalies. *Machine Learning*, 102(2), pp.275-304.
5 changes: 3 additions & 2 deletions docs/index.rst
@@ -245,8 +245,9 @@ Neural Networks SO_GAAL Single-Objective Generative Adversarial A
 Neural Networks MO_GAAL Multiple-Objective Generative Adversarial Active Learning 2019 :class:`pyod.models.mo_gaal.MO_GAAL` :cite:`a-liu2019generative`
 Neural Networks DeepSVDD Deep One-Class Classification 2018 :class:`pyod.models.deep_svdd.DeepSVDD` :cite:`a-ruff2018deepsvdd`
 Neural Networks AnoGAN Anomaly Detection with Generative Adversarial Networks 2017 :class:`pyod.models.anogan.AnoGAN` :cite:`a-schlegl2017unsupervised`
-Neural Networks ALAD Adversarially learned anomaly detection 2018 :class:`pyod.models.alad.ALAD` :cite:`a-zenati2018adversarially`
-Graph-based R-Graph Outlier detection by R-graph 2017 :class:`pyod.models.rgraph.RGraph` :cite:`you2017provable`
+Neural Networks ALAD Adversarially learned anomaly detection 2018 :class:`pyod.models.alad.ALAD` :cite:`a-zenati2018adversarially`
+Neural Networks AE1SVM Autoencoder-based One-class Support Vector Machine 2019 :class:`pyod.models.ae1svm.AE1SVM` :cite:`a-nguyen2019scalable`
+Graph-based R-Graph Outlier detection by R-graph 2017 :class:`pyod.models.rgraph.RGraph` :cite:`a-you2017provable`
 Graph-based LUNAR LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks 2022 :class:`pyod.models.lunar.LUNAR` :cite:`a-goodge2022lunar`
 =================== ================ ====================================================================================================== ===== =================================================== ======================================================

12 changes: 12 additions & 0 deletions docs/pyod.models.rst
@@ -11,6 +11,18 @@ pyod.models.abod module
     :show-inheritance:
     :inherited-members:

+
+pyod.models.ae1svm module
+-------------------------
+
+.. automodule:: pyod.models.ae1svm
+    :members:
+    :exclude-members: RandomFourierFeatures, InnerAE1SVM, _train_autoencoder
+    :undoc-members:
+    :show-inheritance:
+    :inherited-members:
+
+
 pyod.models.alad module
 -----------------------

9 changes: 9 additions & 0 deletions docs/zreferences.bib
@@ -500,4 +500,13 @@ @article{xu2023dif
   number={},
   pages={1-14},
   doi={10.1109/TKDE.2023.3270293}
 }
+
+@inproceedings{nguyen2019scalable,
+  title={Scalable and interpretable one-class svms with deep learning and random fourier features},
+  author={Nguyen, Minh-Nghia and Vien, Ngo Anh},
+  booktitle={Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2018, Dublin, Ireland, September 10--14, 2018, Proceedings, Part I 18},
+  pages={157--172},
+  year={2019},
+  organization={Springer}
+}
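Reviewer note: the `nguyen2019scalable` entry above is the reference for the random Fourier features that `InnerAE1SVM` receives through `rff_dim` and `sigma`. As a rough illustration only, and not the code in `pyod/models/ae1svm.py`, the sketch below uses the standard random Fourier feature construction in plain Python to approximate an RBF kernel. The names `make_rff_map` and `rbf_kernel`, the seed, and the sample points are hypothetical.

```python
import math
import random


def make_rff_map(input_dim, rff_dim, sigma=1.0, seed=0):
    """Build a random feature map z with z(x).z(y) ~ exp(-||x-y||^2 / (2 sigma^2))."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 1 / sigma^2), phases from U(0, 2*pi).
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(input_dim)]
               for _ in range(rff_dim)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(rff_dim)]
    scale = math.sqrt(2.0 / rff_dim)

    def transform(x):
        # One cosine feature per random (frequency, phase) pair.
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, phases)]

    return transform


def rbf_kernel(x, y, sigma=1.0):
    """Exact RBF kernel value, used here only as a reference."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))


transform = make_rff_map(input_dim=2, rff_dim=5000, sigma=1.0)
x, y = [0.5, -0.2], [0.1, 0.4]
approx = sum(a * b for a, b in zip(transform(x), transform(y)))
exact = rbf_kernel(x, y)
print(f"approx={approx:.3f} exact={exact:.3f}")
```

The approximation error shrinks roughly as one over the square root of `rff_dim`, which is presumably why the detector exposes a large default such as `kernel_approx_features=1000`.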
3 changes: 0 additions & 3 deletions examples/ae1svm_example.py
@@ -3,8 +3,6 @@
 """
 # Author: Zhuo Xiao <[email protected]>

-from __future__ import division, print_function
-
 import os
 import sys

@@ -17,7 +15,6 @@
 from pyod.utils.data import generate_data
 from pyod.utils.data import evaluate_print

-
 if __name__ == "__main__":
     contamination = 0.1 # percentage of outliers
     n_train = 20000 # number of training points
53 changes: 36 additions & 17 deletions pyod/models/ae1svm.py
@@ -4,7 +4,6 @@
 """
 # Author: Zhuo Xiao <[email protected]>

-from __future__ import division, print_function

 import numpy as np
 import torch
@@ -37,7 +36,8 @@ def __getitem__(self, idx):


 class InnerAE1SVM(nn.Module):
-    def __init__(self, n_features, encoding_dim, rff_dim, sigma=1.0, hidden_neurons=(128, 64),
+    def __init__(self, n_features, encoding_dim, rff_dim, sigma=1.0,
+                 hidden_neurons=(128, 64),
                  dropout_rate=0.2, batch_norm=True, hidden_activation='relu'):
         super(InnerAE1SVM, self).__init__()

@@ -52,22 +52,27 @@ def __init__(self, n_features, encoding_dim, rff_dim, sigma=1.0, hidden_neurons=

         for idx in range(len(layers_neurons_encoder) - 1):
             self.encoder.add_module(f"linear{idx}",
-                                    nn.Linear(layers_neurons_encoder[idx], layers_neurons_encoder[idx + 1]))
+                                    nn.Linear(layers_neurons_encoder[idx],
+                                              layers_neurons_encoder[idx + 1]))
             if batch_norm:
-                self.encoder.add_module(f"batch_norm{idx}", nn.BatchNorm1d(layers_neurons_encoder[idx + 1]))
+                self.encoder.add_module(f"batch_norm{idx}", nn.BatchNorm1d(
+                    layers_neurons_encoder[idx + 1]))
             self.encoder.add_module(f"activation{idx}", activation)
             self.encoder.add_module(f"dropout{idx}", nn.Dropout(dropout_rate))

         layers_neurons_decoder = layers_neurons_encoder[::-1]

         for idx in range(len(layers_neurons_decoder) - 1):
             self.decoder.add_module(f"linear{idx}",
-                                    nn.Linear(layers_neurons_decoder[idx], layers_neurons_decoder[idx + 1]))
+                                    nn.Linear(layers_neurons_decoder[idx],
+                                              layers_neurons_decoder[idx + 1]))
             if batch_norm and idx < len(layers_neurons_decoder) - 2:
-                self.decoder.add_module(f"batch_norm{idx}", nn.BatchNorm1d(layers_neurons_decoder[idx + 1]))
+                self.decoder.add_module(f"batch_norm{idx}", nn.BatchNorm1d(
+                    layers_neurons_decoder[idx + 1]))
             self.decoder.add_module(f"activation{idx}", activation)
             if idx < len(layers_neurons_decoder) - 2:
-                self.decoder.add_module(f"dropout{idx}", nn.Dropout(dropout_rate))
+                self.decoder.add_module(f"dropout{idx}",
+                                        nn.Dropout(dropout_rate))

     def forward(self, x):
         x = self.encoder(x)
@@ -96,7 +101,8 @@ class AE1SVM(BaseDetector):
     def __init__(self, hidden_neurons=None, hidden_activation='relu',
                  batch_norm=True, learning_rate=1e-3, epochs=50, batch_size=32,
                  dropout_rate=0.2, weight_decay=1e-5, preprocessing=True,
-                 loss_fn=None, contamination=0.1, alpha=1.0, sigma=1.0, nu=0.1, kernel_approx_features=1000):
+                 loss_fn=None, contamination=0.1, alpha=1.0, sigma=1.0, nu=0.1,
+                 kernel_approx_features=1000):
         super(AE1SVM, self).__init__(contamination=contamination)

         self.model = None
@@ -133,11 +139,16 @@ def fit(self, X, y=None):
         else:
             train_set = PyODDataset(X=X)

-        train_loader = torch.utils.data.DataLoader(train_set, batch_size=self.batch_size, shuffle=True)
-        self.model = InnerAE1SVM(n_features=n_features, encoding_dim=32, rff_dim=self.kernel_approx_features,
+        train_loader = torch.utils.data.DataLoader(train_set,
+                                                   batch_size=self.batch_size,
+                                                   shuffle=True)
+        self.model = InnerAE1SVM(n_features=n_features, encoding_dim=32,
+                                 rff_dim=self.kernel_approx_features,
                                  sigma=self.sigma,
-                                 hidden_neurons=self.hidden_neurons, dropout_rate=self.dropout_rate,
-                                 batch_norm=self.batch_norm, hidden_activation=self.hidden_activation)
+                                 hidden_neurons=self.hidden_neurons,
+                                 dropout_rate=self.dropout_rate,
+                                 batch_norm=self.batch_norm,
+                                 hidden_activation=self.hidden_activation)
         self.model = self.model.to(self.device)
         self._train_autoencoder(train_loader)

@@ -151,7 +162,9 @@ def fit(self, X, y=None):
         return self

     def _train_autoencoder(self, train_loader):
-        optimizer = torch.optim.Adam(self.model.parameters(), lr=self.learning_rate, weight_decay=self.weight_decay)
+        optimizer = torch.optim.Adam(self.model.parameters(),
+                                     lr=self.learning_rate,
+                                     weight_decay=self.weight_decay)
         self.best_loss = float('inf')
         self.best_model_dict = None

@@ -170,7 +183,8 @@ def _train_autoencoder(self, train_loader):
                 optimizer.step()
                 overall_loss.append(loss.item())
             if (epoch + 1) % 10 == 0:
-                print(f'Epoch {epoch + 1}/{self.epochs}, Loss: {np.mean(overall_loss)}')
+                print(
+                    f'Epoch {epoch + 1}/{self.epochs}, Loss: {np.mean(overall_loss)}')

             if np.mean(overall_loss) < self.best_loss:
                 self.best_loss = np.mean(overall_loss)
@@ -179,15 +193,20 @@ def _train_autoencoder(self, train_loader):
     def decision_function(self, X):
         check_is_fitted(self, ['model', 'best_model_dict'])
         X = check_array(X)
-        dataset = PyODDataset(X=X, mean=self.mean, std=self.std) if self.preprocessing else PyODDataset(X=X)
-        dataloader = torch.utils.data.DataLoader(dataset, batch_size=self.batch_size, shuffle=False)
+        dataset = PyODDataset(X=X, mean=self.mean,
+                              std=self.std) if self.preprocessing else (
+            PyODDataset(X=X))
+        dataloader = torch.utils.data.DataLoader(dataset,
+                                                 batch_size=self.batch_size,
+                                                 shuffle=False)
         self.model.eval()

         outlier_scores = np.zeros([X.shape[0], ])
         with torch.no_grad():
             for data, data_idx in dataloader:
                 data = data.to(self.device).float()
                 reconstructions, rff_features = self.model(data)
-                scores = pairwise_distances_no_broadcast(data.cpu().numpy(), reconstructions.cpu().numpy())
+                scores = pairwise_distances_no_broadcast(data.cpu().numpy(),
+                                                         reconstructions.cpu().numpy())
                 outlier_scores[data_idx] = scores
         return outlier_scores
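Reviewer note: in `decision_function`, each sample's outlier score is the distance between an input row and its reconstruction, computed batch by batch with `pairwise_distances_no_broadcast`. Assuming that helper returns the row-wise Euclidean distance, the scoring step can be sketched without PyTorch or PyOD. The function name `reconstruction_scores` and the toy data are hypothetical.

```python
import math


def reconstruction_scores(X, X_hat):
    """Return the Euclidean distance between matching rows of X and X_hat."""
    if len(X) != len(X_hat):
        raise ValueError("X and X_hat must have the same number of rows")
    # One score per sample: larger reconstruction error means more anomalous.
    return [math.dist(row, row_hat) for row, row_hat in zip(X, X_hat)]


X = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
X_hat = [[0.0, 0.0], [1.0, 2.0], [0.0, 0.0]]
scores = reconstruction_scores(X, X_hat)
print(scores)  # [0.0, 1.0, 5.0]
```

Note that in the hunk above `rff_features` is returned by the model but does not enter the score; only the reconstruction error does.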
22 changes: 15 additions & 7 deletions pyod/test/test_ae1svm.py
@@ -37,9 +37,11 @@ def setUp(self):
         self.clf.fit(self.X_train)

     def test_parameters(self):
-        assert hasattr(self.clf, 'decision_scores_') and self.clf.decision_scores_ is not None
+        assert hasattr(self.clf,
+                       'decision_scores_') and self.clf.decision_scores_ is not None
         assert hasattr(self.clf, 'labels_') and self.clf.labels_ is not None
-        assert hasattr(self.clf, 'threshold_') and self.clf.threshold_ is not None
+        assert hasattr(self.clf,
+                       'threshold_') and self.clf.threshold_ is not None
         assert hasattr(self.clf, '_mu') and self.clf._mu is not None
         assert hasattr(self.clf, '_sigma') and self.clf._sigma is not None
         assert hasattr(self.clf, 'model') and self.clf.model is not None
@@ -80,14 +82,17 @@ def test_prediction_proba_parameter(self):
             self.clf.predict_proba(self.X_test, method='something')

     def test_prediction_labels_confidence(self):
-        pred_labels, confidence = self.clf.predict(self.X_test, return_confidence=True)
+        pred_labels, confidence = self.clf.predict(self.X_test,
+                                                   return_confidence=True)
         assert_equal(pred_labels.shape, self.y_test.shape)
         assert_equal(confidence.shape, self.y_test.shape)
         assert confidence.min() >= 0
         assert confidence.max() <= 1

     def test_prediction_proba_linear_confidence(self):
-        pred_proba, confidence = self.clf.predict_proba(self.X_test, method='linear', return_confidence=True)
+        pred_proba, confidence = self.clf.predict_proba(self.X_test,
+                                                        method='linear',
+                                                        return_confidence=True)
         assert pred_proba.min() >= 0
         assert pred_proba.max() <= 1
         assert_equal(confidence.shape, self.y_test.shape)
@@ -100,10 +105,13 @@ def test_fit_predict(self):

     def test_fit_predict_score(self):
         self.clf.fit_predict_score(self.X_test, self.y_test)
-        self.clf.fit_predict_score(self.X_test, self.y_test, scoring='roc_auc_score')
-        self.clf.fit_predict_score(self.X_test, self.y_test, scoring='prc_n_score')
+        self.clf.fit_predict_score(self.X_test, self.y_test,
+                                   scoring='roc_auc_score')
+        self.clf.fit_predict_score(self.X_test, self.y_test,
+                                   scoring='prc_n_score')
         with assert_raises(NotImplementedError):
-            self.clf.fit_predict_score(self.X_test, self.y_test, scoring='something')
+            self.clf.fit_predict_score(self.X_test, self.y_test,
+                                       scoring='something')

     def test_model_clone(self):
         # for deep models this may not apply
