Contact: Fares Meghdouri - [email protected]
Paper: "Modeling Data with Observers"
@article{meghdouri2022modeling,
title={Modeling data with observers},
author={Meghdouri, Fares and Iglesias V{\'a}zquez, F{\'e}lix and Zseby, Tanja},
journal={Intelligent Data Analysis},
volume={26},
number={3},
pages={785--803},
year={2022},
publisher={IOS Press}
}
pyodm can be installed using pip by running
pip install git+https://github.com/CN-TU/pyodm
Note that for ODM to work with an M-Tree core, the M-Tree implementation (package) must also be installed. Its repository is currently private and will be made available soon.
Many parameters can be adjusted to build a representative model; refer to the paper for more information.
import pyodm
# create a new model with default parameters
model = pyodm.ODM(random_state=1)
import numpy as np
#import pandas
# read the data
X = np.load('my_dataset.npy')
#or
#X = pandas.read_csv('my_dataset.csv').values
# model the data
model.fit(X)
# access the array of observers
print(model.observers)
# access the array of radii
print(model.radius)
# access the array of populations
print(model.population)
To get the outlierness scores of a set of points (based on an ODM model), run the following after fitting a model:
# read the data
X_test = np.load('my_test_dataset.npy')
# get the outlierness scores
outlierness_scores = model.outlierness(X_test)
The outlierness scores can be converted into binary labels (outlier/inlier) as follows:
# read the data
X_test = np.load('my_test_dataset.npy')
# convert the outlierness scores into binary labels using a contamination threshold
predictions = model.predict(X_test)
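The thresholding idea behind this step can be illustrated with a minimal sketch in plain Python. This is not pyodm's implementation: the helper name `labels_from_scores`, its `contamination` argument, and the rule of flagging the highest-scoring fraction of points are all assumptions made here for illustration.

```python
import math


def labels_from_scores(scores, contamination=0.1):
    """Flag the highest-scoring fraction of points as outliers.

    Hypothetical helper, not part of pyodm. Returns 1 for outliers
    and 0 for inliers, in the same order as `scores`.
    """
    if not 0.0 < contamination < 1.0:
        raise ValueError("contamination must be in (0, 1)")
    # number of points to flag, rounded up so at least one is flagged
    n_outliers = math.ceil(contamination * len(scores))
    # point indices ordered from highest to lowest score
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    outlier_idx = set(ranked[:n_outliers])
    return [1 if i in outlier_idx else 0 for i in range(len(scores))]


scores = [0.1, 0.9, 0.2, 0.15, 0.3]
labels = labels_from_scores(scores, contamination=0.2)
```

With five scores and a contamination of 0.2, only the single highest-scoring point is flagged. How pyodm itself chooses the threshold may differ; the paper is the reference for that.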
To get the label of the closest observer for each point in a set, use:
# read the data
X_test = np.load('my_test_dataset.npy')
# return the predicted label of each test point (referring to `model.observers`)
predicted_labels = model.labels(X_test)
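As a rough illustration of what "closest observer" means, the sketch below assigns each point the index of its nearest observer under Euclidean distance. It is written in plain Python and does not use pyodm: the helper `closest_observer`, the choice of Euclidean distance, and the assumption that a label is an index into the observer array are all illustrative, not confirmed details of the library.

```python
import math


def closest_observer(points, observers):
    """Return, for each point, the index of its nearest observer.

    Hypothetical helper, not part of pyodm. Distances are Euclidean.
    """
    return [
        min(range(len(observers)), key=lambda j: math.dist(p, observers[j]))
        for p in points
    ]


observers = [(0.0, 0.0), (5.0, 5.0)]
points = [(0.5, -0.2), (4.0, 6.0), (2.0, 2.0)]
nearest = closest_observer(points, observers)
```

Here the first and third points are closer to the observer at the origin, and the second point is closer to the observer at (5, 5).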
# return a dictionary of the current parameters
model.get_params()
This will return a dictionary of the parameters used to build the model.
Example 1: Three datasets, each modeled with a different configuration; data points are shown in gray and the ODM model in red.
Example 2: Convergence path of an observer.