Utils functions

from codeE.utils import ...

Functions to use in the crowdsourcing scenario.

Get Individual matrices

codeE.utils.generate_Individual_conf(Z_data, annotations, DTYPE_OP='float32')

Generates the individual confusion matrix of each annotator: $\beta_{k,j}^{(t)} = p(y=j | z=k, a=t)$

Parameters

Z_data: array-like of shape (n_samples, n_classes)
The ground truth of the data in a one-hot vector format.
annotations: array-like of shape (n_samples, n_annotators, n_classes)
The annotations observed for the data, in the individual categorical (one-hot) format. If an annotator did not label a sample, the corresponding vector is all zeros.
DTYPE_OP: string, default='float32'
dtype of the NumPy arrays used in the computation, restricted to the types listed at https://numpy.org/devdocs/user/basics.types.html

Returns

I_conf: array-like of shape (n_annotators, n_classes, n_classes)
The individual confusion matrices $\beta_{k,j}^{(t)}$, one per annotator.

Examples

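import numpy as np
import keras
from codeE.utils import generate_Individual_conf
N = 100 # number of data points
K = 8 # number of classes
T = 10 # number of annotators
Z = np.random.randint(K, size=(N,)) # ground truth as class labels
Y = np.random.randint(K, size=(N,T)) # one label per annotator and data point
# fix num_classes so the one-hot width is K even if some class is never drawn
Y_ohv = keras.utils.to_categorical(Y, num_classes=K)
generate_Individual_conf(Z, Y_ohv)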
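
For intuition about what the returned array contains, the definition $\beta_{k,j}^{(t)} = p(y=j | z=k, a=t)$ can be sketched directly in NumPy. This is only an illustration of the formula, not the library's implementation: the helper name `individual_conf_sketch` is hypothetical, and leaving the rows of classes an annotator never labeled as zeros is an assumption.

```python
import numpy as np

def individual_conf_sketch(Z_ohv, Y_ohv):
    """Hypothetical helper: estimate p(y=j | z=k, a=t) by counting.

    Z_ohv: (n_samples, n_classes) one-hot ground truth.
    Y_ohv: (n_samples, n_annotators, n_classes) one-hot annotations,
           all zeros where an annotator gave no label.
    """
    # counts[t, k, j] = samples of true class k that annotator t labeled as j
    counts = np.einsum('nk,ntj->tkj', Z_ohv, Y_ohv).astype('float32')
    row_sums = counts.sum(axis=-1, keepdims=True)
    # Normalize each row; rows without observations stay at zero (an assumption)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

rng = np.random.default_rng(0)
N, K, T = 100, 8, 10
Z_ohv = np.eye(K)[rng.integers(K, size=N)]        # shape (N, K)
Y_ohv = np.eye(K)[rng.integers(K, size=(N, T))]   # shape (N, T, K)
I_conf = individual_conf_sketch(Z_ohv, Y_ohv)
print(I_conf.shape)  # (10, 8, 8)
```

Each `I_conf[t]` is a row-normalized matrix, so row `k` reads as the distribution of labels annotator `t` gives to samples whose true class is `k`.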

Get Global matrix

codeE.utils.generate_Global_conf(Z_data, annotations, DTYPE_OP='float32')

Generates the global confusion matrix of the annotations pooled over all annotators: $\beta_{k,j} = p(y=j | z=k)$

Parameters

Z_data: array-like of shape (n_samples, n_classes) or (n_samples,)
The ground truth of the data, in either format: class labels or one-hot vectors.
annotations: array-like of shape (n_samples, n_annotators, n_classes) or (n_samples, n_classes)
The annotations observed for the data, in either format: individual (categorical one-hot) or global. For further details see the representation documentation.
DTYPE_OP: string, default='float32'
dtype of the NumPy arrays used in the computation, restricted to the types listed at https://numpy.org/devdocs/user/basics.types.html

Returns

G_conf: array-like of shape (n_classes, n_classes)
The global confusion matrix $\beta_{k,j}$.

Examples

import numpy as np
N = 100 #data
K = 8 #classes
Z = np.random.randint(K, size=(N,))
R = np.random.randint(3, size=(N,K))
from codeE.utils import generate_Global_conf
generate_Global_conf(Z, R)

To visualize the result, you can use:

from codeE.utils import plot_confusion_matrix
plot_confusion_matrix(generate_Global_conf(Z, R), title= "Global Matrix")

codeE.utils.get_confusionM(*args)

Performs the same operation as generate_Global_conf.
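
To make the definition $\beta_{k,j} = p(y=j | z=k)$ concrete, the following NumPy sketch computes a global confusion matrix from either annotation format. It is an illustration under stated assumptions rather than the library's code: the helper name `global_conf_sketch` is hypothetical, and the global representation is assumed to hold, per sample, the number of annotations received by each class.

```python
import numpy as np

def global_conf_sketch(Z, annotations):
    """Hypothetical helper: estimate p(y=j | z=k) by pooling all annotators."""
    Z = np.asarray(Z)
    annotations = np.asarray(annotations, dtype='float32')
    K = annotations.shape[-1]
    # Accept class labels (n_samples,) or one-hot vectors (n_samples, n_classes)
    Z_ohv = np.eye(K, dtype='float32')[Z] if Z.ndim == 1 else Z.astype('float32')
    # Individual format (n_samples, n_annotators, n_classes): sum over annotators
    # to obtain per-class counts of shape (n_samples, n_classes)
    R = annotations.sum(axis=1) if annotations.ndim == 3 else annotations
    counts = Z_ohv.T @ R  # counts[k, j] = annotations of class j on samples of true class k
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of classes that never occur stay at zero (an assumption)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

rng = np.random.default_rng(0)
N, K = 100, 8
Z = rng.integers(K, size=N)
R = rng.integers(3, size=(N, K))
G_conf = global_conf_sketch(Z, R)
print(G_conf.shape)  # (8, 8)
```

Summing the individual one-hot annotations over the annotator axis yields the global counts, so in this sketch both input formats lead to the same matrix.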

Pre-train neural network

codeE.utils.pre_init_F(model, X_inp, Z_targ, n_init, batch_size=32)

Trains the neural network model for a few epochs as a pre-training step, then resets the optimizer.

Parameters

model: function or class of keras model
Predictive model based on Keras.
X_inp: array-like of shape (n_samples, ...)
Input patterns of the data.
Z_targ: array-like of shape (n_samples, n_classes)
The estimate of the ground truth used as the target to pre-train the model.
n_init: int
The number of epochs that the predictive model is going to be pre-trained.
batch_size: int, default=32
Number of samples per gradient update, based on https://keras.io/api/models/model_training_apis/

Examples

... #read some data 
X_data = ...
Z_hat = ...

Define predictive model (based on keras)

from keras.models import Sequential
model_B = Sequential()
... #add layers

Use it

from codeE.utils import pre_init_F
pre_init_F(model_B, X_data, Z_hat, n_init=3)

Visual comparison of confusion matrix

codeE.utils.compare_conf_ma(pred_conf_mat, true_conf_mat=[], text=False)

Compares a predicted confusion matrix against the true one. Each confusion matrix must sum to one per row.

Parameters

pred_conf_mat: array-like of shape (n_classes, n_classes)
An estimated (predicted) confusion matrix $\hat{\beta}_{k,j}$.
true_conf_mat: array-like of shape (n_classes, n_classes), default=[]
The true confusion matrix $\beta_{k,j}$. If an empty list is given, the true matrix is not shown.
text: boolean, default=False
If True it will print the probabilities inside the confusion matrix plot.
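
Because the comparison expects matrices whose rows sum to one, it can help to check this before plotting. The sketch below uses only NumPy; the helper names `is_row_stochastic` and `normalize_rows` are hypothetical and not part of codeE, and `normalize_rows` assumes every row has at least one count.

```python
import numpy as np

def is_row_stochastic(conf_mat, atol=1e-6):
    """Hypothetical helper: True if entries are non-negative and every row sums to one."""
    conf_mat = np.asarray(conf_mat, dtype='float64')
    rows_ok = np.allclose(conf_mat.sum(axis=-1), 1.0, atol=atol)
    return bool(np.all(conf_mat >= 0) and rows_ok)

def normalize_rows(counts):
    """Hypothetical helper: turn raw counts into per-row probabilities.

    Assumes every row contains at least one count.
    """
    counts = np.asarray(counts, dtype='float64')
    return counts / counts.sum(axis=-1, keepdims=True)

counts = np.array([[8, 2], [1, 9]])
print(is_row_stochastic(counts))                  # False: the rows sum to 10
print(is_row_stochastic(normalize_rows(counts)))  # True
```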

Visual comparison of set confusion matrices

codeE.utils.compare_set_conf_ma(set_conf_ma, true_set_conf_ma = [], text=True, n_samp=0)

Compares a set of predicted confusion matrices against the set of true ones. Each confusion matrix must sum to one per row.

Parameters

set_conf_ma: array-like of shape (n_matrices, n_classes, n_classes)
A set of estimated confusion matrices $\hat{B}=\{\hat{\beta}_{k,j}^{(m)}\}_{m=1}^M$
true_set_conf_ma: array-like of shape (n_matrices, n_classes, n_classes), default=[]
The set of true confusion matrices ${B}=\{{\beta}_{k,j}^{(m)}\}_{m=1}^M$. If an empty list is given, the true matrices are not shown.
text: boolean, default=True
If True, it prints the probabilities inside each confusion matrix plot, along with an indicator of the estimation error for the plotted matrices.
n_samp: int, default=0
The number of matrices to randomly plot from the set. If n_samp=0, all the matrices in the set are plotted.
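
The random selection controlled by n_samp and a per-matrix error indicator can be sketched as follows. The documentation does not state which error measure the library reports, so the mean absolute difference used here is an assumption, and the helper name `sample_and_score` and its `seed` argument are hypothetical.

```python
import numpy as np

def sample_and_score(set_conf_ma, true_set_conf_ma, n_samp=0, seed=None):
    """Hypothetical helper: choose which matrices to show and score each estimate."""
    est = np.asarray(set_conf_ma, dtype='float64')
    true = np.asarray(true_set_conf_ma, dtype='float64')
    M = est.shape[0]
    rng = np.random.default_rng(seed)
    # n_samp=0 keeps every matrix; otherwise draw n_samp distinct indices at random
    idx = np.arange(M) if n_samp == 0 else rng.choice(M, size=n_samp, replace=False)
    # Mean absolute difference per matrix, an illustrative error indicator (assumption)
    errors = np.abs(est[idx] - true[idx]).mean(axis=(1, 2))
    return idx, errors

true_set = np.stack([np.eye(3), np.full((3, 3), 1 / 3)])
est_set = np.stack([np.eye(3), np.eye(3)])
idx, errors = sample_and_score(est_set, true_set)
print(idx)     # [0 1]
print(errors)  # the first entry is 0.0 because that estimate is exact
```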

Cluster Annotations

codeE.utils.clusterize_annotators(y_o,M,no_label=-1,bulk=True,cluster_type='mv_close',data=[],model=None,DTYPE_OP='float32',BATCH_SIZE=64,option="hard",l=0.005)

Clusters the annotations in a crowdsourcing scenario, as an initialization step for the group behavior p(g).
