
# Learning from Crowds Solutions

As this is an unsupervised problem, learning the ground truth from the labeling process, it can be solved with different approaches. Each method models the annotation behavior in a different way and in a different setting, so the right choice depends on what is needed.


## Differences in Methodology

Some notation comments (for more details on the problem notation, see the documentation):

- `z` corresponds to the ground truth of the data.
- `e` corresponds to the reliability of the annotators.
- `T` corresponds to the number of annotators, `n_annotators`.
- `K` corresponds to the number of classes, `n_classes`.
- `M` corresponds to the number of groups in some models, `n_groups`.
- `W` corresponds to the number of parameters of the predictive model.
- `Wm` corresponds to the number of parameters of the group model of Model Inference EM - Groups (the gating network of the MoE).
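
To make these symbols concrete, the snippet below shows the array shapes they usually translate to in code, and how they relate to the annotator models in the table below. It is only an illustrative layout; the variable names and the `-1` missing-label convention are assumptions, not this package's API:

```python
import numpy as np

# Hypothetical layout for a crowdsourcing dataset (not this package's API).
N, T, K, M = 1000, 50, 3, 4   # data points, annotators (T), classes (K), groups (M)

rng = np.random.default_rng(0)

z = rng.integers(0, K, size=N)           # ground truth: one class per data point
y_obs = np.full((N, T), -1)              # annotations, -1 marks "not labelled"
labelled = rng.random((N, T)) < 0.2      # each annotator labels ~20% of the data
y_obs[labelled] = rng.integers(0, K, size=labelled.sum())

# Annotator models compared in the table below:
conf_individual = np.zeros((T, K, K))    # one K x K confusion matrix per annotator
conf_global = np.zeros((K, K))           # a single shared confusion matrix
conf_groups = np.zeros((M, K, K))        # one confusion matrix per group
reliability = np.zeros(T)                # e: one reliability number per annotator
```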
| Method name | Inferred variable | Predictive model | Setting | Annotator model | Other model | Learnable parameters |
|---|---|---|---|---|---|---|
| Label Aggregation | - | | Global | - | - | 0 |
| Label Inference EM | z | | Individual dense | Probabilistic confusion matrix | Class marginals | T·K² + K |
| Label Inference EM - Global | z | | Global | - | Global probabilistic confusion matrix | K² |
| Model Inference EM | z | ✔️ | Individual dense | Probabilistic confusion matrix | - | T·K² + W |
| Model Inference EM - Groups | z | ✔️ | Individual sparse | - | Probabilistic confusion matrix per group, gating network over groups | M·K² + Wm + W |
| Model Inference EM - Groups Global | z | ✔️ | Global | - | Probabilistic confusion matrix per group, group marginals | M·K² + M + W |
| Model Inference EM - Global | z | ✔️ | Global | - | Global probabilistic confusion matrix | K² + W |
| Model Inference - Reliability EM | e | ✔️ | Individual dense | Probabilistic reliability number | - | T + W |
| Model Inference BP | - | ✔️ | Individual dense (masked) | Confusion matrix weights | - | T·K² + W |
| Model Inference BP - Global | - | ✔️ | Global | - | Global confusion matrix weights | K² + W |
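
For reference, below is a minimal sketch of the EM iteration behind Label Inference EM, in the style of the classic Dawid-Skene estimator. It is an illustrative re-implementation under simplified assumptions, not this package's code: the E-step infers the posterior over `z`, and the M-step re-estimates the per-annotator confusion matrices and the class marginals listed in the table above.

```python
import numpy as np

def label_inference_em(y_obs, K, n_iter=50):
    """Dawid-Skene style EM sketch. y_obs: (N, T) int array, -1 = missing."""
    N, T = y_obs.shape
    mask = y_obs != -1

    # Initialize the posterior over z with a soft majority vote.
    q_z = np.stack([(y_obs == k).sum(axis=1) for k in range(K)], axis=1) + 1e-8
    q_z /= q_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class marginals p(z) and confusion matrices p(y_t = k | z = j).
        marginals = q_z.mean(axis=0)                      # (K,)
        conf = np.full((T, K, K), 1e-8)
        for t in range(T):
            idx = mask[:, t]
            labels_t = y_obs[idx, t]
            for k in range(K):
                conf[t, :, k] += q_z[idx][labels_t == k].sum(axis=0)
        conf /= conf.sum(axis=2, keepdims=True)

        # E-step: posterior over the ground truth z given all observed labels.
        log_q = np.tile(np.log(marginals), (N, 1))        # (N, K)
        for t in range(T):
            idx = mask[:, t]
            log_q[idx] += np.log(conf[t][:, y_obs[idx, t]]).T
        q_z = np.exp(log_q - log_q.max(axis=1, keepdims=True))
        q_z /= q_z.sum(axis=1, keepdims=True)

    return q_z  # (N, K) posterior over the inferred ground truth
```

Roughly speaking, the Model Inference EM variants replace the class-marginal term with the output of the predictive model f(x), which is retrained during each M-step.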

### Comments

- The inference of the methods with an explicit model per annotator depends on the participation of each annotator in the labelling process.
  - They require a large number of annotations per annotator.
- An explicit model per annotator can give an inference advantage when the individual behaviors differ strongly from each other.
  - However, a more complex model may overfit the behavior it is modelling.
- The methods with a predictive model can give an inference advantage when the input patterns are more complex.
- The methods without two-step inference (based on backpropagation) can take advantage of a more stable learning process; the sketch below illustrates this single-optimization setup.
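
Here is a minimal sketch of the backpropagation-based setup in the spirit of Model Inference BP: per-annotator confusion-matrix weights are stacked on top of a base classifier and everything is trained end to end, with the loss masked over missing annotations. The architecture and names are assumptions for illustration (shown in PyTorch), not this package's API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrowdBPSketch(nn.Module):
    """Illustrative base classifier plus one confusion-matrix layer per annotator."""

    def __init__(self, n_features, K, T):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, K)
        )
        # T trainable confusion matrices, initialized near the identity so that
        # annotators start out mostly agreeing with the latent label.
        self.conf_logits = nn.Parameter(5.0 * torch.eye(K).repeat(T, 1, 1))

    def forward(self, x):
        p_z = F.softmax(self.backbone(x), dim=-1)      # (N, K): latent label model
        conf = F.softmax(self.conf_logits, dim=-1)     # (T, K, K): rows sum to 1
        # p(annotator t answers k | x) = sum_j p(z = j | x) * conf[t, j, k]
        return torch.einsum("nj,tjk->ntk", p_z, conf)  # (N, T, K)

def masked_nll(p_ann, y_obs):
    """Negative log-likelihood over the observed annotations only (-1 = missing)."""
    mask = y_obs != -1
    probs = p_ann[mask]                                # (n_obs, K)
    labels = y_obs[mask]                               # (n_obs,)
    return -torch.log(probs.gather(1, labels[:, None]) + 1e-8).mean()

# Single optimization loop: no alternation between inference and learning steps.
model = CrowdBPSketch(n_features=5, K=3, T=10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
x = torch.randn(100, 5)
y_obs = torch.randint(-1, 3, (100, 10))               # -1 stands in for "not annotated"
for _ in range(200):
    optimizer.zero_grad()
    loss = masked_nll(model(x), y_obs)
    loss.backward()
    optimizer.step()
```

Because the confusion matrices and the classifier receive gradients from the same loss, learning proceeds in a single optimization loop rather than alternating between an inference step and a learning step.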

## Usability

| Method name | Two-step inference | Predictive model | Setting | Computational scalability | Use case |
|---|---|---|---|---|---|
| Label Aggregation | | | Global | All cases | High density per data |
| Label Inference EM | ✔️ | | Individual dense | Not scalable with n_annotators | High density per annotator |
| Label Inference EM - Global | ✔️ | | Global | Very large n_annotators | High density |
| Model Inference EM | ✔️ | ✔️ | Individual dense | Not scalable with n_annotators | High density per annotator |
| Model Inference EM - Groups | ✔️ | ✔️ | Individual sparse | Very large n_annotators | High density per annotator |
| Model Inference EM - Groups Global | ✔️ | ✔️ | Global | Very large n_annotators | High density per data |
| Model Inference EM - Global | ✔️ | ✔️ | Global | Very large n_annotators | High density |
| Model Inference - Reliability EM | ✔️ | ✔️ | Individual dense | Large n_annotators | High density per annotator |
| Model Inference BP | | ✔️ | Individual dense (masked) | Not scalable with n_annotators | High density per annotator |
| Model Inference BP - Global | | ✔️ | Global | Very large n_annotators | High density per data |

Use case indicates the setting in which a method works best: the closer the data are to that setting, the better the inference. Density refers to the number of annotations per annotator, per data point, or globally.
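
For instance, given an annotation matrix that marks missing labels with `-1` (the same assumed layout as in the earlier snippets), the densities can be computed as follows:

```python
import numpy as np

# y_obs: (N, T) annotation matrix, -1 = missing (hypothetical layout from above).
y_obs = np.array([[ 0, -1,  1],
                  [ 1,  1, -1],
                  [-1,  0, -1]])

observed = y_obs != -1
density_per_annotator = observed.sum(axis=0)     # annotations given by each annotator
density_per_data = observed.sum(axis=1)          # annotations received by each data point
density_global = observed.sum() / observed.size  # fraction of the matrix that is filled

print(density_per_annotator)  # [2 2 1]
print(density_per_data)       # [2 2 1]
print(density_global)         # 0.555...
```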

### Comments

- The methods without a predictive model are independent of the choice of the learning model; they learn only from the labels.
  - In a second phase, these methods can learn f(x) over the inferred ground truth, as sketched below.
- The methods with a predictive model depend on the chosen learning model.
  - This lets them take advantage of more complex input patterns.
- The global methods can also be used in the individual setting by converting the representation from individual to global (the reverse is not possible).
- The methods without two-step inference are independent of any inference algorithm: the learning is carried out within a single optimization framework.
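
As an illustration of this two-phase use, the sketch below aggregates the labels first (plain majority voting as a stand-in for the label-only methods) and then fits an arbitrary classifier on the inferred ground truth. The data and names are hypothetical, not this package's API:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data. X: (N, d) inputs; y_obs: (N, T) annotations, -1 = missing.
N, T, K, d = 300, 10, 3, 5
X = rng.normal(size=(N, d))
y_obs = rng.integers(-1, K, size=(N, T))   # -1 stands in for "not annotated"

# Phase 1: infer the ground truth from the labels alone (majority vote here,
# as a stand-in for Label Aggregation / Label Inference EM). Ties and fully
# unlabelled rows fall back to the first class in this sketch.
def majority_vote(y_obs, K):
    counts = np.stack([(y_obs == k).sum(axis=1) for k in range(K)], axis=1)
    return counts.argmax(axis=1)

z_hat = majority_vote(y_obs, K)

# Phase 2: fit any predictive model f(x) on the inferred ground truth.
f = LogisticRegression(max_iter=1000).fit(X, z_hat)
```

Any classifier can replace `LogisticRegression` in phase 2, since phase 1 only looks at the labels.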

Experimental details on the computational scalability can be found in the Scalability Comparison.