Label Algorithms

Unbalanced labels

1. imbalanced-learn: an open-source, MIT-licensed library that provides tools for classification with imbalanced classes.

2. Classifying Job Titles With Noisy Labels Using REINFORCE. This article has a very nice trick: it adds a reward component to the loss function to mitigate the unbalanced class-label problem, instead of the usual rebalancing.
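As a point of reference for "the usual balancing", here is a minimal NumPy sketch of random oversampling, one of the simplest resampling strategies. The helper name `random_oversample` and the toy data are assumptions made for illustration; this is not imbalanced-learn's implementation. The library offers the same idea through `RandomOverSampler` and its `fit_resample` method.

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest one."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Sample extra indices with replacement to close the gap to the majority class.
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X[keep], y[keep]

# Hypothetical toy data: 90 samples of class 0 and 10 of class 1.
X = np.arange(200, dtype=float).reshape(100, 2)
y = np.array([0] * 90 + [1] * 10)

X_res, y_res = random_oversample(X, y)
print(np.bincount(y_res))  # per-class counts after resampling
```

Oversampling only repeats existing minority rows, so it should be applied to the training split only, never before the train/test split.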

Imbalance Learn comparison

Label Propagation / Spreading

Note: this is closely related to weak and semi-supervised learning, i.e., we have a small number of labels and want to generalize them to other samples. See also weak supervision methods.

1. Step 1: build a graph Laplacian using KNN; the distance metric is Minkowski with p=2, i.e., Euclidean distance.

2. Spreading (an upgrade of propagation): essentially a community graph algorithm that resembles KNN in nature. It uses a semi-supervised data set (i.e., labeled and unlabeled data) to spread, or propagate, labels to the unlabeled data in small increments. Using a KNN-like methodology, each unlabeled sample is given a label based on its first-order neighbors; if there is a tie, a random label is chosen. Nodes are connected using Euclidean distance.
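The two steps above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the code of any particular library: it builds a symmetric KNN graph with Euclidean distance, normalizes it, and iterates the standard spreading update `F = alpha * S @ F + (1 - alpha) * Y`. The function names, `k`, `alpha`, `n_iter`, and the toy data are all hypothetical, and the final label is picked with `argmax` rather than a random tie-break. For real work, scikit-learn ships `LabelPropagation` and `LabelSpreading` in `sklearn.semi_supervised`.

```python
import numpy as np

def knn_affinity(X, k):
    """Symmetric 0/1 KNN graph using Euclidean distance (Minkowski with p=2)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros_like(dist)
    rows = np.repeat(np.arange(len(X)), k)
    W[rows, nearest.ravel()] = 1.0
    return np.maximum(W, W.T)  # keep an edge if either endpoint chose it

def label_spreading(X, y, k=2, alpha=0.8, n_iter=100):
    """y holds class ids for labeled points and -1 for unlabeled ones."""
    W = knn_affinity(X, k)
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))  # normalized affinity D^-1/2 W D^-1/2
    classes = np.unique(y[y >= 0])
    # One-hot label matrix; rows of unlabeled points stay all zero.
    Y = (y[:, None] == classes[None, :]).astype(float)
    F = Y.copy()
    for _ in range(n_iter):
        # Each node absorbs its neighbors' scores and is pulled back to its seed label.
        F = alpha * S @ F + (1 - alpha) * Y
    return classes[F.argmax(axis=1)]

# Hypothetical toy data: two well-separated groups on a line, one labeled point in each.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0],
              [20.0], [21.0], [22.0], [23.0], [24.0]])
y = np.full(10, -1)
y[0], y[5] = 0, 1

pred = label_spreading(X, y)
print(pred)
```

In this toy setup the KNN graph has no edges between the two groups, so every point ends up with the label of the single seed in its own group.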

1. Harmonic Function (HMN) [Zhu+, ICML03]

2. Local and Global Consistency (LGC) [Zhou+, NIPS04]

3. Partially Absorbing Random Walk (PARW) [Wu+, NIPS12]

4. OMNI-Prop (OMNIProp) [Yamaguchi+, AAAI15]

5. Confidence-Aware Modulated Label Propagation (CAMLP) [Yamaguchi+, SDM16]
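To give a feel for the first method in the list, here is a small sketch of the harmonic function solution on a four-node path graph. It uses the closed form in which the scores of the unlabeled nodes solve `L_uu @ F_u = W_ul @ Y_l`, with `L` the graph Laplacian. The graph, the label choices, and the variable names are assumptions made for illustration only.

```python
import numpy as np

# Path graph 0 - 1 - 2 - 3 with unit edge weights (hypothetical toy graph).
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

labeled = np.array([0, 3])
unlabeled = np.array([1, 2])
Y_l = np.array([[1.0, 0.0],   # node 0 is class 0
                [0.0, 1.0]])  # node 3 is class 1

L = np.diag(W.sum(axis=1)) - W          # graph Laplacian D - W
L_uu = L[np.ix_(unlabeled, unlabeled)]  # unlabeled-to-unlabeled block
W_ul = W[np.ix_(unlabeled, labeled)]    # unlabeled-to-labeled weights

# Harmonic solution: each unlabeled score is the average of its neighbors' scores.
F_u = np.linalg.solve(L_uu, W_ul @ Y_l)
print(F_u)
```

Node 1 sits closer to the class-0 seed and node 2 closer to the class-1 seed, so each receives a score of 2/3 for its nearer class and 1/3 for the other.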
