VASS

Semi-Supervised Algorithm Visualizer


A little bit of theory...


Semi-supervised learning is the branch of machine learning that uses labeled and unlabeled data together to perform learning tasks. It sits between supervised and unsupervised learning.

Among all semi-supervised algorithms, this project focuses on inductive methods. Their idea is simple: build a classifier that predicts labels for new data. The algorithms presented here share this objective and belong to a more specific family, the wrapper methods. Wrapper methods rely on pseudo-labeling, a process in which classifiers trained on the labeled data generate labels for the unlabeled data. Once this process is completed, the classifier is retrained on a training set that incorporates these new labels.
In the following cards, four of the most representative algorithms of semi-supervised learning are presented: Self-Training, Co-Training, Tri-Training and Democratic Co-Learning.
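
The pseudo-labeling loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by the tool: the toy nearest-centroid classifier, the margin-based confidence score, and the names `fit_centroids`, `predict_with_confidence`, `self_training`, `threshold` and `max_iter` are all assumptions made for the example.

```python
from math import dist


def fit_centroids(X, y):
    """Train a toy nearest-centroid classifier: one mean point per class."""
    groups = {}
    for point, label in zip(X, y):
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(sum(coord) / len(points) for coord in zip(*points))
        for label, points in groups.items()
    }


def predict_with_confidence(centroids, point):
    """Return (label, confidence). The confidence is a hypothetical margin
    score in [0, 1]; it assumes at least two classes are present."""
    ranked = sorted((dist(point, c), label) for label, c in centroids.items())
    (d_best, label), (d_next, _) = ranked[0], ranked[1]
    total = d_best + d_next
    confidence = 1.0 if total == 0 else (d_next - d_best) / total
    return label, confidence


def self_training(X_lab, y_lab, X_unlab, threshold=0.6, max_iter=10):
    """Pseudo-label confident unlabeled points, then retrain, repeatedly."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = fit_centroids(X_lab, y_lab)
    for _ in range(max_iter):
        if not pool:
            break
        remaining, added = [], 0
        for point in pool:
            label, confidence = predict_with_confidence(model, point)
            if confidence >= threshold:
                # Accept the prediction as a pseudo-label.
                X_lab.append(point)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(point)
        if added == 0:
            break  # nothing confident enough: stop early
        pool = remaining
        model = fit_centroids(X_lab, y_lab)  # retrain with the new labels
    return model, pool


X_lab, y_lab = [(0, 0), (10, 10)], [0, 1]
X_unlab = [(1, 1), (9, 9), (2, 0), (8, 10), (5, 5)]
model, still_unlabeled = self_training(X_lab, y_lab, X_unlab)
```

In this toy run, the point `(5, 5)` lies halfway between the two classes, so its confidence never reaches the threshold and it stays unlabeled. A real wrapper method would plug in any base classifier that exposes some form of prediction confidence.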

Internally, each algorithm uses one or more classifiers, as is characteristic of wrapper methods. The algorithms also differ in the number of views they take of the data, where a view is the subset of the dataset's attributes that the algorithm uses to learn the model. Unlike a single-view algorithm, a multi-view algorithm treats the dataset attributes as multiple subsets. For example, in the Co-Training algorithm, the first classifier might only "see" half of the attributes while the second one sees the other half. Each classifier works only with its own subset of attributes.
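
As a rough illustration of the multi-view idea, the sketch below splits the attributes of a small dataset into two contiguous column subsets, one per classifier. The helper name `split_views`, the contiguous split and the sample rows are assumptions made for this example; an actual Co-Training setup may choose its views differently.

```python
def split_views(rows, n_views=2):
    """Split each row's attributes into n_views contiguous column subsets."""
    n_attrs = len(rows[0])
    size, extra = divmod(n_attrs, n_views)
    bounds, start = [], 0
    for i in range(n_views):
        # Spread any leftover columns over the first views.
        end = start + size + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return [[row[a:b] for row in rows] for a, b in bounds]


# Hypothetical rows with four attributes each.
rows = [(5.1, 3.5, 1.4, 0.2), (6.2, 2.9, 4.3, 1.3)]
view_a, view_b = split_views(rows)
# view_a holds the first two attributes of every row, view_b the last two.
```

Each classifier would then be trained only on its own view, while a single-view algorithm would simply receive `rows` unchanged.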

Objective


The aim of this tool is to help you understand, through visualizations combined with the underlying theoretical concepts, how the main semi-supervised algorithms actually work.

Selecting any of the algorithms takes you to a page where you load the dataset. You can then configure the algorithm with the desired parameters and, finally, obtain a visualization of the training process.
Self-Training
  • One classifier
  • Single-view
Co-Training
  • Two classifiers
  • Multi-view
Tri-Training
  • Three classifiers
  • Single-view
Democratic Co-Learning
  • Three or more classifiers
  • Single-view