Everything About Multiple Instance Learning

Multiple instance learning papers, codes, and libraries

Multiple Instance Learning (MIL) is a form of weakly supervised learning in which training instances are arranged in sets, called bags, and a label is provided for the entire bag rather than for the individual instances. This makes it possible to leverage weakly labeled data, which is common in business problems because labeling data is often costly, in areas such as medical imaging, video/audio, text, marketing, and time series.
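As a minimal illustration of the bag structure described above, the sketch below represents each bag as a list of instance feature vectors with a single label per bag. The `bag_label` helper encodes the standard MIL assumption (a bag is positive if at least one of its instances is positive). That assumption, the toy data, and all names are illustrative and are not taken from any library in this list.

```python
from typing import List, Sequence

Instance = Sequence[float]
Bag = List[Instance]


def bag_label(instance_labels: Sequence[int]) -> int:
    """Standard MIL assumption (assumed here): a bag is positive
    if at least one of its instances is positive."""
    return int(any(label == 1 for label in instance_labels))


def bag_score(instance_scores: Sequence[float]) -> float:
    """Max-pooling: the bag score is the highest instance score."""
    return max(instance_scores)


# Two toy bags with hypothetical 2-D instances. Only the bag-level
# labels are available at training time.
bags: List[Bag] = [
    [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]],
    [[0.1, 0.1], [0.2, 0.3]],
]
bag_labels = [1, 0]
```

Other pooling rules (mean, attention-based) replace `max` in `bag_score` when the standard assumption is too strict for the task.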

Inspired by awesome-machine-learning.

In this repository:

Frameworks and libraries are grouped by programming language.

Research papers are grouped by research field.

Note:

There are numerous papers in this field of research, so this list is not intended to be exhaustive.

Table of Contents

1. Frameworks and Libraries
  1.1 Python
  1.2 R
  1.3 Java
  1.4 Scala
  1.5 Julia
2. Research Papers
  2.1 Surveys
  2.2 Ensemble Learning
    2.2.1 General ensemble
    2.2.2 Boosting-based
    2.2.3 Bagging-based
    2.2.4 Cost-sensitive ensemble
  2.3 Data resampling
    2.3.1 Over-sampling
    2.3.2 Under-sampling
    2.3.3 Hybrid-sampling
  2.4 Cost-sensitive Learning
  2.5 Deep Learning
    2.5.1 Surveys
    2.5.2 Graph Data Mining
    2.5.3 Hard example mining
    2.5.4 Loss function engineering
    2.5.5 Meta-learning
    2.5.6 Representation Learning
    2.5.7 Posterior Recalibration
    2.5.8 Semi/Self-supervised Learning
    2.5.9 Curriculum Learning
    2.5.10 Two-phase Training
    2.5.11 Network Architecture
    2.5.12 Deep Generative Model
    2.5.13 Imbalanced Regression
    2.5.14 Data Augmentation
3. Miscellaneous
  3.1 Datasets
  Cross Validation Indices
  3.2 Results
  3.3 Codes

1. Frameworks and Libraries

1.1 Python

[Multiple Instance Active Learning for Object Detection][Github][Paper]

NOTE: written in Python, easy to use.

[DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image][Github][Paper]

NOTE: written in Python, easy to use.

[mil: multiple instance learning library for Python][Github]

NOTE: written in Python, easy to use.

1.2 R

smote_variants [Documentation][Github] - A collection of 85 minority over-sampling techniques for imbalanced learning with multi-class oversampling and model selection features (all written in Python, also supports R and Julia).

caret [Documentation][Github] - Contains the implementation of Random under/over-sampling.

ROSE [Documentation] - Contains the implementation of ROSE (Random Over-Sampling Examples).

DMwR [Documentation] - Contains the implementation of SMOTE (Synthetic Minority Over-sampling Technique).

1.3 Java

KEEL [Github][Paper] - KEEL provides a simple GUI based on data flow to design experiments with different datasets and computational intelligence algorithms (paying special attention to evolutionary algorithms) in order to assess the behavior of the algorithms. This tool includes many widely used imbalanced learning techniques such as (evolutionary) over/under-resampling, cost-sensitive learning, algorithm modification, and ensemble learning methods.

NOTE: wide variety of classical classification, regression, preprocessing algorithms included.

1.4 Scala

undersampling [Documentation][Github] - A Scala library for under-sampling and their ensemble variants in imbalanced classification.

1.5 Julia

smote_variants [Documentation][Github] - A collection of 85 minority over-sampling techniques for imbalanced learning with multi-class oversampling and model selection features (all written in Python, also supports R and Julia).

2. Research Papers

2.1 Surveys

Learning from imbalanced data (IEEE TKDE, 2009, 6000+ citations) [Paper]

Highly cited, classic survey paper. It systematically reviewed the popular solutions, evaluation metrics, and challenging problems in future research in this area (as of 2009).

Learning from imbalanced data: open challenges and future directions (2016, 900+ citations) [Paper]

This paper concentrates on the open issues and challenges in imbalanced learning, i.e., extreme class imbalance, imbalance in online/stream learning, multi-class imbalanced learning, and semi/un-supervised imbalanced learning.

Learning from class-imbalanced data: Review of methods and applications (2017, 900+ citations) [Paper]

A recent exhaustive survey of imbalanced learning methods and applications, covering a total of 527 papers. It provides several detailed taxonomies of existing methods and describes recent trends in this research area.

2.2 Ensemble Learning

2.2.1 General ensemble

Self-paced Ensemble (ICDE 2020, 20+ citations) [Paper][Code][Slides][Zhihu/知乎][PyPI]

NOTE: versatile solution with outstanding performance and computational efficiency.

MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler (NeurIPS 2020) [Paper][Code][Video][Zhihu/知乎]

NOTE: learning an optimal sampling policy directly from data.

Exploratory Undersampling for Class-Imbalance Learning (IEEE Trans. on SMC, 2008, 1300+ citations) [Paper]

NOTE: simple but effective solution.

EasyEnsemble [Code]
BalanceCascade [Code]

2.2.2 Boosting-based

AdaBoost (1995, 18700+ citations) [Paper][Code] - Adaptive Boosting with C4.5

DataBoost (2004, 570+ citations) [Paper] - Boosting with Data Generation for Imbalanced Data

SMOTEBoost (2003, 1100+ citations) [Paper][Code] - Synthetic Minority Over-sampling TEchnique Boosting

MSMOTEBoost (2011, 1300+ citations) [Paper] - Modified Synthetic Minority Over-sampling TEchnique Boosting

RAMOBoost (2010, 140+ citations) [Paper] [Code] - Ranked Minority Over-sampling in Boosting

RUSBoost (2009, 850+ citations) [Paper] [Code] - Random Under-Sampling Boosting

AdaBoostNC (2012, 350+ citations) [Paper] - Adaptive Boosting with Negative Correlation Learning

EUSBoost (2013, 210+ citations) [Paper] - Evolutionary Under-sampling in Boosting

2.2.3 Bagging-based

Bagging (1996, 20000+ citations) [Paper][Code] - Bagging predictor

Diversity Analysis on Imbalanced Data Sets by Using Ensemble Models (2009, 400+ citations) [Paper]

UnderBagging [Code]
OverBagging [Code]
SMOTEBagging [Code]

2.2.4 Cost-sensitive ensemble

AdaCost (ICML 1999, 800+ citations) [Paper][Code] - Misclassification Cost-sensitive boosting

AdaUBoost (NIPS 1999, 100+ citations) [Paper][Code] - AdaBoost with Unequal loss functions

AsymBoost (NIPS 2001, 700+ citations) [Paper][Code] - Asymmetric AdaBoost and detector cascade

2.3 Data resampling

2.3.1 Over-sampling

ROS [Code] - Random Over-sampling

SMOTE (2002, 9800+ citations) [Paper][Code] - Synthetic Minority Over-sampling TEchnique

Borderline-SMOTE (2005, 1400+ citations) [Paper][Code] - Borderline-Synthetic Minority Over-sampling TEchnique

ADASYN (2008, 1100+ citations) [Paper][Code] - ADAptive SYNthetic Sampling

SPIDER (2008, 150+ citations) [Paper][Code(Java)] - Selective Preprocessing of Imbalanced Data

Safe-Level-SMOTE (2009, 370+ citations) [Paper][Code(Java)] - Safe Level Synthetic Minority Over-sampling TEchnique

SVM-SMOTE (2009, 120+ citations) [Paper][Code] - SMOTE based on Support Vectors of SVM

MDO (2015, 150+ citations) [Paper][Code] - Mahalanobis Distance-based Over-sampling for Multi-Class imbalanced problems.

NOTE: See more over-sampling methods at smote-variants.
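The core step shared by SMOTE-style methods is interpolation between a minority sample and one of its nearest minority neighbors. A minimal sketch of that single step, using only the standard library, is given below. The function name and defaults are illustrative, and this is not the implementation of any linked package.

```python
import math
import random


def smote_sample(minority, k=5, rng=None):
    """Create one synthetic point by interpolating between a random
    minority sample and one of its k nearest minority neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # k nearest minority neighbors of the base point (excluding itself).
    neighbors = sorted(
        (x for x in minority if x is not base),
        key=lambda x: math.dist(base, x),
    )[:k]
    neighbor = rng.choice(neighbors)
    gap = rng.random()  # interpolation factor in [0, 1)
    return [b + gap * (n - b) for b, n in zip(base, neighbor)]


# Hypothetical 2-D minority class.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
synthetic = smote_sample(minority_points, k=2)
```

Every synthetic point lies on the segment between two existing minority samples, which is why the variants listed above differ mainly in how they choose the base point and the neighbor.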

2.3.2 Under-sampling

RUS [Code] - Random Under-sampling

CNN (1968, 2100+ citations) [Paper][Code] - Condensed Nearest Neighbor

ENN (1972, 1500+ citations) [Paper] [Code] - Edited Nearest Neighbor

TomekLink (1976, 870+ citations) [Paper][Code] - Tomek’s modification of Condensed Nearest Neighbor

NCR (2001, 500+ citations) [Paper][Code] - Neighborhood Cleaning Rule

NearMiss-1 & 2 & 3 (2003, 420+ citations) [Paper][Code] - Several kNN approaches to unbalanced data distributions.

CNN with TomekLink (2004, 2000+ citations) [Paper][Code(Java)] - Condensed Nearest Neighbor + TomekLink

OSS (2007, 2100+ citations) [Paper][Code] - One Side Selection

EUS (2009, 290+ citations) [Paper] - Evolutionary Under-sampling

IHT (2014, 130+ citations) [Paper][Code] - Instance Hardness Threshold

2.3.3 Hybrid-sampling

A Study of the Behavior of Several Methods for Balancing Training Data (2004, 2000+ citations) [Paper]

NOTE: extensive experimental evaluation involving 10 different over/under-sampling methods.

SMOTE-Tomek [Code]
SMOTE-ENN [Code]

SMOTE-RSB (2012, 210+ citations) [Paper][Code] - Hybrid Preprocessing using SMOTE and Rough Sets Theory

SMOTE-IPF (2015, 180+ citations) [Paper][Code] - SMOTE with Iterative-Partitioning Filter

2.4 Cost-sensitive Learning

CSC4.5 (2002, 420+ citations) [Paper][Code(Java)] - An instance-weighting method to induce cost-sensitive trees

CSSVM (2008, 710+ citations) [Paper][Code(Java)] - Cost-sensitive SVMs for highly imbalanced classification

CSNN (2005, 950+ citations) [Paper][Code(Java)] - Training cost-sensitive neural networks with methods addressing the class imbalance problem.

2.5 Deep Learning

2.5.1 Surveys

A systematic study of the class imbalance problem in convolutional neural networks (2018, 330+ citations) [Paper]

Survey on deep learning with class imbalance (2019, 50+ citations) [Paper]

NOTE: a recent comprehensive survey of the class imbalance problem in deep learning.

2.5.2 Graph Data Mining

Semi-Supervised Graph Imbalanced Regression (KDD 2023) [Paper] [Code]

TAM: Topology-Aware Margin Loss for Class-Imbalanced Node Classification (ICML 2022) [Paper][Code]

GraphSMOTE: Imbalanced Node Classification on Graphs with Graph Neural Networks (WSDM 2021) [Paper][Code]

Topology-Imbalance Learning for Semi-Supervised Node Classification (NeurIPS 2021) [Paper][Code]

GraphENS: Neighbor-Aware Ego Network Synthesis for Class-Imbalanced Node Classification (ICLR 2022) [Paper][Code]

LTE4G: Long-Tail Experts for Graph Neural Networks (CIKM 2022) [Paper][Code]

Multi-Class Imbalanced Graph Convolutional Network Learning (IJCAI 2020) [Paper]

2.5.3 Hard example mining

Training region-based object detectors with online hard example mining (CVPR 2016, 840+ citations) [Paper][Code] - In the later phase of NN training, only do gradient back-propagation for “hard examples” (i.e., with large loss value)

2.5.4 Loss function engineering

Focal loss for dense object detection (ICCV 2017, 2600+ citations) [Paper][Code (detectron2)][Code (unofficial)] - A uniform loss function that focuses training on a sparse set of hard examples to prevent the vast number of easy negatives from overwhelming the detector during training.

NOTE: elegant solution, high influence.

Training deep neural networks on imbalanced data sets (IJCNN 2016, 110+ citations) [Paper] - Mean (square) false error that can equally capture classification errors from both the majority class and the minority class.

Deep imbalanced attribute classification using visual attention aggregation (ECCV 2018, 30+ citations) [Paper][Code]

Imbalanced deep learning by minority class incremental rectification (TPAMI 2018, 60+ citations) [Paper] - Class Rectification Loss for minimizing the dominant effect of majority classes by discovering sparsely sampled boundaries of minority classes in an iterative batch-wise learning process.

Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss (NeurIPS 2019, 10+ citations) [Paper][Code] - A theoretically-principled label-distribution-aware margin (LDAM) loss motivated by minimizing a margin-based generalization bound.

Gradient harmonized single-stage detector (AAAI 2019, 40+ citations) [Paper][Code] - Compared to Focal Loss, which only down-weights “easy” negative examples, GHM also down-weights “very hard” examples as they are likely to be outliers.

Class-Balanced Loss Based on Effective Number of Samples (CVPR 2019, 70+ citations) [Paper][Code] - A simple and generic class-reweighting mechanism based on Effective Number of Samples.

Influence-Balanced Loss for Imbalanced Visual Classification (ICCV 2021) [Paper][Code]

AutoBalance: Optimized Loss Functions for Imbalanced Data (NeurIPS 2021) [Paper]

Label-Imbalanced and Group-Sensitive Classification under Overparameterization (NeurIPS 2021) [Paper][Code]
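Several entries in this subsection reweight the per-example loss. As one concrete case, the sketch below writes out the binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), in plain Python. The defaults gamma = 2 and alpha = 0.25 follow the values reported in the focal loss paper. The function name and the clamping constant are illustrative choices, and this is not the detectron2 implementation.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one example.

    p: predicted probability of the positive class.
    y: ground-truth label, 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p          # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0 - eps)     # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy negative (p close to 0) contributes far less than a hard one.
easy_negative = focal_loss(0.01, 0)
hard_negative = focal_loss(0.90, 0)
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy.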

2.5.5 Meta-learning

Learning to model the tail (NIPS 2017, 70+ citations) [Paper] - Transfer meta-knowledge from the data-rich classes in the head of the distribution to the data-poor classes in the tail.

Learning to reweight examples for robust deep learning (ICML 2018, 150+ citations) [Paper][Code] - Implicitly learn a weight function to reweight the samples in gradient updates of DNN.

NOTE: representative work to solve the class imbalance problem through meta-learning.

Meta-weight-net: Learning an explicit mapping for sample weighting (NeurIPS 2019) [Paper][Code] - Explicitly learn a weight function (with an MLP as the function approximator) to reweight the samples in gradient updates of DNN.

Learning Data Manipulation for Augmentation and Weighting (NeurIPS 2019) [Paper][Code]

Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks (ICLR 2020) [Paper][Code]

MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler (NeurIPS 2020) [Paper][Code][Video]

NOTE: meta-learning-powered ensemble learning

2.5.6 Representation Learning

Learning deep representation for imbalanced classification (CVPR 2016, 220+ citations) [Paper]

Supervised Class Distribution Learning for GANs-Based Imbalanced Classification (ICDM 2019) [Paper]

Decoupling Representation and Classifier for Long-tailed Recognition (ICLR 2020) [Paper][Code]

NOTE: interesting findings on representation learning and classifier learning

Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer (NeurIPS 2021) [Paper]

Tailoring Self-Supervision for Supervised Learning (ECCV 2022) [Paper][Code]

2.5.7 Posterior Recalibration

Posterior Re-calibration for Imbalanced Datasets (NeurIPS 2020) [Paper][Code]

Long-tail learning via logit adjustment (ICLR 2021) [Paper][Code]

2.5.8 Semi/Self-supervised Learning

Rethinking the Value of Labels for Improving Class-Imbalanced Learning (NeurIPS 2020) [Paper][Code][Video]

NOTE: semi-supervised training / self-supervised pre-training helps imbalance learning

Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning (NeurIPS 2020) [Paper][Code]

ABC: Auxiliary Balanced Classifier for Class-imbalanced Semi-supervised Learning (NeurIPS 2021) [Paper][Code]

Improving Contrastive Learning on Imbalanced Data via Open-World Sampling (NeurIPS 2021) [Paper]

DASO: Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning (CVPR 2022) [Paper][Code]

2.5.9 Curriculum Learning

Dynamic Curriculum Learning for Imbalanced Data Classification (ICCV 2019) [Paper]

2.5.10 Two-phase Training

Brain tumor segmentation with deep neural networks (2017, 1200+ citations) [Paper][Code (unofficial)]

Pre-train on a balanced dataset, then fine-tune the last output layer before softmax on the original, imbalanced data.

2.5.11 Network Architecture

BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition (CVPR 2020) [Paper][Code]

Class-Imbalanced Deep Learning via a Class-Balanced Ensemble (TNNLS 2021) [Paper]

2.5.12 Deep Generative Model

Deep Generative Model for Robust Imbalance Classification (CVPR 2020) [Paper]

2.5.13 Imbalanced Regression

Semi-Supervised Graph Imbalanced Regression (KDD 2023) [Paper] [Code]

RankSim: Ranking Similarity Regularization for Deep Imbalanced Regression (ICML 2022) [Paper] [Code]

Balanced MSE for Imbalanced Visual Regression (CVPR 2022) [Paper] [Code]

Delving into Deep Imbalanced Regression (ICML 2021) [Paper][Code][Video]

Density-based weighting for imbalanced regression (Machine Learning [J], 2021) [Paper][Code]

2.5.14 Data Augmentation

Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition (AAAI 2023) [Paper][Code]

3. Miscellaneous

3.1 Datasets

Common MIL datasets

MIL algorithms are tested on 71 MIL benchmark datasets. This is the largest MIL repository used experimentally for algorithm comparison. Application areas of the datasets are molecular activity prediction, image annotation, text categorization, webpage classification, and audio-recording classification (.mat files of the datasets are provided on miproblems.org).

Each dataset file is a comma-separated values (CSV) file with one row per instance and one column per feature, plus two additional columns. The first of these holds the bag class label, which is propagated to every instance in the bag. The second holds the bag ID, so each instance carries the ID of the bag it belongs to. The remaining columns store the feature values of the instance.

Full table of the datasets: [MIL_datasets] (MIL_datasets.pdf)

Link for the datasets: [Real_world_datasets] (https://www.dropbox.com/home/Real_world_datasets)
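The CSV layout described above can be turned into bags with a few lines of standard-library Python. This is only a sketch: it assumes there is no header row and that the label and bag ID are the first two columns in that order, which should be checked against the actual files. The function name and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def load_mil_csv(handle):
    """Group instance rows into bags.

    Assumed layout (verify against the real files): no header row,
    column 0 = bag label, column 1 = bag ID, remaining columns = features.
    """
    bags = defaultdict(list)   # bag ID -> list of instance feature vectors
    labels = {}                # bag ID -> bag label
    for row in csv.reader(handle):
        if not row:
            continue
        label, bag_id = int(float(row[0])), int(float(row[1]))
        bags[bag_id].append([float(v) for v in row[2:]])
        labels[bag_id] = label
    return dict(bags), labels


# Hypothetical file content: two bags, two features per instance.
sample = io.StringIO(
    "1,1,0.2,0.7\n"
    "1,1,0.9,0.1\n"
    "0,2,0.3,0.3\n"
)
bags, labels = load_mil_csv(sample)
```

For a file on disk, pass `open(path, newline="")` instead of the `io.StringIO` object.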

PASCAL VOC 2007 dataset in MIL format:

The original natural image classification and object detection dataset can be downloaded from the original source page: http://host.robots.ox.ac.uk/pascal/VOC/voc2007/.

This dataset was formulated as a MIL problem by Dr. Melih Kandemir and Manuel Haussmann. The corresponding paper to cite is:

M. Haußmann, F.A. Hamprecht, M. Kandemir, “Variational Bayesian multiple instance learning with Gaussian processes”, CVPR (2017).

Link for the PASCAL VOC 2007 MIL dataset: [pvoc_2007_dataset]

Synthetic datasets:

These datasets are randomly generated under four different MI settings. They can be used to measure how the number of bags, the average number of instances per bag, and the number of features affect MIL algorithms.

Link for the datasets: [Synthetic_datasets]

Pseudo-synthetic datasets:

Based on the Elephant dataset, these datasets vary the number of bags and the number of features to test MIL algorithms.

Link for the datasets: [Pseudo-synthetic_datasets]

Cross Validation Indices

The proposed MIL approaches and the state-of-the-art methods are evaluated with ten-fold cross-validation repeated five times. The randomly generated cross-validation indices for the datasets below can be used to reproduce the experimental results and to make comparisons.

Real world datasets: [CV_indices]
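To reproduce the reported results, the published index files should be used. Purely as an illustration of the protocol, the sketch below generates fold assignments for ten-fold cross-validation repeated five times. It splits at the bag level so that instances from one bag never appear in both training and test folds. The function name, the seed, and the bag count are hypothetical, and nothing is assumed about the layout of the [CV_indices] files.

```python
import random


def repeated_kfold_indices(n_bags, n_folds=10, n_repeats=5, seed=0):
    """Return n_repeats lists, each holding n_folds lists of bag indices."""
    rng = random.Random(seed)
    all_repeats = []
    for _ in range(n_repeats):
        order = list(range(n_bags))
        rng.shuffle(order)
        # Deal shuffled bag indices round-robin into the folds.
        folds = [order[i::n_folds] for i in range(n_folds)]
        all_repeats.append(folds)
    return all_repeats


# Hypothetical dataset with 92 bags.
splits = repeated_kfold_indices(n_bags=92)
test_bags = splits[0][0]  # test fold 0 of repetition 0
train_bags = [b for fold in splits[0][1:] for b in fold]
```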

3.2 Results

Avg. of AUC*	APR	CCE	Citation-KNN	Dmaxmin	Dmeanmin	Dminmin	MILES	miFV	RSIS	Kmeans-enc	Kmeans-2Class-enc	Path-enc	Term-enc	LP-instance	LP-cluster	QP-MIL
Musk1	0.7720	0.8816	0.8712	0.9200	0.9446	0.9312	0.8960	0.8738	0.8767	0.9348	0.9412	0.9558	0.9574	0.9568	0.9680	0.9680
Musk2	0.8064	0.7812	0.8501	0.9560	0.9761	0.9563	0.8227	0.7860	0.8818	0.9671	0.9469	0.9327	0.9279	0.9306	0.9269	0.9450
Mutagenesis1	0.5007	0.8328	0.8268	0.8196	0.8506	0.7655	0.9096	0.9091	0.6649	0.7802	0.8875	0.9355	0.9374	0.8525	0.8674	0.8554
Mutagenesis2	0.4583	0.7267	0.6875	0.5900	0.6467	0.3450	0.8817	0.8667	ㅤ	0.6267	0.6833	0.8350	0.8500	0.7883	0.7850	0.7850
Protein	0.5089	0.6434	0.5516	0.5611	0.5226	0.8756	0.8723	0.8732	ㅤ	0.9528	0.7996	0.9456	0.9070	–	0.8394	0.8510
Elephant	0.7280	0.8535	0.8824	0.8762	0.9360	0.9148	0.8858	0.8818	0.8400	0.9266	0.9136	0.9435	0.9358	0.9488	0.9046	0.9414
Fox	0.5850	0.6225	0.5689	0.4574	0.6124	0.7039	0.6304	0.6546	0.6400	0.7376	0.6748	0.7116	0.7231	0.6862	0.6424	0.6766
Tiger	0.5830	0.8323	0.7521	0.7610	0.8528	0.8498	0.8264	0.8597	0.7930	0.8680	0.8754	0.9095	0.9027	0.9054	0.8930	0.9008
CorelAfrican	0.5795	0.8398	0.8615	0.9386	0.9673	0.9664	0.8363	0.8512	ㅤ	0.9573	0.9436	0.9721	0.9725	0.9452	0.9320	0.9534
CorelAntique	0.5945	0.7216	0.7466	0.8769	0.9223	0.9103	0.7694	0.8006	ㅤ	0.8728	0.9083	0.9335	0.9262	0.8937	0.9001	0.8706
CorelBattleships	0.5689	0.9123	0.8661	0.9723	0.9806	0.9656	0.8763	0.8893	ㅤ	0.9494	0.9288	0.9745	0.9745	0.9333	0.9523	0.9584
CorelBeach	0.5956	0.9694	0.9262	0.9719	0.9831	0.9902	0.9766	0.9804	ㅤ	0.9927	0.9736	0.9915	0.9907	0.9949	0.9882	0.9932
CorelBuses	0.5908	0.9630	0.8475	0.9622	0.9731	0.9805	0.9439	0.9444	ㅤ	0.9753	0.9401	0.9818	0.9799	0.9792	0.9627	0.9598
CorelCars	0.6076	0.8779	0.8353	0.9299	0.9476	0.9164	0.8560	0.8641	ㅤ	0.9185	0.9169	0.9513	0.9475	0.9462	0.9260	0.9124
CorelDesserts	0.5708	0.9220	0.8655	0.9498	0.9740	0.9658	0.9151	0.9433	ㅤ	0.9389	0.9732	0.9748	0.9711	0.9877	0.9594	0.9685
CorelDinosaurs	0.5508	0.8973	0.8020	0.9484	0.9828	0.9819	0.8823	0.9161	ㅤ	0.9753	0.9437	0.9734	0.9691	0.9851	0.9527	0.9638
CorelDogs	0.5697	0.8050	0.7515	0.8936	0.9185	0.9107	0.8381	0.8562	ㅤ	0.8737	0.8644	0.9300	0.9207	0.9237	0.8862	0.8633
CorelElephants	0.6431	0.8931	0.8696	0.9661	0.9825	0.9696	0.8583	0.8841	ㅤ	0.9575	0.9572	0.9740	0.9712	0.9702	0.9644	0.9676
CorelFashion	0.8681	0.9501	0.9094	0.9833	0.9898	0.9912	0.9044	0.9280	ㅤ	0.9902	0.9888	0.9932	0.9924	0.9887	0.9809	0.9861
CorelFlowers	0.6358	0.8438	0.8332	0.9318	0.9470	0.9526	0.8562	0.8897	ㅤ	0.9435	0.9378	0.9724	0.9729	0.9622	0.9377	0.9382
CorelFood	0.8596	0.9848	0.9337	0.9928	0.9977	0.9957	0.9794	0.9838	ㅤ	0.9942	0.9868	0.9958	0.9948	0.9982	0.9833	0.9923
CorelHistorical	0.7731	0.9777	0.9416	0.9844	0.9982	0.9938	0.9845	0.9865	ㅤ	0.9930	0.9852	0.9964	0.9954	0.9983	0.9878	0.9937
CorelHorses	0.6131	0.8091	0.7455	0.8607	0.9196	0.9172	0.7982	0.8154	ㅤ	0.8955	0.8893	0.9051	0.9063	0.9065	0.8928	0.8603
CorelLizards	0.5531	0.9380	0.8973	0.9604	0.9802	0.9711	0.9399	0.9340	ㅤ	0.9700	0.9580	0.9773	0.9763	0.9708	0.9574	0.9730
CorelMountains	0.9812	0.9875	0.9790	0.9976	0.9996	0.9990	0.9880	0.9926	ㅤ	0.9989	0.9990	0.9999	0.9999	0.9988	0.9967	0.9980
CorelSkiing	0.5085	0.8747	0.7700	0.9563	0.9604	0.9528	0.8670	0.8729	ㅤ	0.9472	0.9585	0.9696	0.9648	0.9690	0.9312	0.9599
CorelSunset	0.4951	0.7279	0.7064	0.8033	0.8369	0.7512	0.7039	0.7507	ㅤ	0.7630	0.7711	0.8634	0.8618	0.8044	0.8313	0.7389
CorelWaterfalls	0.5947	0.8360	0.8646	0.9620	0.9751	0.9662	0.8388	0.8957	ㅤ	0.9452	0.9341	0.9759	0.9725	0.9701	0.9542	0.9550
UCSBBreastCancer	0.5692	0.6439	0.7063	0.7253	0.8306	0.7908	0.8228	0.8478	ㅤ	0.8333	0.8678	0.8803	0.8594	0.9300	0.9025	0.8878
Newsgroups1	0.5000	0.7920	0.8032	0.9048	0.9408	0.5000	0.6672	0.6536	0.6200	0.4160	0.9114	0.8612	0.7804	0.4696	0.6680	0.9328
Newsgroups2	0.5080	0.6600	0.6380	0.8952	0.8976	0.5544	0.7180	0.6136	0.7200	0.5280	0.5720	0.6708	0.6168	0.6096	0.5040	0.8312
Newsgroups3	0.5000	0.6220	0.5844	0.8184	0.8104	0.5000	0.6504	0.6600	ㅤ	0.4792	0.6680	0.7456	0.6916	0.4456	0.6336	0.7784
Newsgroups4	0.5080	0.6784	0.6364	0.8216	0.8568	0.4792	0.7040	0.6632	ㅤ	0.6816	0.6948	0.7524	0.6804	0.5304	0.5648	0.8200
Newsgroups5	0.4880	0.6092	0.5852	0.8528	0.8520	0.5592	0.6492	0.6376	ㅤ	0.5888	0.6504	0.7404	0.7300	0.5056	0.6464	0.8120
Newsgroups6	0.5000	0.7416	0.7324	0.8652	0.8904	0.5720	0.6756	0.6504	ㅤ	0.6096	0.8218	0.8616	0.8508	0.5952	0.5776	0.8488
Newsgroups7	0.4960	0.6476	0.6328	0.7520	0.7896	0.5468	0.6452	0.6540	ㅤ	0.5112	0.7259	0.7788	0.7708	0.5352	0.5688	0.8160
Newsgroups8	0.4980	0.6920	0.5704	0.8404	0.8704	0.4600	0.7132	0.6232	ㅤ	0.4976	0.7275	0.8028	0.7732	0.4848	0.4296	0.8184
Newsgroups9	0.5000	0.8064	0.8176	0.3480	0.3260	0.5000	0.6988	0.6160	ㅤ	0.5216	0.8119	0.8912	0.9000	0.6296	0.4384	0.8504
Newsgroups10	0.5000	0.7472	0.8212	0.9184	0.9144	0.4764	0.6244	0.6516	ㅤ	0.5096	0.8636	0.8800	0.8724	0.6432	0.4976	0.8800
Newsgroups11	0.4920	0.7936	0.7140	0.9680	0.9576	0.4600	0.6672	0.6344	ㅤ	0.3744	0.8792	0.9384	0.9292	0.4904	0.4584	0.9232
Newsgroups12	0.5100	0.7556	0.8064	0.8680	0.8396	0.4664	0.6340	0.6416	ㅤ	0.4624	0.8510	0.8016	0.8288	0.5216	0.5552	0.8648
Newsgroups13	0.5000	0.5284	0.6212	0.9320	0.9464	0.5000	0.4684	0.6324	ㅤ	0.4872	0.6160	0.6760	0.6416	0.4584	0.4880	0.9488
Newsgroups14	0.5000	0.7768	0.7800	0.9216	0.9416	0.4648	0.6656	0.6408	ㅤ	0.4976	0.8428	0.8888	0.7980	0.6120	0.4680	0.8880
Newsgroups15	0.5000	0.7900	0.7628	0.8744	0.9052	0.5172	0.6208	0.6440	ㅤ	0.4360	0.8288	0.8984	0.9032	0.4304	0.5160	0.9208
Newsgroups16	0.5000	0.7872	0.7728	0.8792	0.8984	0.5088	0.6404	0.6268	ㅤ	0.4608	0.8486	0.8220	0.7812	0.4160	0.4368	0.8216
Newsgroups17	0.4940	0.7660	0.6920	0.8184	0.8736	0.5312	0.6800	0.6204	ㅤ	0.4984	0.8272	0.7612	0.7400	0.4160	0.5080	0.7960
Newsgroups18	0.5000	0.8248	0.8448	0.8328	0.8736	0.4632	0.6880	0.6656	ㅤ	0.5456	0.8576	0.8820	0.8576	0.5672	0.4904	0.8520
Newsgroups19	0.5020	0.8312	0.7656	0.7848	0.8020	0.5608	0.6420	0.6380	ㅤ	0.5496	0.6717	0.7472	0.7388	0.5152	0.5080	0.8136
Newsgroups20	0.4980	0.7696	0.6464	0.8392	0.8336	0.4432	0.6252	0.6424	ㅤ	0.5600	0.8086	0.7540	0.7056	0.3856	0.6192	0.8128
Web1	0.5470	0.7883	0.6135	0.4107	0.6337	0.7883	0.8263	0.8337	0.7422	0.7323	0.8323	0.7498	0.7603	0.7590	0.6417	0.6553
Web2	0.5223	0.4733	0.4450	0.5013	0.4743	0.5213	0.6997	0.6933	ㅤ	0.5440	0.3705	0.4680	0.4920	0.4630	0.6467	0.5373
Web3	0.6000	0.6032	0.6491	0.5010	0.7083	0.6081	0.7721	0.7668	ㅤ	0.6714	0.7333	0.6994	0.6643	0.6445	0.6219	0.6612
Web4	0.5750	0.8340	0.7070	0.5480	0.7990	0.6287	0.8165	0.8465	ㅤ	0.7433	0.8120	0.7763	0.8243	0.7410	0.6043	0.6273
Web5	0.5398	0.6116	0.5064	0.5066	0.7112	0.7941	0.7313	0.7755	ㅤ	0.7433	0.6869	0.7326	0.7337	0.7319	0.5338	0.5524
Web6	0.5810	0.6212	0.5200	0.4942	0.5247	0.5430	0.7222	0.7268	ㅤ	0.5497	0.6460	0.6647	0.7227	0.5637	0.4173	0.6500
Web7	0.5858	0.5921	0.6748	0.5998	0.6904	0.6700	0.6896	0.6879	ㅤ	0.6246	0.6971	0.7290	0.7048	0.6429	0.4613	0.5454
Web8	0.5517	0.5750	0.4981	0.5785	0.4090	0.3663	0.6521	0.6496	ㅤ	0.5113	0.5370	0.5283	0.5615	0.5067	0.4688	0.5296
Web9	0.5950	0.6308	0.5990	0.4967	0.7354	0.6867	0.7298	0.7413	ㅤ	0.6867	0.6846	0.4594	0.4817	0.4404	0.4546	0.5029
BrownCreeper	0.5920	0.9445	0.8825	0.7292	0.8991	0.9272	0.9890	0.9924	ㅤ	0.9737	0.9882	0.9944	0.9935	0.9943	0.9841	0.9905
Chestnut-backedChickadee	0.5197	0.8267	0.8021	0.8305	0.8530	0.7488	0.8982	0.9099	ㅤ	0.8014	0.9232	0.9434	0.9413	0.9390	0.8880	0.9169
Dark-eyedJunco	0.6829	0.6674	0.5935	0.6952	0.8559	0.8701	0.9380	0.9474	ㅤ	0.8907	0.8815	0.9782	0.9751	0.9543	0.9342	0.9315
HammondsFlycatcher	0.5340	0.9922	0.8843	0.7178	0.9439	0.8826	0.9992	0.9989	ㅤ	0.9394	0.9404	1.0000	1.0000	1.0000	1.0000	0.9995
HermitThrush	0.6681	0.4879	0.5351	0.5696	0.5776	0.8924	0.8077	0.8363	ㅤ	0.6824	0.6619	0.9214	0.9240	0.9392	0.9090	0.9032
HermitWarbler	0.5934	0.8178	0.6973	0.7346	0.7810	0.9261	0.9812	0.9789	ㅤ	0.9039	0.9403	0.9864	0.9868	0.9856	0.9822	0.9844
Olive-sidedFlycatcher	0.6441	0.8781	0.7938	0.8525	0.8964	0.9236	0.9693	0.9626	ㅤ	0.9202	0.9591	0.9766	0.9737	0.9744	0.9616	0.9670
PacificslopeFlycatcher	0.5306	0.8363	0.6992	0.7228	0.7536	0.7691	0.9572	0.9578	ㅤ	0.8481	0.9858	0.9675	0.9644	0.9660	0.9453	0.9428
Red-breastedNuthatch	0.5829	0.8719	0.7708	0.8033	0.8758	0.8770	0.9789	0.9819	ㅤ	0.9073	0.9463	0.9932	0.9940	0.9848	0.9470	0.9710
SwainsonsThrush	0.5188	0.8260	0.6867	0.7818	0.7670	0.8841	0.9682	0.9810	ㅤ	0.8044	0.9140	0.9982	0.9964	0.9876	0.9455	0.9765
VariedThrush	0.5844	0.8686	0.7816	0.7509	0.8397	0.9402	0.9997	0.9998	ㅤ	0.9513	0.9302	1.0000	1.0000	0.9998	0.9964	0.9965
WesternTanager	0.6957	0.8780	0.7551	0.4749	0.8493	0.8239	0.9877	0.9824	ㅤ	0.8938	0.9892	0.9954	0.9945	0.9915	0.9699	0.9729
WinterWren	0.5984	0.9654	0.9068	0.9441	0.9315	0.8537	0.9925	0.9902	ㅤ	0.9461	0.9967	0.9973	0.9975	0.9917	0.9853	0.9876

* AUC values are averaged over ten-fold cross-validation repeated five times.
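As a rough sketch of how an average AUC figure of this kind could be computed, the function below evaluates bag-level AUC from bag scores by pairwise comparison of positive and negative bags, and the result is then averaged over test folds. How exactly the reported numbers were aggregated is not stated here, so the per-fold averaging is an assumption, and the labels and scores are made up.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative bag")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical (bag labels, bag scores) for two test folds.
folds = [
    ([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2]),
    ([1, 1, 0, 0], [0.9, 0.3, 0.5, 0.1]),
]
avg_auc = mean(auc(y, s) for y, s in folds)
```

The pairwise form is quadratic in the number of bags, which is acceptable for benchmark sets of this size; a rank-based formula is preferable for large test sets.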

3.3 Codes

Algorithm	Code
APR	[MIL toolbox]
Citation-KNN	[MIL toolbox]
MILES	[MIL toolbox]
CCE	[CCE_code]
MInD	[MIL toolbox]
RSIS	[RSIS_code]
miFV	[miFV_code]
Bag Encoding	[Python_codes] [R_codes]
LP-MIL	[Python_codes]
QP-MIL	ㅤ