Machine learning should benefit the whole world, especially developing countries in Africa and Asia. As datasets grow, obtaining clean supervision becomes laborious and expensive, especially for developing countries. As a result, the volume of noisy supervision becomes enormous, e.g., web-scale image and speech data with noisy labels. However, standard machine learning assumes that the supervised information is fully clean and intact. Noisy data therefore harms the performance of most standard learning algorithms, and sometimes even makes existing algorithms break down. A number of theories and approaches have been proposed to deal with noisy data. As far as we know, label-noise learning spans two important ages in machine learning: statistical learning (i.e., shallow learning) and deep learning. In the age of statistical learning, label-noise learning focused on designing noise-tolerant losses or unbiased risk estimators. In the age of deep learning, by contrast, label-noise learning has more options for combating noisy labels, such as designing biased risk estimators or leveraging the memorization effects of deep networks. In this tutorial, we summarize the foundations and go through the most recent noisy-supervision-tolerant techniques. By participating in the tutorial, the audience will gain broad knowledge of label-noise learning from the viewpoints of statistical learning theory and deep learning, together with detailed analysis of typical algorithms and frameworks and their real-world applications in industry.
The following schedule is in .
|Title|Speaker|
|---|---|
|Overview of Learning with Noisy Supervision|Masashi Sugiyama|
|Statistical Learning with Noisy Supervision|Tongliang Liu|
|Deep Learning with Noisy Supervision|Bo Han|
|Automated Learning from Noisy Supervision|Quanming Yao|
|Beyond Class-Conditional Noise|Gang Niu|
Part 1: Overview of Learning with Noisy Supervision
Part 2: Statistical Learning with Noisy Supervision
Part 3: Deep Learning with Noisy Supervision
Part 4: Automated Learning from Noisy Supervision
Part 5: Beyond Class-Conditional Noise
Bo Han, Hong Kong Baptist University, Hong Kong SAR, China.
Tongliang Liu, The University of Sydney, Australia.
Quanming Yao, Tsinghua University, China.
Gang Niu, RIKEN, Japan.
Masashi Sugiyama, RIKEN / University of Tokyo, Japan.
Due to space limitations, we list only closely related papers. The full reference list can be found here.
B. Han, Q. Yao, T. Liu, G. Niu, I. W. Tsang, J. T. Kwok, and M. Sugiyama. A Survey of Label-noise Representation Learning: Past, Present and Future. arXiv preprint arXiv:2011.04406, 2020.
N. Natarajan, I. S. Dhillon, P. K. Ravikumar, and A. Tewari. Learning with Noisy Labels. In NeurIPS, 2013.
T. Liu and D. Tao. Classification with Noisy Labels by Importance Reweighting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(3): 447-461, 2015.
G. Patrini, A. Rozza, A. K. Menon, R. Nock, and L. Qu. Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach. In CVPR, 2017.
L. Jiang, Z. Zhou, T. Leung, L.-J. Li, and L. Fei-Fei. MentorNet: Learning Data-driven Curriculum for Very Deep Neural Networks on Corrupted Labels. In ICML, 2018.
B. Han, Q. Yao, X. Yu, G. Niu, M. Xu, W. Hu, I. W. Tsang, and M. Sugiyama. Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels. In NeurIPS, 2018.
B. Han, J. Yao, G. Niu, M. Zhou, I. W. Tsang, Y. Zhang, and M. Sugiyama. Masking: A New Perspective of Noisy Supervision. In NeurIPS, 2018.
Q. Yao and M. Wang. Taking Human out of Learning Applications: A Survey on Automated Machine Learning. arXiv preprint arXiv:1810.13306, 2018.
X. Yu, B. Han, J. Yao, G. Niu, I. W. Tsang, and M. Sugiyama. How does Disagreement Help Generalization against Label Corruption? In ICML, 2019.
X. Xia, T. Liu, N. Wang, B. Han, C. Gong, G. Niu, and M. Sugiyama. Are Anchor Points Really Indispensable in Label-Noise Learning? In NeurIPS, 2019.
Y. Yao, T. Liu, B. Han, M. Gong, J. Deng, G. Niu, and M. Sugiyama. Dual T: Reducing Estimation Error for Transition Matrix in Label-noise Learning. In NeurIPS, 2020.
X. Xia, T. Liu, B. Han, N. Wang, M. Gong, H. Liu, G. Niu, D. Tao, and M. Sugiyama. Part-dependent Label Noise: Towards Instance-dependent Label Noise. In NeurIPS, 2020.
J. Cheng, T. Liu, K. Rao, and D. Tao. Learning with Bounded Instance- and Label-dependent Label Noise. In ICML, 2020.
B. Han, G. Niu, X. Yu, Q. Yao, X. Miao, I. W. Tsang, and M. Sugiyama. SIGUA: Forgetting May Make Learning with Noisy Labels More Robust. In ICML, 2020.
Y. Zhang, Q. Yao, and L. Chen. Interstellar: Searching Recurrent Architecture for Knowledge Graph Embedding. In NeurIPS, 2020.
Q. Yao, H. Yang, B. Han, G. Niu, and J. T. Kwok. Searching to Exploit Memorization Effect in Learning from Noisy Labels. In ICML, 2020.
A. K. Menon, A. S. Rawat, S. J. Reddi, and S. Kumar. Can Gradient Clipping Mitigate Label Noise? In ICLR, 2020.
Q. Yao, J. Xu, W. Tu, and Z. Zhu. Efficient Neural Architecture Search via Proximal Iterations. In AAAI, 2020.
H. Cheng, Z. Zhu, X. Li, Y. Gong, X. Sun, and Y. Liu. Learning with Instance-dependent Label Noise: A Sample Sieve Approach. In ICLR, 2021.