On Missing Labels, Long-Tails, and Propensities in Extreme Multi-Label Classification

June 8, 2022

Abstract

The propensity model introduced by Jain et al. (2016) has become a standard approach for dealing with missing and long-tail labels in extreme multi-label classification (XMLC). In this paper, we critically revise this approach, showing that despite its theoretical soundness, its application in contemporary XMLC works is debatable. We exhaustively discuss the flaws of the propensity-based approach and present several recipes, some of them related to solutions used in search engines and recommender systems, that we believe constitute promising alternatives to be followed in XMLC.

Publication Type

Paper

Conference / Journal Name

KDD 2022

Authors

Erik Schultheis

Marek Wydmuch

Rohit Babbar

Krzysztof Dembczynski

BibTeX


@inproceedings{schultheis2022missing,
    author = {Schultheis, Erik and Wydmuch, Marek and Babbar, Rohit and Dembczynski, Krzysztof},
    title = {On Missing Labels, Long-Tails, and Propensities in Extreme Multi-Label Classification},
    booktitle = {Proceedings of KDD 2022},
    year = {2022}
}