Manik Varma, Microsoft Research, India
Extreme Classification: A New Paradigm for Ranking & Recommendation
Abstract: The objective in extreme multi-label classification is to learn a classifier that can automatically tag a data point with the most relevant subset of labels from a large label set. Extreme multi-label classification is an important research problem since not only does it enable the tackling of applications with many labels but it also allows the reformulation of ranking and recommendation problems with certain advantages over existing formulations. Our objective, in this talk, is to develop an extreme multi-label classifier that is faster to train and more accurate at prediction than the state-of-the-art Multi-label Random Forest (MLRF) algorithm [Agrawal et al. WWW 13] and the Label Partitioning for Sub-linear Ranking (LPSR) algorithm [Weston et al. ICML 13]. MLRF and LPSR learn a hierarchy to deal with the large number of labels but optimize task independent measures, such as the Gini index or clustering error, in order to learn the hierarchy. Our proposed FastXML algorithm achieves significantly higher accuracies by directly optimizing an nDCG based ranking loss function. We also develop an alternating minimization algorithm for efficiently optimizing the proposed formulation. Experiments reveal that FastXML can be trained on problems with more than a million labels on a standard desktop in eight hours using a single core and in an hour using multiple cores.
Anna Choromanska, New York University, USA Logarithmic Time Online Multiclass prediction
Abstract: The talk will focus on the multi-class setting with an extremely large number of classes (k), with the goal of obtaining train and test time complexity logarithmic in the number of classes. A reduction of this problem to a set of binary classification problems organized in a tree structure will be discussed. A top-down tree construction approach for constructing logarithmic depth trees will be demonstrated. On the theoretical front, a new objective function will be proposed, which is optimized at each node of the tree and creates dynamic partitions of the data which are both pure (in terms of class labels) and balanced. It will demonstrated that under favorable conditions, the new approach leads to logarithmic depth trees that have leaves with low label entropy. However, the objective function at the nodes is challenging to optimize computationally. The empirical problem is addressed with a new online decision tree construction procedure. Experiments demonstrate that this online algorithm quickly achieves improvement in test error compared to more common logarithmic training time approaches, which makes it a plausible method in computationally constrained large-k applications.
Nicolas Usunier, Facebook Learning embeddings in multi-relational domains
Abstract: One of the challenges in multi-target prediction is that large number of potential targets often come together with fewer representative samples per target and a supervision of lower quality. In that context, algorithms for learning embeddings of inputs and targets in a common low-dimensional vector space have shown great success in multiclass or multilabel prediction and ranking, recommendation, and link prediction in multi-relational data. In this talk, I will present applications of learning embeddings to link prediction and question-answering in knowledge bases. Knowledge bases such as Freebase are large scale repositories of facts about real-world entities in the form of relation triples, that are collaboratively curated. I will show how embeddings models can learn from knowledge bases despite the incomplete and noisy nature of the data, and also how such models can naturally leverage multiple training sources and offer opportunities for transfer learning.