Conference paper

Cross-task weakly supervised learning from instructional videos

Abstract: In this paper, we investigate learning visual models for the steps of ordinary tasks using weak supervision via instructional narrations and an ordered list of steps, instead of strong supervision via temporal annotations. At the heart of our approach is the observation that weakly supervised learning may be easier if a model shares components while learning different steps: "pour egg" should be trained jointly with other tasks involving "pour" and "egg". We formalize this in a component model for recognizing steps and a weakly supervised learning framework that can learn this model under temporal constraints from narration and the list of steps. Existing datasets do not permit a systematic study of sharing, so we also gather a new dataset, CrossTask, aimed at assessing cross-task sharing. Our experiments demonstrate that sharing across tasks improves performance, especially when done at the component level, and that our component model can parse previously unseen tasks by virtue of its compositionality.
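
The component-sharing idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: splitting a step name into words, summing per-component linear scores, and all of the weights and feature values below are assumptions chosen only to show how steps such as "pour egg" and "pour milk" can share a "pour" component, and how an unseen step can be scored from components learned elsewhere.

```python
from collections import defaultdict


def components_of(step):
    """Split a step name such as 'pour egg' into its component words (assumed decomposition)."""
    return step.lower().split()


def build_sharing_index(steps):
    """Map each component to the steps that contain it, showing what is shared."""
    index = defaultdict(list)
    for step in steps:
        for comp in components_of(step):
            index[comp].append(step)
    return dict(index)


def step_score(step, feature, component_weights):
    """Score a feature vector for a step as the sum of its components' linear scores.

    Summation is an assumed combination rule used for illustration only.
    """
    total = 0.0
    for comp in components_of(step):
        weights = component_weights[comp]
        total += sum(w * x for w, x in zip(weights, feature))
    return total


# Hypothetical steps drawn from different tasks.
steps = ["pour egg", "pour milk", "whisk egg"]
sharing = build_sharing_index(steps)

# Hand-picked toy weights, one vector per component (not learned from data).
component_weights = {
    "pour": [1.0, 0.0],
    "egg": [0.0, 1.0],
    "milk": [0.5, -1.0],
    "whisk": [-1.0, 0.5],
}

# A hypothetical two-dimensional feature for one video segment.
feature = [2.0, 3.0]

# "whisk milk" never appears in `steps`, yet it can be scored from shared components.
unseen_score = step_score("whisk milk", feature, component_weights)
```

In this sketch, `sharing["pour"]` lists both "pour egg" and "pour milk", so any update to the "pour" weights would affect both steps, which is the sense in which supervision is shared across tasks.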

Contributor: Dimitri Zhukov
Submitted on: Friday, January 10, 2020 - 12:44:21
Last modified on: Thursday, January 30, 2020 - 07:43:34
Document(s) archived on: Saturday, April 11, 2020 - 15:52:40


Files produced by the author(s)


  • HAL Id: hal-02434806, version 1



Dimitri Zhukov, Jean-Baptiste Alayrac, Ramazan Cinbis, David Fouhey, Ivan Laptev, et al. Cross-task weakly supervised learning from instructional videos. CVPR 2019 - IEEE Conference on Computer Vision and Pattern Recognition, Jun 2019, Long Beach, CA, United States. ⟨hal-02434806⟩