This line of research focuses on the intersection of computer vision and natural language understanding. In particular, we study tasks that require visual input, in the form of images or video, together with linguistic input, in the form of text or audio. We aim to design novel methods and algorithms that jointly process these dissimilar types of information.

Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries

E Margffoy-Tuay, JC Pérez, E Botero, P Arbeláez

To appear at ECCV 2018
