MULTI-VIEW DYNAMIC FACIAL ACTION UNIT DETECTION

A. ROMERO, J. LEÓN AND P. ARBELÁEZ

IMAGE AND VISION COMPUTING

Abstract

We propose a novel convolutional neural network approach to address the fine-grained recognition problem of multi-view dynamic facial action unit detection. We leverage recent gains in large-scale object recognition by formulating the task of predicting the presence or absence of a specific action unit in a still image of a human face as holistic classification. We explore the design space of our approach by considering both shared and independent representations for separate action units, as well as different CNN architectures for combining color and motion information. We then extend our approach to the novel setup of the FERA 2017 Challenge, proposing a multi-view system that first predicts the viewpoint from which the video was taken and then evaluates an ensemble of action unit detectors trained for that specific viewpoint. Our approach is holistic, efficient, and modular, since new action units can easily be added to the overall system. It significantly outperforms the baseline of the FERA 2017 Challenge, with an absolute improvement of 14% on the F1 metric, and compares favorably against the winner of the challenge.

Figure 1. Overview of AUNets. Our system takes as input a video of a human head and computes its optical flow field. It predicts the viewpoint from which the video was taken, and uses this information to select and evaluate an ensemble of holistic action unit detectors that were trained for that specific view. Final AUNets predictions are then temporally smoothed.
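The pipeline in Figure 1 can be sketched as a two-stage procedure: pick the viewpoint, run that view's ensemble of per-AU detectors on color and flow inputs, then smooth the per-frame probabilities over time. The sketch below is illustrative only, under stated assumptions: the detector and viewpoint-classifier internals are random stand-ins, and names such as `predict_view`, `au_detector`, and `aunets_predict` are hypothetical, not the authors' actual API.

```python
import numpy as np

N_VIEWS = 9  # the FERA 2017 setup provides nine camera viewpoints
AUS = [1, 2, 4, 6, 7, 10, 12, 14, 15, 17, 23, 24]  # AUs in BP4D


def predict_view(frames):
    """Stand-in viewpoint classifier: buckets a dummy video statistic."""
    score = frames.mean()  # placeholder for a learned classifier
    return int(np.digitize(score, np.linspace(0, 1, N_VIEWS - 1)))


def au_detector(view, au, frame, flow):
    """Stand-in per-view, per-AU holistic detector returning a probability."""
    rng = np.random.default_rng(hash((view, au)) % 2**32)
    w = rng.standard_normal(frame.size + flow.size)
    x = np.concatenate([frame.ravel(), flow.ravel()])
    return 1.0 / (1.0 + np.exp(-w @ x / x.size))  # sigmoid score


def smooth(probs, k=5):
    """Temporal smoothing: centered moving average over the video."""
    return np.convolve(probs, np.ones(k) / k, mode="same")


def aunets_predict(frames, flows, threshold=0.5):
    view = predict_view(frames)               # 1) predict the viewpoint
    out = {}
    for au in AUS:                            # 2) run that view's ensemble
        probs = np.array([au_detector(view, au, f, o)
                          for f, o in zip(frames, flows)])
        out[au] = smooth(probs) > threshold   # 3) smooth, then threshold
    return view, out
```

The modularity claimed in the abstract shows up directly here: adding a new action unit only means training one more detector per view and appending its index to `AUS`.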

Figure 2. Different arrangements for incorporating the optical flow.
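The different color/motion arrangements explored in the paper boil down to where the optical flow enters the network. A minimal sketch of three common options, assuming HxWxC arrays (this is illustrative, not the paper's code): motion-only input, early fusion by channel concatenation, and late fusion of two streams' logits.

```python
import numpy as np


def flow_only(rgb, flow):
    """Motion replaces appearance: the network sees only the flow field."""
    return flow


def early_fusion(rgb, flow):
    """Channel concatenation: one network sees an H x W x (3 + 2) input."""
    return np.concatenate([rgb, flow], axis=-1)


def late_fusion(rgb_logits, flow_logits, alpha=0.5):
    """Two-stream: separate color and motion networks, outputs blended."""
    return alpha * rgb_logits + (1 - alpha) * flow_logits
```

Early fusion changes only the first convolutional layer's input channels, while late fusion doubles the network cost but lets each stream specialize.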


Results


Table 1. Comparison with state-of-the-art methods on the BP4D dataset.
Table 2. Comparison with the baseline [86] and the official winner [82] of the FERA 2017 Challenge.
Figure 3. Zeiler’s method [49] for network visualization. The top row presents 5 different Action Units [7]. The heat maps highlight the most important regions in the human face for each specific Action Unit (blue: less important, red: more important).
AU    1     2     4     6     7    10    12    14    15    17    23    24   Av.
F1  53.4  44.7  55.8  79.2  78.1  83.1  88.4  66.6  47.5  62.0  47.3  49.7  63.0
Table 3. Quantitative results of our method.
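As a sanity check, the Av. column in Table 3 matches the unweighted (macro) average of the twelve per-AU F1 scores:

```python
# Per-AU F1 scores from Table 3 (AU number -> F1).
f1 = {1: 53.4, 2: 44.7, 4: 55.8, 6: 79.2, 7: 78.1, 10: 83.1,
      12: 88.4, 14: 66.6, 15: 47.5, 17: 62.0, 23: 47.3, 24: 49.7}

avg = sum(f1.values()) / len(f1)  # macro average over the 12 AUs
```

Rounding `avg` to one decimal gives 63.0, the value reported in the table.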
