Motivations and rationale
Feature extraction for visual recognition has undergone major changes over the past decade. Convolutional Neural Networks (CNNs) first, and more recently foundation models, have progressively replaced the traditional approach of hand-crafted feature engineering. It is widely accepted that these models, when properly trained, can achieve better accuracy than hand-crafted methods; however, this comes at the cost of little or no interpretability of the visual features they generate.
Explainability methods are being investigated in an attempt to close the performance-interpretability gap, but their development proceeds at a much slower pace than that of task-oriented models. The resulting lack of explainability (the ‘black-box’ approach) raises important concerns in terms of:
All the above issues represent major obstacles in some applications, such as medical image analysis. More generally, it is worth asking whether pushing performance is worth the risk of creating overcomplicated, difficult-to-interpret and possibly unstable solutions and technologies.
Aims & topics
The aim of this special session is to provide a forum for discussing theoretical aspects and practical implications of the performance vs. interpretability dilemma in visual recognition. We welcome contributions in the form of research articles, reviews, position papers and comparative evaluations. Topics of interest include, but are not limited to:
The conference proceedings will be published in Elsevier's Procedia Computer Science open access journal, available on ScienceDirect and submitted to be indexed/abstracted in CPCI (Conference Proceedings Citation Index, part of Web of Science), Engineering Index, and Scopus.