We propose SARMA (Saliency Analysis with a Residual Mesh-based Architecture), a graph neural network designed to predict visual saliency on 3D surface meshes. Unlike traditional methods that rely on handcrafted geometric features, SARMA learns saliency patterns end to end using residual Feature-Steered Graph Convolution (FeaStConv) layers. The network takes a mesh as input, represented as a graph of vertices and edges, and outputs one saliency value per vertex. To capture perceptually relevant geometry, we enrich the input features with discrete mean curvature alongside the 3D vertex coordinates. The model consists of three FeaStConv layers, each followed by a residual connection that stabilizes training and mitigates oversmoothing. We evaluate SARMA on the Schelling dataset of 3D models annotated with human saliency points, using the Pearson linear correlation coefficient (PLCC) and the area under the ROC curve (AUC) as evaluation metrics. Our approach outperforms prior handcrafted and deep-learning-based methods in terms of correlation with ground-truth saliency, demonstrating the effectiveness of residual graph-based architectures for perceptual analysis of 3D shapes.
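As an illustration of the PLCC metric named above, the following minimal sketch computes the Pearson linear correlation coefficient between a predicted and a ground-truth per-vertex saliency map. It is not the authors' evaluation code: the function name `plcc` and the five-vertex sample values are hypothetical, chosen only to show the computation.

```python
import math


def plcc(predicted, ground_truth):
    """Pearson linear correlation between two per-vertex saliency lists."""
    if len(predicted) != len(ground_truth):
        raise ValueError("both maps must have one value per vertex")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two vertices")

    mean_p = sum(predicted) / n
    mean_g = sum(ground_truth) / n

    # Unnormalized covariance and variances; the 1/n factors cancel in the ratio.
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, ground_truth))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in ground_truth)

    if var_p == 0 or var_g == 0:
        raise ValueError("PLCC is undefined for a constant saliency map")
    return cov / math.sqrt(var_p * var_g)


# Hypothetical saliency values for a five-vertex mesh (illustration only).
predicted = [0.1, 0.4, 0.35, 0.8, 0.9]
ground_truth = [0.0, 0.5, 0.3, 0.7, 1.0]
score = plcc(predicted, ground_truth)
```

A score close to 1 means the predicted saliency varies linearly with the ground truth across vertices. A score near 0 means there is no linear relationship.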
@inproceedings {cin-Lezoray-2025-2,
year = {2025},
title = {Saliency Prediction on 3D Meshes Using Residual FeaStConv-Based Graph Neural Networks},
publisher = {IEEE},
booktitle = {Visual Communications and Image Processing (VCIP)},
author = {O. Lézoray and Z. Ibork and A. Nouri and C. Charrier} }