Saliency Prediction on 3D Meshes Using Residual FeaStConv-Based Graph Neural Networks

Abstract

We propose SARMA (Saliency Analysis with a Residual Mesh-based Architecture), a graph neural network designed to predict visual saliency on 3D surface meshes. Unlike traditional methods that rely on handcrafted geometric features, SARMA learns saliency patterns in an end-to-end fashion using residual Feature-Steered Graph Convolution (FeaStConv) layers. The network takes a mesh as input, represented as a graph of vertices and edges, and outputs per-vertex saliency values. To capture perceptually relevant geometry, we enrich the input features with discrete mean curvature alongside 3D coordinates. The model consists of three FeaStConv layers, each followed by a residual connection that stabilizes training and mitigates oversmoothing. We evaluate SARMA on the Schelling dataset of 3D models annotated with human saliency points, using PLCC and AUC as evaluation metrics. Our approach outperforms prior handcrafted and deep learning-based methods in terms of correlation with ground truth saliency, demonstrating the effectiveness of residual graph-based architectures for perceptual analysis of 3D shapes.

Principle
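The abstract describes the pipeline as per-vertex features (3D coordinates plus discrete mean curvature) passed through three FeaStConv layers, each followed by a residual connection, to produce one saliency value per vertex. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. The input projection, ReLU activation, sigmoid readout, hidden width, number of heads, self-loops, and random untrained weights are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random


def feast_conv(x, nbrs, W, u, c):
    """FeaSt-style convolution: each neighbor j of vertex i is softly assigned
    to the weight matrices W[h] with q_h = softmax_h(u[h] . (x_j - x_i) + c[h]),
    and the weighted messages are averaged over the neighborhood."""
    out_dim = len(W[0])
    out = []
    for i, xi in enumerate(x):
        acc = [0.0] * out_dim
        for j in nbrs[i]:
            xj = x[j]
            diff = [a - b for a, b in zip(xj, xi)]
            logits = [sum(p * d for p, d in zip(u_h, diff)) + c_h
                      for u_h, c_h in zip(u, c)]
            top = max(logits)  # subtract the max for a numerically stable softmax
            exps = [math.exp(l - top) for l in logits]
            total = sum(exps)
            for h, W_h in enumerate(W):
                q = exps[h] / total
                for r, row in enumerate(W_h):
                    acc[r] += q * sum(w * v for w, v in zip(row, xj))
        out.append([a / len(nbrs[i]) for a in acc])
    return out


def residual_block(x, nbrs, params):
    """One convolution followed by a skip connection: ReLU(conv(x)) + x.
    The ReLU is an assumption; the source only mentions the residual link."""
    y = feast_conv(x, nbrs, *params)
    return [[max(v, 0.0) + s for v, s in zip(yi, xi)] for yi, xi in zip(y, x)]


def linear(x, M):
    """Apply the same matrix M (out_dim x in_dim) to every vertex feature."""
    return [[sum(w * v for w, v in zip(row, xi)) for row in M] for xi in x]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def init_block(dim, heads, rng):
    W = [rand_matrix(dim, dim, rng) for _ in range(heads)]
    u = rand_matrix(heads, dim, rng)
    c = [0.0] * heads
    return W, u, c


def predict_saliency(features, nbrs, hidden=8, heads=2, seed=0):
    """Assumed layout: linear lift -> 3 residual blocks -> sigmoid readout."""
    rng = random.Random(seed)
    h = linear(features, rand_matrix(hidden, len(features[0]), rng))
    for _ in range(3):  # three residual FeaSt-style layers, as in the abstract
        h = residual_block(h, nbrs, init_block(hidden, heads, rng))
    scores = linear(h, rand_matrix(1, hidden, rng))
    return [1.0 / (1.0 + math.exp(-s[0])) for s in scores]


# Toy tetrahedron: each row is (x, y, z, mean curvature); curvature values are made up.
features = [
    [0.0, 0.0, 0.0, 0.5],
    [1.0, 0.0, 0.0, 0.2],
    [0.0, 1.0, 0.0, 0.1],
    [0.0, 0.0, 1.0, 0.9],
]
# Every vertex is adjacent to all others; self-loops are included (an assumption).
nbrs = [[0, 1, 2, 3] for _ in range(4)]
saliency = predict_saliency(features, nbrs)
print(saliency)
```

Because the weights are random and untrained, the printed values carry no perceptual meaning. The sketch only shows how the skip connection carries each vertex's features past the neighborhood averaging, which is the mechanism the abstract credits with mitigating oversmoothing.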

Comparison

Comparison. Saliency predicted by our model (right) alongside the ground truth of the Schelling dataset (left).
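The abstract names PLCC and AUC as the metrics for this comparison. As a rough illustration of what those scores measure, the sketch below computes a Pearson linear correlation between predicted and ground-truth per-vertex saliency, and a rank-based AUC that treats ground-truth salient vertices as positive labels. The exact AUC protocol used in the paper is not stated on this page, so the binary-label formulation, the function names, and the toy values are all assumptions.

```python
import math


def plcc(pred, truth):
    """Pearson linear correlation coefficient between two per-vertex maps."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    return cov / math.sqrt(var_p * var_t)


def auc(pred, labels):
    """Rank-based AUC: probability that a salient vertex scores higher than a
    non-salient one, counting ties as half (assumed binary ground truth)."""
    pos = [p for p, l in zip(pred, labels) if l]
    neg = [p for p, l in zip(pred, labels) if not l]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up values for four vertices, for illustration only.
predicted = [0.9, 0.8, 0.3, 0.1]
ground_truth = [1.0, 0.7, 0.2, 0.0]   # continuous ground-truth saliency
salient = [1, 1, 0, 0]                # binarized ground truth (assumed)

print("PLCC:", plcc(predicted, ground_truth))
print("AUC:", auc(predicted, salient))
```

PLCC rewards predictions that vary linearly with the ground truth, while AUC only depends on how well the prediction ranks salient vertices above the rest, so the two can disagree on the same mesh.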

Interactive Meshes with Predicted Saliency

BibTeX

@inproceedings{cin-Lezoray-2025-2,
  year = {2025},
  title = {Saliency Prediction on 3D Meshes Using Residual FeaStConv-Based Graph Neural Networks},
  editor = {IEEE},
  booktitle = {Visual Communications and Image Processing (VCIP)},
  author = {O. Lézoray and Z. Ibork and A. Nouri and C. Charrier}
}