Few-shot event-based action recognition

Neural Netw. 2025 Jun 21;191:107750. doi: 10.1016/j.neunet.2025.107750. Online ahead of print.

Abstract

Event cameras offer clear advantages for practical vision applications (e.g., action recognition) owing to their distinctive sensing mechanism, yet existing event-based action recognition methods rely heavily on large-scale training data. Unfortunately, the high cost of camera deployment and the requirement of data privacy protection make it challenging to collect substantial data in real-world scenarios. To address this limitation, we explore a novel yet practical task, Few-Shot Event-Based Action Recognition (FSEAR), which aims to train a model on only a minimal number of difficult-to-handle event action samples and to accurately classify unlabeled data into specific categories. Accordingly, we design a new framework for FSEAR comprising a Noise-Aware Event Encoder (NAE) and a Distilled Prototypical Distance Fusion (DPDF). The former efficiently filters noise within the spatiotemporal domain while retaining information vital to action timing. The latter performs multi-scale measurements across geometric, directional, and distributional dimensions. The two modules are mutually beneficial and thus effectively exploit the latent characteristics of event data. Extensive experiments on four distinct event action recognition datasets demonstrate the significant advantages of our model over other few-shot learning methods. Our code and models will be publicly released.
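The abstract gives no implementation details for DPDF, so the following minimal sketch only illustrates the general idea of fusing distances to class prototypes along geometric (Euclidean), directional (cosine), and distributional (KL-divergence) axes, as in standard prototypical few-shot classification. The function name, the fixed fusion weights, and the specific distance terms are assumptions for illustration, not the authors' actual DPDF module.

    import torch
    import torch.nn.functional as F

    def fused_prototype_distance(query, prototypes, w=(1.0, 1.0, 1.0)):
        """Hypothetical fusion of three distances between a query embedding
        (shape (D,)) and class prototypes (shape (C, D)). The weights `w`
        and the choice of terms are illustrative assumptions only."""
        q = query.unsqueeze(0)                             # (1, D)
        # Geometric term: Euclidean distance to each prototype, (C,)
        d_geo = torch.cdist(q, prototypes).squeeze(0)
        # Directional term: 1 - cosine similarity, (C,)
        d_dir = 1.0 - F.cosine_similarity(q, prototypes)
        # Distributional term: KL(prototype_dist || query_dist) between
        # softmax-normalized feature vectors, (C,)
        log_q = F.log_softmax(q, dim=-1).expand_as(prototypes)
        p = F.softmax(prototypes, dim=-1)
        d_dist = F.kl_div(log_q, p, reduction='none').sum(-1)
        return w[0] * d_geo + w[1] * d_dir + w[2] * d_dist

    # Toy usage: a 5-way episode with 64-dim embeddings; the query is
    # assigned to the class whose prototype has the smallest fused distance.
    protos = torch.randn(5, 64)   # class prototypes from the support set
    q = torch.randn(64)           # query embedding
    pred = fused_prototype_distance(q, protos).argmin().item()

In practice the fusion weights would likely be learned rather than fixed, and the paper's "distilled" component presumably adds further structure; this sketch covers only the multi-distance measurement idea named in the abstract.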

Keywords: Action recognition; Event camera; Few-shot learning.