TransAnomaly: Video Anomaly Detection Using Video Vision Transformer
by: Hongchun Yuan, Zhenyu Cai, Hui Zhou, Yue Wang, Xiangzhi Chen
Format: | Article |
---|---|
Published: | IEEE 2021-01-01 |
Description
Video anomaly detection is challenging because abnormal events in real scenes are unbounded, rare, equivocal, and irregular. In recent years, transformers have demonstrated powerful modelling abilities for sequence data, so we apply them to video anomaly detection. In this paper, we propose a prediction-based video anomaly detection approach named TransAnomaly. Our model combines the U-Net and the Video Vision Transformer (ViViT) to capture richer temporal information and more global context. To make full use of the ViViT, we modify it so that it is capable of video prediction. Experiments on benchmark datasets show that adding the transformer module improves anomaly detection performance. In addition, we calculate regularity scores with sliding windows and evaluate the impact of different window sizes and strides. With proper settings, our model outperforms other state-of-the-art prediction-based video anomaly detection approaches. Furthermore, our model can perform anomaly localization by tracking the location of patches with lower regularity scores.
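The description does not state how the regularity scores are computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the prediction error is averaged inside square windows slid over the frame, that the worst window determines a PSNR value for the frame, and that PSNR values are min-max normalized over a video to give regularity scores. The function names, the PSNR formulation, and the parameters `window`, `stride`, and `max_val` are all illustrative assumptions.

```python
import numpy as np


def window_errors(pred, target, window, stride):
    """Mean squared error inside each window x window patch (assumed scheme)."""
    err = (pred.astype(np.float64) - target.astype(np.float64)) ** 2
    if err.ndim == 3:  # average colour channels into a single error map
        err = err.mean(axis=-1)
    h, w = err.shape
    rows = range(0, h - window + 1, stride)
    cols = range(0, w - window + 1, stride)
    return np.array(
        [[err[r:r + window, c:c + window].mean() for c in cols] for r in rows]
    )


def frame_psnr(pred, target, window, stride, max_val=1.0):
    """PSNR of the worst-predicted window, plus that window's top-left corner."""
    errs = window_errors(pred, target, window, stride)
    i, j = np.unravel_index(errs.argmax(), errs.shape)
    worst = max(float(errs[i, j]), 1e-12)  # guard against log of zero
    psnr = 10.0 * np.log10(max_val ** 2 / worst)
    return psnr, (int(i) * stride, int(j) * stride)


def regularity_scores(psnrs):
    """Min-max normalise per-frame PSNR values to [0, 1]; low means anomalous."""
    psnrs = np.asarray(psnrs, dtype=np.float64)
    span = psnrs.max() - psnrs.min()
    if span == 0:
        return np.ones_like(psnrs)
    return (psnrs - psnrs.min()) / span


# Hypothetical usage: one perfectly predicted frame, one with a local error.
target = np.zeros((32, 32))
good_pred = np.zeros((32, 32))
bad_pred = np.zeros((32, 32))
bad_pred[16:24, 8:16] = 0.5  # poorly predicted region

psnr_good, _ = frame_psnr(good_pred, target, window=8, stride=8)
psnr_bad, location = frame_psnr(bad_pred, target, window=8, stride=8)
scores = regularity_scores([psnr_good, psnr_bad])
```

Under these assumptions, the returned window corner gives a coarse localization of the anomaly, and changing `window` and `stride` trades spatial precision against sensitivity to small, local prediction errors.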