Title Vision-based Multi-task Hybrid Model for Teacher-Student Behavior Recognition in Classroom Environment
Authors Huan Zhou; Wenrui Zhu
DOI https://doi.org/10.5573/IEIESPC.2024.13.6.587
Page pp.587-597
ISSN 2287-5255
Keywords Classroom behavior; Dual-stream framework; Multi-task hybrid model; Multi-mode learning; Spatio-temporal graph convolutional network
Abstract The concentration of teachers and students during teaching is an essential indicator for evaluating teaching quality. Many studies assess students' learning interest by identifying their classroom behaviors but ignore the influence of teachers' behavior on students' behavior. We therefore collect classroom video data from both teacher and student perspectives to analyse the interplay between their behaviors. Considering the particularities of data collection in classroom environments, we design a vision-based multi-task hybrid model for multi-mode data (RGB, optical flow, and skeleton data). The model is divided into two parts. In the first part, the RGB and optical flow data are fed into a spatio-temporal dual-stream framework for real-time action localization of the teacher. This framework includes a 2D-CNN branch that extracts spatial information and a Vision Transformer (ViT) branch that extracts temporal information. In the second part, skeleton data is obtained through a pose estimation method, and we propose a multi-level stacked spatio-temporal graph convolutional network (MSSTGCN) for skeleton-based student behavior recognition. This network processes the multi-order semantic information of the skeleton data and fuses features at different scales through a Non-local block.
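
To make the fusion step mentioned in the abstract more concrete, the following is a minimal sketch of a generic non-local operation, in which every feature vector is updated with a similarity-weighted sum over all feature vectors. It is not the authors' MSSTGCN implementation: the function name `non_local_fuse`, the plain dot-product similarity, and the omission of learned projection layers are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def non_local_fuse(features):
    """Hypothetical helper: fuse equal-length feature vectors non-locally.

    `features` is a list of float vectors, e.g. one per scale or per joint.
    Learned projections used in real non-local blocks are omitted here.
    """
    fused = []
    for query in features:
        # Dot-product similarity between this vector and every vector.
        scores = [sum(q * k for q, k in zip(query, key)) for key in features]
        weights = softmax(scores)
        # Similarity-weighted sum over all vectors (the non-local aggregation).
        context = [
            sum(w * feat[d] for w, feat in zip(weights, features))
            for d in range(len(query))
        ]
        # Residual connection keeps the original feature in the output.
        fused.append([q + c for q, c in zip(query, context)])
    return fused
```

Calling `non_local_fuse` on a list of per-scale feature vectors returns a list of the same shape, with each vector enriched by context from all the others.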