Attention on Films
Overview
How do visuospatial features guide collective attention in films?
📅 Jan 2020 – Dec 2021
🏛️ University of Skövde, Sweden; Örebro University, Sweden; German Aerospace Center (DLR)
👥 Collaborators: Mehul Bhatt, Jakob Suchan, Paul Hemeren
This research investigated how visuospatial attributes in narrative films direct viewers’ attention, using eye-tracking (32 participants/scene) and a novel visuospatial event model for semantic annotation. We analyzed 10 film scenes from the Moving Images dataset to correlate attentional measures with multimodal scene features.
Aim
To establish a method for semantically grounding events and examine how visuospatial cues (gaze, motion, spatial relations) influence:
- Attentional measures including synchrony (shared gaze patterns)
- Predictive (anticipatory) viewing behaviors
Methodology
- Dataset: 10 scenes from acclaimed films (Solaris, Goodfellas, The Grand Budapest Hotel, etc.)
- Participants: 32 viewers per scene, tracked at 60 Hz with a Tobii X2-60 eye tracker
- Annotation Framework: Expert annotations in the ELAN tool with a controlled vocabulary covering:
  - Scene Elements: Objects, regions, body parts
  - Scene Structure: Visibility, motion, spatial relations, gaze, actions
  - Visual Attention: Fixations/saccades (low-level) + object-level attention
- Analysis Metrics (see the sketch after this list):
  - Attentional synchrony: % of viewers fixating the same region/body part
  - Feature distribution in high- vs. low-synchrony segments
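
To make the synchrony metric concrete, the sketch below shows one way it could be computed per timestamp. This is a minimal illustration under assumed data shapes, not the project's analysis code: fixations are taken to be `(start_ms, end_ms, target)` tuples whose `target` is an annotated region or body-part label.

```python
from collections import Counter

def attentional_synchrony(fixations_by_viewer, t_ms):
    """Percentage of viewers fixating the modal (most common) annotated
    target at time t_ms. Viewers with no active fixation count toward
    the denominator but contribute no target."""
    targets = []
    for fixations in fixations_by_viewer:
        # Find the fixation active at t_ms for this viewer, if any.
        hit = next((tgt for start, end, tgt in fixations
                    if start <= t_ms < end), None)
        if hit is not None:
            targets.append(hit)
    if not targets:
        return 0.0
    _, modal_count = Counter(targets).most_common(1)[0]
    return 100.0 * modal_count / len(fixations_by_viewer)

# Hypothetical gaze data for three viewers on one scene.
viewers = [
    [(0, 400, "face(person_1)"), (400, 900, "door")],
    [(0, 500, "face(person_1)"), (500, 900, "face(person_2)")],
    [(100, 600, "face(person_1)")],
]
print(attentional_synchrony(viewers, 300))  # -> 100.0
```

A windowed average of this value, thresholded, would yield high- vs. low-synchrony segments of the kind analyzed in the results below.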
Visuospatial Event Model
A formal taxonomy for semantic interpretation of scenes. See example below:
| Category | Features |
|---|---|
| Scene Elements | Objects, regions, body parts |
| Scene Structure | `visible(X)`, `moving_towards(X,Y)`, `gazing_at(X,Y)` |
| Visual Attention | `attention_on(face(X))` |
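
To illustrate how such annotations might be held in code, here is a hypothetical encoding of the model's predicates as interval-stamped records. The `Fluent` type and `holds_at` query are illustrative names, not part of the published model, which is a formal taxonomy:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fluent:
    """An event-model predicate holding over a time interval (ms)."""
    name: str     # e.g. "visible", "moving_towards", "gazing_at", "attention_on"
    args: tuple   # scene elements: objects, regions, body parts
    start: int
    end: int

# A hypothetical fragment of one annotated scene.
scene = [
    Fluent("visible", ("person_1",), 0, 4200),
    Fluent("gazing_at", ("person_1", "door"), 1800, 2600),
    Fluent("moving_towards", ("person_2", "person_1"), 2000, 3900),
    Fluent("attention_on", ("face(person_1)",), 1900, 2500),
]

def holds_at(fluents, name, t_ms):
    """All instances of predicate `name` that hold at time t_ms."""
    return [f for f in fluents if f.name == name and f.start <= t_ms < f.end]

print(holds_at(scene, "gazing_at", 2000))
# -> [Fluent(name='gazing_at', args=('person_1', 'door'), start=1800, end=2600)]
```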
Results
- High synchrony correlates with:
  - Isolated characters in frame
  - Specific behavioral cues (e.g., sudden gaze shifts)
  - Visibility changes (e.g., occlusions)
- Low-synchrony segments contained 32% more event data (scene structure annotations), suggesting richer interpretative possibilities when attention diverges; the sketch after this list shows how such a density comparison could be computed
- Key attention drivers:
  - Head movements (predictive of upcoming actions)
  - Gaze transitions between agents
  - Hand actions with referential significance
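
A hypothetical way to compute this kind of annotation-density comparison: count the scene-structure annotations overlapping each segment and normalize by segment duration. The interval format and per-second normalization are assumptions for illustration, not the published analysis.

```python
def annotation_density(segments, annotations):
    """Annotations overlapping each (start_ms, end_ms) segment,
    normalized to annotations per second of segment duration."""
    densities = []
    for seg_start, seg_end in segments:
        n = sum(1 for a_start, a_end in annotations
                if a_start < seg_end and a_end > seg_start)
        densities.append(n / ((seg_end - seg_start) / 1000.0))
    return densities

# Hypothetical scene-structure annotation intervals (ms) and segments.
annotations = [(0, 1200), (800, 2600), (2400, 3000), (3100, 5200), (4000, 7500)]
low_sync_segments = [(0, 2000), (5000, 8000)]
high_sync_segments = [(2000, 5000)]

low = annotation_density(low_sync_segments, annotations)
high = annotation_density(high_sync_segments, annotations)
print(f"mean density, low vs. high synchrony: "
      f"{sum(low)/len(low):.2f} vs. {sum(high)/len(high):.2f} per second")
```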
Applications
- AI Systems: Grounding for human-activity recognition
- Directing Practices: Quantifying attention-guiding techniques
- Extended Reality: Predictive gaze models for VR narrative
- Cognitive Modeling: Benchmark for human-like event understanding
- Attention Prediction: Models of viewer attention driven by visuospatial features
Project Outcomes
📄 See Publication
Collaboration Opportunities
Open to collaboration or discussion on methodology, data, or future directions. Happy to exchange ideas and explore new perspectives.