Attention on Films
Overview
How do visuospatial features guide collective attention in films?
📅 Jan 2020 – Dec 2021
🏛️ University of Skövde, Sweden; Örebro University, Sweden; German Aerospace Center (DLR)
👥 Collaborators: Mehul Bhatt, Jakob Suchan, Paul Hemeren
This research investigated how visuospatial attributes in narrative films direct viewers’ attention, using eye-tracking (32 participants/scene) and a novel visuospatial event model for semantic annotation. We analyzed 10 film scenes from the Moving Images dataset to correlate attentional measures with multimodal scene features.
Aim
To establish a method for semantically grounding events and examine how visuospatial cues (gaze, motion, spatial relations) influence:
- Attentional measures including synchrony (shared gaze patterns)
- Predictive (anticipatory) viewing behaviors
Methodology
- Dataset: 10 scenes from acclaimed films (Solaris, Goodfellas, The Grand Budapest Hotel, etc.)
- Participants: 32 viewers per scene, tracked at 60 Hz with a Tobii X2-60 eye tracker
- Annotation Framework: Expert annotations in the ELAN tool with a controlled vocabulary covering:
  - Scene Elements: Objects, regions, body parts
  - Scene Structure: Visibility, motion, spatial relations, gaze, actions
  - Visual Attention: Fixations/saccades (low-level) + object-level attention
- Analysis Metrics (see the sketch after this list):
  - Attentional synchrony: % of viewers fixating the same region/body part
  - Feature distribution in high- vs. low-synchrony segments
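
To make the synchrony metric concrete, the sketch below shows one way it could be computed per timestamp. This is a minimal illustration under assumed data shapes, not the project's analysis code: fixations are taken to be `(start_ms, end_ms, target)` tuples whose `target` is an annotated region or body-part label.

```python
from collections import Counter

def attentional_synchrony(fixations_by_viewer, t_ms):
    """Percentage of viewers fixating the modal (most common) annotated
    target at time t_ms. Viewers with no active fixation count toward
    the denominator but contribute no target."""
    targets = []
    for fixations in fixations_by_viewer:
        # Find the fixation active at t_ms for this viewer, if any.
        hit = next((tgt for start, end, tgt in fixations
                    if start <= t_ms < end), None)
        if hit is not None:
            targets.append(hit)
    if not targets:
        return 0.0
    _, modal_count = Counter(targets).most_common(1)[0]
    return 100.0 * modal_count / len(fixations_by_viewer)

# Hypothetical gaze data for three viewers on one scene.
viewers = [
    [(0, 400, "face(person_1)"), (400, 900, "door")],
    [(0, 500, "face(person_1)"), (500, 900, "face(person_2)")],
    [(100, 600, "face(person_1)")],
]
print(attentional_synchrony(viewers, 300))  # -> 100.0
```

A windowed average of this value, thresholded, would yield high- vs. low-synchrony segments of the kind analyzed in the results below.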
Visuospatial Event Model
A formal taxonomy for semantic interpretation of scenes. See example below:
| Category | Features |
|---|---|
| Scene Elements | Objects, regions, body parts |
| Scene Structure | `visible(X)`, `moving_towards(X,Y)`, `gazing_at(X,Y)` |
| Visual Attention | `attention_on(face(X))` |
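
To illustrate how such annotations might be held in code, here is a hypothetical encoding of the model's predicates as interval-stamped records. The `Fluent` type and `holds_at` query are illustrative names, not part of the published model, which is a formal taxonomy:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fluent:
    """An event-model predicate holding over a time interval (ms)."""
    name: str     # e.g. "visible", "moving_towards", "gazing_at", "attention_on"
    args: tuple   # scene elements: objects, regions, body parts
    start: int
    end: int

# A hypothetical fragment of one annotated scene.
scene = [
    Fluent("visible", ("person_1",), 0, 4200),
    Fluent("gazing_at", ("person_1", "door"), 1800, 2600),
    Fluent("moving_towards", ("person_2", "person_1"), 2000, 3900),
    Fluent("attention_on", ("face(person_1)",), 1900, 2500),
]

def holds_at(fluents, name, t_ms):
    """All instances of predicate `name` that hold at time t_ms."""
    return [f for f in fluents if f.name == name and f.start <= t_ms < f.end]

print(holds_at(scene, "gazing_at", 2000))
# -> [Fluent(name='gazing_at', args=('person_1', 'door'), start=1800, end=2600)]
```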
Results
- High synchrony correlates with:
  - Isolated characters in frame
  - Specific behavioral cues (e.g., sudden gaze shifts)
  - Visibility changes (e.g., occlusions)
- Low-synchrony segments contained 32% more event data (scene structure annotations), suggesting richer interpretative possibilities when attention diverges; the sketch after this list shows how such a density comparison could be computed
- Key attention drivers:
  - Head movements (predictive of upcoming actions)
  - Gaze transitions between agents
  - Hand actions with referential significance
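
A hypothetical way to compute this kind of annotation-density comparison: count the scene-structure annotations overlapping each segment and normalize by segment duration. The interval format and per-second normalization are assumptions for illustration, not the published analysis.

```python
def annotation_density(segments, annotations):
    """Annotations overlapping each (start_ms, end_ms) segment,
    normalized to annotations per second of segment duration."""
    densities = []
    for seg_start, seg_end in segments:
        n = sum(1 for a_start, a_end in annotations
                if a_start < seg_end and a_end > seg_start)
        densities.append(n / ((seg_end - seg_start) / 1000.0))
    return densities

# Hypothetical scene-structure annotation intervals (ms) and segments.
annotations = [(0, 1200), (800, 2600), (2400, 3000), (3100, 5200), (4000, 7500)]
low_sync_segments = [(0, 2000), (5000, 8000)]
high_sync_segments = [(2000, 5000)]

low = annotation_density(low_sync_segments, annotations)
high = annotation_density(high_sync_segments, annotations)
print(f"mean density, low vs. high synchrony: "
      f"{sum(low)/len(low):.2f} vs. {sum(high)/len(high):.2f} per second")
```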
Applications
- AI Systems: Grounding for human-activity recognition
- Directing Practices: Quantifying attention-guiding techniques
- Extended Reality: Predictive gaze models for VR narrative
- Cognitive Modeling: Benchmark for human-like event understanding
- Attention Prediction: Models of viewer attention driven by visuospatial features
Project Outcomes
📄 See Publication
Collaboration Opportunities
Open to collaboration or discussion on methodology, data, or future directions. Happy to exchange ideas and explore new perspectives.