Visual Attention Mapping

Overview

How Naturalistic Visuo-Auditory Cues Guide Human Attention
📅 Sep 2022 – Jan 2024
🏛️ University of Skövde, Sweden; Örebro University, Sweden; Constructor University, Germany
👥 Collaborators: Paul Hemeren, Erik Billing, Mehul Bhatt, Jakob Suchan

This project systematically investigated how visuospatial and auditory cues guide visual attention during passive observation of naturalistic human interactions. Building on cognitive vision principles, we developed a novel event model to analyze cue-attention dynamics in film-like narrative scenarios.


Aim

To characterize how five core visuo-auditory cues modulate attention:

  • Speaking (vocal turn-taking patterns)
  • Gaze direction (reference/mutual/transition)
  • Relative motion (directional vs. non-directional)
  • Hand actions (pointing/gesturing/reaching)
  • Visibility dynamics (scene entries and exits)

The project focused on quantifying both unimodal (individual) and multimodal (combined) cueing effects on attentional flow.
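
As a concrete illustration, these cue categories can be written down as a small coding schema. The sketch below is purely hypothetical; the names and subtypes mirror the list above, not the project's actual annotation scheme:

```python
from enum import Enum

class CueType(Enum):
    """Hypothetical coding schema for the five core visuo-auditory cues."""
    SPEAKING = "speaking"          # vocal turn-taking patterns
    GAZE = "gaze"                  # reference / mutual / transition
    RELATIVE_MOTION = "motion"     # directional vs. non-directional
    HAND_ACTION = "hand_action"    # pointing / gesturing / reaching
    VISIBILITY = "visibility"      # scene entries and exits

class GazeSubtype(Enum):
    REFERENCE = "reference"        # gaze toward a referenced agent or object
    MUTUAL = "mutual"              # two agents looking at each other
    TRANSITION = "transition"      # gaze shifting between targets
```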


Methods

Structured Event Modeling Approach

  • Developed a formal visuo-auditory event model with three components (see the sketch after this list):
    • Scene elements (objects/regions)
    • Scene structure (interaction modalities)
    • Visual attention (eye-tracking metrics)
  • Coded 27 narrative-driven interaction scenarios using the ELAN annotation framework
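
A minimal sketch of how the three-component event model might be represented in code; the class and field names below are illustrative assumptions, not the published formalization:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SceneElement:
    """An object or region participating in the scene (e.g., an agent, a table)."""
    element_id: str
    category: str                  # e.g., "agent", "object", "region"

@dataclass
class CueEvent:
    """One coded interaction-modality event (scene structure component)."""
    cue: str                       # "speaking", "gaze", "motion", "hand_action", "visibility"
    subtype: str                   # e.g., "reference" for gaze, "pointing" for hand action
    actor: str                     # element_id of the cueing agent
    target: str                    # element_id of the referenced agent/object, if any
    start_ms: int
    end_ms: int

@dataclass
class AttentionSample:
    """Eye-tracking-derived attention record (visual attention component)."""
    participant: str
    fixation_start_ms: int
    fixation_duration_ms: int
    fixated_element: str           # element_id the fixation lands on

@dataclass
class ScenarioEventModel:
    """One narrative scenario, tying the three components together."""
    scenario_id: str
    elements: List[SceneElement] = field(default_factory=list)
    structure: List[CueEvent] = field(default_factory=list)
    attention: List[AttentionSample] = field(default_factory=list)
```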

Experimental Design

  • Collected eye-tracking data from 90 participants (3 groups × 30)
  • Between-subjects design: Each group viewed one version of each of the 9 scene contexts
  • Eye movement metrics: Fixation duration, saccade magnitude, and object-level attention (aggregated as sketched below)
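
A rough sketch of how such fixation-level metrics could be aggregated per participant and scene. The input layout (fixation dicts with pixel coordinates and AOI labels) is an assumption for illustration, not the project's actual data format:

```python
import numpy as np

def eye_movement_metrics(fixations):
    """Summarize fixation duration, saccade magnitude, and object-level attention.

    `fixations` is assumed to be a chronologically ordered, non-empty list of dicts:
    {"x": px, "y": px, "duration_ms": int, "aoi": "agent_A" | "agent_B" | None}
    """
    durations = np.array([f["duration_ms"] for f in fixations], dtype=float)

    # Saccade magnitude approximated as the Euclidean distance (in pixels)
    # between consecutive fixation centroids.
    xy = np.array([[f["x"], f["y"]] for f in fixations], dtype=float)
    saccade_amplitudes = np.linalg.norm(np.diff(xy, axis=0), axis=1)

    # Object-level attention: share of total fixation time spent on each AOI.
    aoi_time = {}
    for f in fixations:
        if f["aoi"] is not None:
            aoi_time[f["aoi"]] = aoi_time.get(f["aoi"], 0.0) + f["duration_ms"]
    total = durations.sum()
    aoi_share = {aoi: t / total for aoi, t in aoi_time.items()}

    return {
        "mean_fixation_ms": durations.mean(),
        "mean_saccade_px": saccade_amplitudes.mean() if len(saccade_amplitudes) else 0.0,
        "aoi_share": aoi_share,
    }
```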

Stimuli & Data

Custom Naturalistic Stimuli Set

  • 27 film-like scenarios depicting everyday interactions
    • 9 core scene contexts × 3 variations (e.g., differing gaze/hand action patterns)
    • Featured intentional actions aligned with narrative goals
  • Public dataset: Annotations, eye-tracking data, and stimuli are openly available (see the loading sketch below)
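
Because the coding was done in ELAN, the annotations are stored as .eaf XML files. The sketch below shows one way to read a time-aligned tier with the Python standard library; the file path and tier name are placeholders, not the dataset's actual identifiers:

```python
import xml.etree.ElementTree as ET

def read_eaf_tier(eaf_path, tier_id):
    """Return (start_ms, end_ms, value) tuples for one ELAN tier.

    Relies only on the standard EAF layout: a TIME_ORDER of TIME_SLOTs and
    TIERs containing ALIGNABLE_ANNOTATIONs that reference those slots.
    """
    root = ET.parse(eaf_path).getroot()

    # Map each time-slot ID to its value in milliseconds.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.find("TIME_ORDER")
        if ts.get("TIME_VALUE") is not None
    }

    annotations = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            value = ann.findtext("ANNOTATION_VALUE", default="")
            annotations.append((start, end, value))
    return annotations

# Hypothetical usage; tier names depend on the dataset's coding scheme.
# gaze_events = read_eaf_tier("scenario_01.eaf", "gaze")
```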

Key Findings

  • Speaking: Turn-taking triggered anticipatory gaze shifts to the next speaker
  • Gaze: Observers reliably followed gaze direction, shifting attention to the referenced agent
  • Hand Actions: Pointing/gesturing drew more attention than grasping/holding
  • Visibility: Entry events captured immediate attention; exits triggered anticipatory shifts to the remaining agents
  • Motion + Hand Actions: Directional motion combined with hand actions produced synergistic cueing effects
  • Multimodal combinations: Speaking dominated attention when paired with other cues, while gaze redirected attention toward the referenced agents
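
One way to quantify the anticipatory turn-taking pattern reported above is to compare how much gaze lands on the upcoming speaker just before versus just after a turn onset. This is an illustrative analysis sketch with an arbitrary window size, not the project's published pipeline:

```python
def anticipation_index(fixations, turn_onset_ms, next_speaker, window_ms=1000):
    """Share of fixation time on the next speaker before vs. after turn onset.

    `fixations`: list of dicts {"start_ms", "end_ms", "aoi"} for one participant.
    Returns (pre_share, post_share); a high pre_share indicates anticipatory
    gaze toward the upcoming speaker before they start talking.
    """
    def share(window_start, window_end):
        on_target = total = 0.0
        for f in fixations:
            # Overlap between this fixation and the analysis window.
            overlap = min(f["end_ms"], window_end) - max(f["start_ms"], window_start)
            if overlap <= 0:
                continue
            total += overlap
            if f["aoi"] == next_speaker:
                on_target += overlap
        return on_target / total if total else 0.0

    pre = share(turn_onset_ms - window_ms, turn_onset_ms)
    post = share(turn_onset_ms, turn_onset_ms + window_ms)
    return pre, post
```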

Future Directions

  • Investigate context-locked cues in narrative variations
  • Model anticipatory gaze in event prediction
  • Extend framework to active participation scenarios
  • Develop probabilistic cue-interaction models
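
As one possible starting point for the probabilistic cue-interaction direction, attention shifts could be modeled as a function of which cues are active, for example with a logistic regression over cue indicators and their pairwise interactions. The sketch below uses scikit-learn and entirely placeholder data and feature names:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

# Each row: binary indicators [speaking, gaze, motion, hand_action, visibility]
# for the cues active at an event; y = 1 if the observer shifted attention to
# the cued agent within some latency window. Both arrays are placeholder data.
X = np.array([
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 0, 0],
])
y = np.array([1, 1, 1, 1, 0, 1])

# Pairwise interaction terms let the model express synergistic cue effects
# (e.g., motion x hand action) rather than purely additive ones.
X_int = PolynomialFeatures(
    degree=2, interaction_only=True, include_bias=False
).fit_transform(X)

model = LogisticRegression(max_iter=1000).fit(X_int, y)
print(model.predict_proba(X_int)[:, 1])  # estimated shift probabilities per event
```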

Project Outcomes


Collaboration Opportunities

Open to collaboration or discussion on methodology, dataset, or future directions. Happy to exchange ideas and explore new perspectives.