Scientific objectives
  • Autonomous audio/video data stream modeling
  • Human behavior analysis from sensory data
    • Real-time monitoring of individuals, groups and crowds/flows
    • Collective behavior modeling with on-line adaptation
Technological objectives
  • Deployment of an innovative platform for audio/video infrastructure management
  • Technological/scientific assessment on two complementary metro infrastructures



Long-term statistics building for planning applications

Automatic sensor selection for videowall management

Human-centered monitoring using audio/video analysis

Current situation
  • Transportation terminals subject to capacity issues
    • Need expressed by managers for analysis of passenger dynamics
    • Bottleneck is high variety/complexity of passenger behaviours
  • Most CCTV video streams are never watched (e.g. in Turin, 28 monitors for 800 cameras)
    • Monitors show empty scenes/spaces, while other cameras look at scenes in which something (even normal) is happening
    • Probability of watching the right streams at the right time is very low
  • Human behaviour modelling not ready for real-scale environment
    • Scene understanding based on location features not sufficiently reliable
    • Need for robust human-centred features
VANAHEIM proposal
  • System able to identify and characterize structures inherent in collective behavior
    • Continuous monitoring of user information
      • locations, routes,
      • spatio-temporal activities (walking, waiting...),
      • interactions with other passengers and/or equipment,
      • contextual data (time of day, density of people...)

  • Goal: estimate trends of large-scale human behaviour at an infrastructure level, e.g. to
    • Localize common loitering areas and/or highly frequented aisles,
    • Identify traffic patterns in infrastructure,
    • ...
  • Mechanisms for selecting relevant audio/video streams in control rooms
  • Models to characterise video stream content
    • Trivial scenario when dealing with “empty vs occupied” scenes
    • Challenging problem when almost all scenes are occupied
  • Need for unsupervised modelling is even stronger for audio streams (“mosaicing” of data is impossible due to the transparent nature of sound)

  • Goal: Development of an autonomous content-based audio/video sensor selection system for control rooms
  • Investigate three levels of human behaviour characterization in surveillance data

    • Individual level
      characterize an individual person by his/her activities

    • Group level
      detect small groups of people and identify interactions within them

    • Crowd level
      monitor (dynamics of) crowd/flow of people
  • Goal: Two applications
    • Event detection applications for safety/security
    • Environmental reporting for situational awareness
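
Illustration of the sensor selection goal stated earlier (content-based choice of which streams to show in a control room): a minimal sketch, assuming that some upstream model already assigns each stream a relevance score. The function names, the score dictionary and the camera ids are hypothetical and are not part of the VANAHEIM system; only the 28 monitors / 800 cameras figures come from the slides.

```python
import heapq
from typing import Dict, List


def select_streams(scores: Dict[str, float], n_monitors: int) -> List[str]:
    """Return the ids of the n_monitors highest-scoring streams, best first.

    `scores` maps a stream id to a relevance score (higher = more worth
    watching). How the score is computed is outside this sketch.
    """
    if n_monitors <= 0:
        return []
    # Sort key: highest score first; ties broken by stream id for determinism.
    ranked = heapq.nsmallest(
        n_monitors, scores.items(), key=lambda kv: (-kv[1], kv[0])
    )
    return [stream_id for stream_id, _ in ranked]


def random_coverage(n_monitors: int, n_cameras: int) -> float:
    """Fraction of cameras visible at once when streams are shown
    without any content analysis (baseline for comparison)."""
    return min(n_monitors, n_cameras) / n_cameras


if __name__ == "__main__":
    # Hypothetical scores for three cameras, two monitors available.
    demo_scores = {"cam1": 0.1, "cam2": 0.9, "cam3": 0.5}
    print(select_streams(demo_scores, 2))
    # Turin example from the slides: 28 monitors for 800 cameras.
    print(random_coverage(28, 800))
```

With the Turin figures, the baseline shows that only 3.5% of the cameras can be on screen at any time, which is why the ranking step has to be driven by stream content rather than by a fixed rotation.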