Behaviour from Head Pose
The aim of the project is to automatically identify the direction in which people are facing, from a distant camera in a surveillance setting, to provide input to higher-level reasoning systems. The direction in which somebody is facing provides a good estimate of their gaze direction, which can be used to infer familiarity between people or interest in the surroundings. The work can be seen as closing the gap between a coarse description of humans viewed from a distance and the more detailed modelling of limb motion usually obtained from a closer view. The work is partly funded by the EU project HERMES and is located in work packages 3 and 4.
Active Scene Exploration
Effective use of resources is an underlying theme of this project. The resources in question are a set of cameras which overlook a common area from varying viewing angles. These cameras are heterogeneous and have different control parameters: some are static, while others are pan-tilt-zoom cameras. Information-theoretic measures are used to choose the best surveillance parameters for these cameras, where "best" can be defined by higher-level reasoning or by human operators. Currently, the work concentrates on objective functions from information theory and on sensor data fusion techniques for making informed decisions.
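As an illustration of such an information-theoretic objective, the sketch below selects, from a set of candidate cameras, the one whose observation is expected to be most informative about a target's (discretised) position, by maximising the mutual information between the observation and the state. The camera names, observation models, and discrete state space are hypothetical; this is a minimal example of the general idea, not the project's actual implementation.

```python
import numpy as np

def mutual_information(prior, likelihood):
    """I(X; Z) in nats, for a discrete state prior p(x) and an
    observation model likelihood[z, x] = p(z | x)."""
    joint = likelihood * prior[None, :]          # p(z, x) = p(z|x) p(x)
    pz = joint.sum(axis=1, keepdims=True)        # marginal p(z)
    px = prior[None, :]                          # marginal p(x)
    mask = joint > 0                             # avoid log(0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pz * px)[mask])))

def select_camera(prior, cameras):
    """Pick the camera whose observation maximises I(X; Z)."""
    return max(cameras, key=lambda name: mutual_information(prior, cameras[name]))

# Hypothetical setup: the target is in one of three ground-plane cells.
prior = np.array([0.5, 0.3, 0.2])

cameras = {
    # "ptz_A" reliably distinguishes cell 0 from the rest (binary observation).
    "ptz_A": np.array([[0.9, 0.1, 0.1],
                       [0.1, 0.9, 0.9]]),
    # "static_B" returns a coin flip regardless of the state: uninformative.
    "static_B": np.array([[0.5, 0.5, 0.5],
                          [0.5, 0.5, 0.5]]),
}

best = select_camera(prior, cameras)
print(best)  # the informative camera, "ptz_A"
```

An uninformative sensor has I(X; Z) = 0, so it is never chosen over a sensor that reduces uncertainty about the target; higher-level reasoning can reshape the prior (e.g. to emphasise a region of interest) and thereby change which camera is selected.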
As part of the HERMES project, the goal is to establish a perception/action cycle with specific consideration of varying zoom levels. The distributed camera system can then be treated as an abstract sensor that accepts higher-level objectives as input.
At the coarsest scale of the agent representation, agents are tracked and their trajectories recorded, together with other coarse-scale features useful for action and intention recognition. The aim is then to generate behavioural and conceptual descriptions of each agent and of its relationships to other agents and to predefined objects in the scene.
Cognitive Computer Vision
Recent work in visual tracking and camera control has looked at the issues involved in activity recognition using parametric and non-parametric belief propagation in Bayesian Networks, and begun to touch on the issues of causality. The current research takes all of these areas forward. The ultimate goal will be to combine these techniques to produce a pan/tilt/zoom camera system, and/or network of cameras, that can allocate attention in an intelligent fashion via an understanding of the scene, inferred automatically from visual data.
The topic is directly related to the EU project HERMES, which lies in the exciting and socially relevant area of intelligent visual surveillance. The aim of the research is to develop camera systems that could be considered to exhibit emergent cognitive behaviour, by developing algorithms and ontologies for understanding visual scenes.
The video compares the monocular SLAM system running with and without object detection in a split-screen view. The system without object detection loses track due to insufficient features, and at this point the video is slowed down to highlight the failure. The system with object detection continues; by the end of the video it has successfully detected all five objects and accurately localized them in the world.