Abstract
It is clear that optic flow is useful for guiding an observer's movement and that binocular disparity contributes too (e.g., Roy, Komatsu and Wurtz, 1992). Both cues are important in recovering scene structure. What is less clear is how the information might be useful after a few seconds, once the observer has moved to a new vantage point and the egocentric frame in which the information was gathered no longer applies. One answer, pursued successfully in computer vision, is to interpret any new binocular disparity and optic flow information in relation to a 3D reconstruction of the scene (Simultaneous Localization and Mapping, SLAM). Then, as the estimate of the camera pose is updated, the 3D information computed from earlier frames remains relevant. No one suggests that animals carry out visual SLAM, at least not in the way that computer vision implements it, and yet we have no serious competitor models.
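The core idea can be made concrete with a minimal sketch (not any particular SLAM system): once a landmark has been triangulated and stored in world coordinates, an updated estimate of the camera pose is all that is needed to make that old 3D information usable from any new vantage point. All names and values below are illustrative assumptions.

import numpy as np

def rotation_z(theta):
    """Rotation about the vertical axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def world_to_camera(p_world, R_cam, t_cam):
    """Express a world-frame point in the camera frame, given the
    camera's pose (orientation R_cam, position t_cam) in the world."""
    return R_cam.T @ (p_world - t_cam)

# Landmark triangulated once, at the first vantage point, and stored
# in the world frame.
landmark = np.array([2.0, 1.0, 0.0])

# Pose estimate at time t0, and the updated pose after the observer moves.
pose0 = (rotation_z(0.0),         np.array([0.0, 0.0, 0.0]))
pose1 = (rotation_z(np.pi / 4.0), np.array([1.0, -0.5, 0.0]))

for name, (R, t) in [("t0", pose0), ("t1", pose1)]:
    print(name, world_to_camera(landmark, R, t))

The same stored landmark yields valid egocentric coordinates at both times; only the pose estimate needs updating, never the landmark itself. That is what makes earlier frames' 3D information permanently relevant in a SLAM-style representation.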
Reinforcement learning is only just beginning to approach 3D tasks such as navigation, and it builds representations that are quite unlike a 3D reconstruction. I will describe psychophysical tasks from our VR lab in which participants point to unseen targets after navigating to different locations. Performance on these tasks shows large systematic biases that rule out (in line with other evidence) the notion that humans build a stable 3D reconstruction of the scene, independent of the task at hand. I will discuss some indications of what the visual system might do instead.
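One way such a pointing bias could be quantified is sketched below; the error measure and variable names are illustrative assumptions, not the lab's actual analysis. The signed angular difference between the pointed direction and the true direction to the unseen target is computed per trial; a consistent nonzero mean across trials would be the kind of systematic bias described above.

import numpy as np

def signed_pointing_error(observer, target, pointed_direction):
    """Signed angle (degrees, about the vertical axis) between the
    direction the participant pointed and the true direction from the
    observer's position to the unseen target."""
    true_dir = np.asarray(target, float) - np.asarray(observer, float)
    ang_true = np.arctan2(true_dir[1], true_dir[0])
    ang_pointed = np.arctan2(pointed_direction[1], pointed_direction[0])
    err = np.degrees(ang_pointed - ang_true)
    return (err + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)

# Illustrative trial: after navigating to (3, 1), the participant points
# along (0, 1) at a target remembered to be at (3, 5): zero error here.
print(signed_pointing_error([3.0, 1.0], [3.0, 5.0], [0.0, 1.0]))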
https://www.reading.ac.uk/psychology/about/staff/a-glennerster.aspx