Roland Fleming

Learning to See Stuff: Modelling Human Perception with Unsupervised Deep Learning

Humans are very good at visually recognizing materials and inferring their properties. Without touching surfaces, we can usually tell what they would feel like, and we enjoy vivid visual intuitions about how they typically behave. This is impressive because the retinal image that the visual system receives as input is the result of complex interactions between many physical processes. Somehow the brain has to disentangle these different factors. I will present some work in which we show that an unsupervised neural network trained on images of surfaces spontaneously learns to disentangle reflectance, lighting and shape.  We find that the network not only predicts the broad successes of human gloss perception, but also the specific pattern of errors that humans exhibit on an image-by-image basis.  I will argue this has important implications for thinking about vision more broadly.


Roland Fleming studied at Oxford and MIT, and did a postdoc at the Max Planck Institute for Biological Cybernetics. Since 2010, he has been the Kurt Koffka Professor of Experimental Psychology at Giessen University, where he has also served as the Executive Director of the Center for Mind, Brain and Behaviour of the Universities of Marburg and Giessen. His research combines psychophysics, neural modelling, computer graphics and image analysis to understand how the brain estimates the physical properties of objects and materials. He coordinated the EU-funded Marie Curie Training Network “PRISM: Perceptual Representation of Illumination, Shape and Materials”. In 2013 he was awarded the Young Investigator Award by the Vision Sciences Society, and in 2016 an ERC Consolidator Grant for the project “SHAPE: On the perception of growth, form and process”. In 2022 he was elected a Fellow of the Royal Society of Biology.