Updated: fMRI-Based Visual Stimulus Reconstruction

11 12 2008

A simple view of what the brain does: acquire input, process it, produce output. One strategy for understanding what processing takes place is to record patterns of brain activity while presenting many patterns of input, then see whether the information gained can be used to predict a novel input given only the pattern of brain activity. The canonical example of this approach is visual input reconstruction from recorded spike trains in the visual system of the blowfly.
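As a minimal sketch of that recipe, agnostic to the recording modality (spike rates or voxels): fit a linear map from responses to stimuli on training trials, then invert novel activity into a predicted stimulus. Everything below (the encoding matrix, noise level, dimensions) is a synthetic stand-in, not data from any actual experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_units, stim_dim = 400, 20, 120, 100

# Hypothetical ground truth: each recorded unit (spike rate, voxel, ...)
# responds as a noisy linear function of the stimulus.
encoding = rng.normal(size=(n_units, stim_dim))

def record(stim):
    return stim @ encoding.T + rng.normal(scale=0.5, size=(len(stim), n_units))

stim_train = rng.integers(0, 2, size=(n_train, stim_dim)).astype(float)
resp_train = record(stim_train)

# Fit the decoder by ridge regression: stimulus ~= response @ W.
lam = 1.0
W = np.linalg.solve(resp_train.T @ resp_train + lam * np.eye(n_units),
                    resp_train.T @ stim_train)

# Predict novel stimuli from activity alone and score the reconstruction.
stim_test = rng.integers(0, 2, size=(n_test, stim_dim)).astype(float)
pred = record(stim_test) @ W
score = np.mean([np.corrcoef(s, p)[0, 1] for s, p in zip(stim_test, pred)])
print(f"mean stimulus/prediction correlation: {score:.2f}")
```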

The blowfly is a relatively simple system (though quite efficient) with a tiny brain. Could a similar approach work in humans? Although we can't drop electrodes into the visual cortex (usually), we can put people in fMRI scanners to visualize the pattern of blood oxygenation, which is correlated with neural activity.

In Visual Image Reconstruction from Human Brain Activity using a Combination of Multiscale Local Image Decoders, Miyawaki et al. demonstrate visual input prediction using fMRI responses. Using 3 mm³ voxels, the group measured activity across early visual cortex (V1–V4) while presenting numerous 10×10 binary patterns of visual stimuli. From hundreds of training patterns, they learned how voxel activity maps onto local image elements at 1×1, 1×2, 2×1, and 2×2 scales. Then they displayed novel visual input and used a linear combination of the local image element predictions to reconstruct the visual input from the brain activity alone. Notably, only several hundred training images were needed before visual input prediction became possible.
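A simplified sketch of the multiscale local-decoder idea follows. This is not the paper's exact estimator (Miyawaki et al. used sparse logistic-regression decoders and learned the combination weights); here each local element gets a plain ridge decoder fit to synthetic data, and overlapping element predictions are simply averaged.

```python
import numpy as np

rng = np.random.default_rng(1)
GRID = 10                                  # 10x10 binary stimulus
SCALES = [(1, 1), (1, 2), (2, 1), (2, 2)]  # local element shapes

def elements(scale):
    """All index blocks of a given shape, tiled over the stimulus grid."""
    h, w = scale
    for r in range(GRID - h + 1):
        for c in range(GRID - w + 1):
            yield np.s_[r:r + h, c:c + w]

# Synthetic training data standing in for fMRI runs.
n_train, n_vox = 440, 300
stims = rng.integers(0, 2, size=(n_train, GRID, GRID)).astype(float)
enc = rng.normal(size=(n_vox, GRID * GRID))
resp = stims.reshape(n_train, -1) @ enc.T + rng.normal(scale=0.5, size=(n_train, n_vox))

def ridge(X, y, lam=10.0):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# One decoder per local element: voxel activity -> element's mean contrast.
decoders = [(sl, ridge(resp, stims[:, sl[0], sl[1]].mean(axis=(1, 2))))
            for scale in SCALES for sl in elements(scale)]

def reconstruct(voxels):
    """Combine all local element predictions into one 10x10 image."""
    img = np.zeros((GRID, GRID))
    count = np.zeros((GRID, GRID))
    for sl, w in decoders:
        img[sl] += voxels @ w
        count[sl] += 1
    return img / count                     # plain averaging, not learned weights
```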

Predicted visual input from fMRI activity in V1 and V2

Note that a retinotopic map, where the relative spatial position of visual input is reflected in the position of activity across the visual cortex, is not strictly required for this technique to work. What is required is that each local element responds consistently to similar patterns of input in its receptive field. Furthermore, the spatial scale of pattern representation in the early processing regions of human visual cortex is broad enough to be picked up by the fMRI scanner.
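A toy check of the first point, under the assumption of linear voxel responses: shuffling which voxel carries which tuning leaves a learned linear decoder's accuracy statistically unchanged, because the fit never consults voxel position, only the consistency of each voxel's response.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_vox, dim = 500, 200, 100              # trials, voxels, 10x10 pixels
stim = rng.integers(0, 2, size=(n, dim)).astype(float)
tuning = rng.normal(size=(n_vox, dim))     # fixed, consistent voxel tuning

def decode_score(voxel_order):
    # Simulate responses with voxels in the given order, fit a ridge
    # decoder on the first 400 trials, score on the held-out 100.
    R = stim @ tuning[voxel_order].T + rng.normal(scale=0.5, size=(n, n_vox))
    W = np.linalg.solve(R[:400].T @ R[:400] + np.eye(n_vox),
                        R[:400].T @ stim[:400])
    pred = R[400:] @ W
    return np.mean([np.corrcoef(s, p)[0, 1] for s, p in zip(stim[400:], pred)])

print("ordered:  ", round(decode_score(np.arange(n_vox)), 3))
print("scrambled:", round(decode_score(rng.permutation(n_vox)), 3))
```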

It would be interesting to see how far the resolution of predicted visual input could be pushed with an fMRI approach. Could the technique be adapted to predict input from the responses of cells with more complex receptive fields in higher cortical areas? Or are those cells too intermingled with neighbors that have vastly different response properties to be separable by fMRI? Higher areas are vital for our own brains to rapidly perceive the contours of complex images. I'd also like to see how well non-contiguous images are predicted.

Cellular-resolution calcium imaging with bulk-loaded dyes has been used to map the fine-grained structure of receptive fields in the visual and somatosensory cortices of lower animals. Is input prediction possible from these recordings? Is the feasible input training set too limited? Could more complex input be predicted using fewer complex cells from higher visual areas (V2 and above)?