0:00

We're going to finish up today's lecture with a discussion of reconstruction.

I think we all share the dream that one day we might be able to record brain activity during our sleep and, in the morning, play back our dreams. While the dream state is still not well understood, how close are we to being able to reconstruct even awake sensory experience from neural activity?

We can apply the methods that we've discussed in the past two lectures to think about how to do that. In the previous parts of this lecture, we talked about methods to find an estimate of the stimulus using Bayesian decoding. Now we'd like to extend our decoding procedures to the case where the responses and the stimuli are varying continuously in time. Let's go through a simple example of a decoding strategy that meshes with the problem set that you have been working on. Some of you, anyway.

Let's say that we wanted to find an estimator, s_Bayes, that gives us the best possible estimate of our stimulus, s, given that we've observed some response, r. So how should we compute s_Bayes? One strategy that makes sense is to ask for an estimator that is, on average, as close as possible to our stimulus. I'm going to introduce some error function, which we'll call L, and then minimize this error averaged over all possible stimulus choices that are consistent with our response r. Now we need to choose a form for this error function, and a very natural choice is the mean squared error: we'll take L to be just the squared difference between our estimator and the true stimulus. To derive an expression for s_Bayes that solves this problem, we need to minimize the average error. Remember how we minimize a function: we take the derivative of that function with respect to the parameter that we're interested in, here s_Bayes, and set it equal to zero. So let's just do that calculation.

So now we want to take d by d s_Bayes of this integral ds. Let's substitute in our expression for the error: the difference squared, times the probability of s given r. Now we take the derivative of that with respect to s_Bayes. That's going to be equal to the integral ds; the only term that depends on s_Bayes is this one, so the derivative of the square is just (s minus s_Bayes) times two, times the probability of s given r. And now we set that equal to zero. Hopefully you can see that the solution is of this form. So how did we get that? Let's just write it out. We have the integral ds, and we can separate the two terms and put them on the two sides: the integral ds of s times the probability of s given r is going to be equal to the integral ds of s_Bayes times the probability of s given r. And now, if we integrate the probability of s given r over s, s_Bayes here is just a constant and will come out.

Â 3:13

Now, since the probability distribution is normalized, the integral over it is just equal to one, and the right-hand side reduces to s_Bayes. And so here's our solution: s_Bayes is equal to this expression, which we already have here.
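Written out symbolically, the calculation we just did is:

```latex
s_{\mathrm{Bayes}} &= \arg\min_{s_B} \int ds\, (s - s_B)^2\, p(s \mid r) \\
0 &= \frac{d}{ds_B}\int ds\, (s - s_B)^2\, p(s \mid r)
   = -2\int ds\, (s - s_B)\, p(s \mid r) \\
\int ds\, s\, p(s \mid r) &= s_B \int ds\, p(s \mid r) = s_B
\quad\Longrightarrow\quad
s_{\mathrm{Bayes}} = \int ds\, s\, p(s \mid r)
```

That is, the least-squares estimator is the mean of the posterior distribution p(s | r).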

Now, I want you to take a look at that for a moment and see if you recognize it. What if our response is just a single spike? What does this expression amount to? Well, it's the spike-triggered average, right? It's the average stimulus triggered by the response, by the spikes. So we're going to take all the stimuli, weight them by the probability that they occurred in response to a spike, and average over them all.

Â 3:57

So how do we apply this to reconstructing a simple stimulus? Imagine that this is our spike-triggered average. Now, every time there's a spike in our measured spike train, we're going to paste in the spike-triggered average, our conditional average. At low firing rates, this is not looking very good. But at higher firing rates, you see that we're getting closer and closer to a smoothly varying function. You might have realized already that there are some issues with a filter, or feature, of this exponential form as we drew before: it can never capture a negative fluctuation in the input. This is actually an issue with the fly neuron data that you've looked at in the problem set. The fly has two H1 neurons, one that encodes leftward motion and another that encodes rightward motion. So if you tried to reconstruct a velocity stimulus with only one of your H1 neurons, you'd only ever be able to recover either leftward or rightward motion.
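As a sketch of this pasting procedure and of the sign problem, here is a toy simulation (all waveforms, firing rates, and spike trains are made up, not the actual fly H1 data): convolving a spike train with the STA recovers only the preferred-direction part of the stimulus, while subtracting the reconstruction from an opposing neuron recovers both signs.

```python
import numpy as np

def paste_sta(spikes, sta):
    """Paste a copy of the spike-triggered average at every spike time,
    i.e. convolve the spike train with the STA kernel."""
    return np.convolve(spikes, sta, mode="same")

# A made-up bump-shaped STA (the "feature" the neuron reports).
t = np.linspace(-1, 1, 21)
sta = np.exp(-t**2 / 0.1)

rng = np.random.default_rng(0)
stimulus = np.sin(np.linspace(0, 6 * np.pi, 300))   # velocity, both signs

# Each H1-like neuron fires only for its preferred direction (toy rates).
rate_right = np.clip(stimulus, 0, None)
rate_left = np.clip(-stimulus, 0, None)
spikes_right = (rng.random(300) < 0.5 * rate_right).astype(float)
spikes_left = (rng.random(300) < 0.5 * rate_left).astype(float)

# One neuron alone can never produce a negative reconstruction;
# combining the opposing pair recovers both signs of the motion.
rec_one = paste_sta(spikes_right, sta)
rec_both = rec_one - paste_sta(spikes_left, sta)
```

The single-neuron reconstruction is non-negative by construction, which is exactly the limitation described above.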

Â 4:57

In the book Spikes, which is a very nice exposition of this kind of reconstruction at considerably more depth than I can give here, the authors actually simulate the other H1 neuron by playing the original stimulus with the opposite sign. That now gives us enough information to reconstruct both positive and negative inputs.

So now let's see this kind of decoding in action. The movie you're about to see is based on the activity of multiple neurons in the lateral geniculate nucleus (LGN) of the cat. This is work by Yang Dan of Berkeley, done about 15 years ago when she was a PhD student. Convolving the spike trains from multiple neurons in LGN with the spatiotemporal receptive fields of those neurons allows a noisy but comprehensible reconstruction of the scene. So in this case, it's a cat being recorded while anesthetized, but the LGN neurons are giving a pretty good reconstruction of what the cat is looking at. Hopefully you can see Yang's advisor looming into view. That's Joe Atick, whose work applying information theory to understand receptive field structure is coming up next week.

Â 6:13

Now let's fast-forward a few years to 2011. I'm going to finish up this week's lecture with a rather impressive example of decoding that starts to get us closer to that mind-reading fantasy. It also neatly brings together the ideas we covered last week and this week. In this set of experiments, Jack Gallant, also at Berkeley, and his colleagues used fMRI to record from the visual cortex of humans viewing movies, and used the recordings to reconstruct one-second-long movie sequences. So here you're seeing reconstructions of single scenes, but these are in fact stills from one-second-long movies. Nonetheless, I hope you get a sense of how impressive these reconstructions are. So how did they do this?

So here's the basic idea, going back to this model that we've used over and over again. The researchers here are trying to find a movie clip s that maximizes this a posteriori distribution. They use a library of 18 million clips and take the prior p(s) to be uniform across those samples. So what's missing is the likelihood. To compute the likelihood of a given clip from the database, they develop an encoding model, fitted on a different training set of movies, so that they can evaluate the predicted response for an arbitrary input. Then they can evaluate the likelihood by computing how well the predicted response to a movie from the library matches the true response.
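A minimal sketch of that search, with stand-in data: because the prior is uniform over the library, the MAP clip is simply the one whose predicted response best matches the measured response. Here `encode` is a made-up linear placeholder for the fitted encoding model, and the Gaussian form of the likelihood is an assumption, not the study's exact noise model.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(5, 20))            # stand-in encoding weights

def encode(clip):
    """Predicted response to a clip -- a placeholder linear model."""
    return W @ clip

def log_likelihood(measured, predicted, noise_var=1.0):
    """Gaussian log-likelihood of the measured response (up to a constant)."""
    return -0.5 * np.sum((measured - predicted) ** 2) / noise_var

library = rng.normal(size=(1000, 20))   # toy stand-in for the 18M-clip library
true_clip = library[123]
measured = encode(true_clip) + 0.1 * rng.normal(size=5)

# Uniform prior over the library => MAP estimate = maximum-likelihood clip.
scores = np.array([log_likelihood(measured, encode(c)) for c in library])
map_index = int(np.argmax(scores))
```

With a flat prior, scoring every library clip by likelihood and taking the argmax is exactly the maximum a posteriori search described above.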

Let's take a peek at the encoding model, as it uses several of the ideas that we developed last week. So here's the model that predicts responses. As you might recall, we mentioned last week that fMRI relies on blood oxygenation, or BOLD, signals, so it has a slower response time than neural activity. So in this model the neural response is separated from the BOLD signal, and the two are fitted separately. How is the neural response modeled? Let's zoom in. It's predicted, as in the models of last week, by first filtering the input. There are a couple of different filtering stages that extract certain features from the input. Let's focus on this part, in which the image is filtered through a pair of oriented filters at different phases, just as we described last week for complex cell responses in V1. Now the outputs of those two filters are squared and summed. This means that one gets a large response independent of spatial phase, as we also mentioned last week. Then the output of that filtering stage is passed through a compressive nonlinearity; in this case the function is taken to be a log function.

Â 8:51

And then this is temporally downsampled; that is, it's smoothed from a 15 Hz signal to a 1 Hz signal in order to reduce noise. That's taken to be the predicted neural response. This neural response is then passed through an additional filter that accounts for the slow response of the blood oxygenation level.
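These two temporal steps can be sketched as follows; the gamma-shaped kernel is a generic stand-in for a slow hemodynamic response, not the filter actually fitted in the study.

```python
import numpy as np

def downsample(signal, factor=15):
    """Average non-overlapping windows: e.g. a 15 Hz signal -> 1 Hz."""
    n = len(signal) // factor
    return signal[: n * factor].reshape(n, factor).mean(axis=1)

def hrf_kernel(length=20):
    """Crude gamma-like stand-in for the slow BOLD response (illustrative)."""
    t = np.arange(length, dtype=float)
    h = t**5 * np.exp(-t)
    return h / h.sum()           # normalized, so filtering preserves scale

neural_15hz = np.random.default_rng(2).random(150)   # 10 s of toy activity
neural_1hz = downsample(neural_15hz)                 # predicted neural response
bold = np.convolve(neural_1hz, hrf_kernel(), mode="full")[: len(neural_1hz)]
```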

So now here's the full procedure. An encoding model, like we just saw, is fitted for each voxel, each volume unit in the brain region being imaged. Then that model is used to predict the response to the millions of clips in the database. The stimuli with the highest likelihood, which in this case are equivalent to those with the highest posterior, are those whose predicted responses best account for the measured response. So here are the predicted responses, in this column. The MAP solution, the most likely solution or the highest a posteriori solution, would be to simply read off the maximum value. But because the clips are full of highly specific detail, one can in this case do a lot better by averaging those out. So what they're going to do is to rank these clips by the degree to which their predicted responses fit the true response, and take the top sequence of clips with the highest match. Here they're drawing the top 30 highest-posterior clips: the 30 clips whose predicted responses best match the true response. One could simply take the best value, but because of all the specific detail in these sample images from the prior, one does better by combining them. So now if you look at the cumulative average of many of these high-probability clips, then what you see is that as one gets to larger and larger numbers of them, you're getting quite a good match.
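The ranking-and-averaging step might look like this, with synthetic clips and scores standing in for the real library and posterior values.

```python
import numpy as np

rng = np.random.default_rng(3)
clips = rng.random((1000, 8, 8))     # toy library of 8x8 "frames"
scores = rng.normal(size=1000)       # stand-in posterior scores per clip

# Rank clips by posterior score and average the 30 best matches,
# smoothing away clip-specific detail.
top = np.argsort(scores)[::-1][:30]
reconstruction = clips[top].mean(axis=0)
```

Averaging the top clips trades the crisp detail of any single library sample for a blurrier image that captures the shared, stimulus-driven structure.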

Â examples. Perhaps this one.

Â So in this case, you can see the effect of that averaging.

Â So now you no longer see a crisp a crisp image that you would get from a single

Â choice. From your prior distribution.

Â Instead, you average over many of them. But now, what that does is to remove

Â specific features, and give you a general gestalt that's much more similar to to

Â the stimulus that was presented. So I hope this demonstrates that we are

Â within reach of that dream. That we will be able to look at neural

Â activity, and using clever models of the type that I showed you just now, we'll be

Â able to reconstruct naturlistic images from that neural activity So that brings

Â us to the end of my lecture for this week.

You'll also find online a special guest lecture by my colleague Fred Rieke, a world-acknowledged wizard of retinal processing. Next week we'll be moving on to a consideration of information: how is information defined? What exactly does it quantify, and how can it be useful in neuroscience? I hope you've enjoyed this week and that we'll see you back next week.
