In the last video we talked about classical conditioning. Today, we're going to talk about operant conditioning, and about two developments in the understanding of operant conditioning: the Law of Effect, described by Thorndike, and operant conditioning as further developed by B.F. Skinner.
Now, in classical conditioning, the main thing that was learned was a new association: a conditioned stimulus comes to produce a conditioned response, where previously it was neutral. With operant conditioning, we're talking about actually learning new behaviors, behaviors that weren't in the animal's repertoire prior to the conditioning. In Thorndike's case, he put cats in puzzle boxes: if a cat pulled the right levers and turned the right knobs, the door of the box would open, and outside was food the cat couldn't reach without opening the door. When the cat is put into the box, it obviously doesn't just sit there.
It explores, touches things, and tries to get to the food. Eventually, through trial and error, it puts its paw on the right switch, pulls the right string, the door opens, and the cat gets the food. Here we have the time the cat spends in the box across trials. As you can see, eventually the cat stays in the box for only a very short time, because it has learned what it takes to open the door and get to the food. Thorndike called this the Law of Effect.
If a behavior produces a pleasurable outcome, the effect of that pleasurable outcome strengthens the behavior, here the behavior of opening the door of the box by turning the switch and pulling the string. That's the Law of Effect.
This was further developed by Skinner using what's often called a Skinner box; he called it the operant chamber. An animal is placed in this box, which has several features. There's a lever the animal can press; if we're conditioning lever pressing, a press leads to food being released in the food dispenser: the animal presses the lever, it gets food. The box also has things that can produce other kinds of stimuli. There's a loudspeaker to produce sounds. There are lights of different colors to use for stimulus control, so the response is produced only when the stimulus is present. And there's an electrified grid on the floor of the box, so a shock can be administered if you want to use punishment.
So if I'm training lever pressing, the animal, just like the cat in the box, is not going to just sit there. It's going to wander around, touching things, sniffing things. Eventually it bumps its paw or its nose against the lever, and the lever releases food into the food dispenser. And of course, just by watching, we'll see the animal start pressing the lever to get the food. We might also want discrimination. So we'll provide food only when the animal presses the lever while the green light is on; when the red light is on, pressing the lever will not lead to food. Eventually we'll have stimulus control: when the green light comes on, the animal will press the lever and get the food. That's operant conditioning.
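To make that contingency concrete, here's a minimal Python sketch (my own illustration, not from the lecture) of stimulus control: presses are reinforced only under the green light, and a simple strengthening rule, in the spirit of the Law of Effect, drives the simulated animal to press mostly when the green light is on. The update rule and the numbers are assumptions.

```python
import random

# Learned tendency to press the lever under each light (starts indifferent).
press_prob = {"green": 0.5, "red": 0.5}
LEARNING_RATE = 0.1  # assumed strengthening/weakening step size

for trial in range(2000):
    light = random.choice(["green", "red"])
    pressed = random.random() < press_prob[light]
    if pressed:
        reinforced = (light == "green")  # food is delivered only under green
        # Law of Effect: a reinforced press strengthens the tendency to
        # press under this light; an unreinforced press weakens it.
        target = 1.0 if reinforced else 0.0
        press_prob[light] += LEARNING_RATE * (target - press_prob[light])

print(press_prob)  # "green" climbs toward 1.0, "red" falls toward 0.0
```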
In operant conditioning, there are two things we can do. We can reinforce the animal or the person: a reinforcer is a stimulus that strengthens behavior, like food for the rat or chocolate for me, something you find pleasurable. That's the Law of Effect. We can also weaken a particular behavior by using punishment: a punisher is an aversive stimulus that weakens behavior. Reinforcement strengthens behavior; punishment weakens behavior.
Reinforcement strengthens behavior, and we have positive reinforcement and negative reinforcement. Positive reinforcement is where I present a pleasant stimulus after the behavior occurs: the rat presses the lever, it gets food. Negative reinforcement is where an unpleasant stimulus is removed after the behavior occurs. We're still strengthening a behavior like lever pressing, but now pressing the lever removes some unpleasant stimulus. For example, if the animal is being shocked and a lever press turns the shock off, I'm strengthening the behavior of pressing the lever. So: positive reinforcement, negative reinforcement. I can also use punishment, which weakens behavior. Positive punishment is the presentation of an unpleasant stimulus after the operant behavior occurs: if I press the lever and then I'm shocked, that's going to weaken the behavior of pressing the lever. Negative punishment is the removal of a pleasant stimulus after the operant behavior occurs: if I'm receiving a pleasant stimulus and it's taken away after I engage in the behavior, then that behavior is going to be weakened.
I like to think of my grandchildren playing in my house: they're having fun, but they're also being very, very loud. I might say, "Okay, time out. You're being too loud. Sit over here in the corner for five minutes." I'm removing the pleasant stimulus of playing with their cousins, and by removing that pleasant stimulus I hope to reduce the noise they make while playing. It works sometimes, but that's negative punishment.
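The four contingencies form a two-by-two grid: the stimulus is either presented or removed, and it is either pleasant or unpleasant. Here's a small sketch that just restates the lecture's taxonomy in code; the function name is my own.

```python
def classify_contingency(stimulus_pleasant: bool, stimulus_presented: bool) -> str:
    """Name the operant contingency and its effect on the behavior."""
    if stimulus_presented:
        # Presenting a pleasant stimulus strengthens the behavior;
        # presenting an unpleasant one weakens it.
        return ("positive reinforcement (strengthens)" if stimulus_pleasant
                else "positive punishment (weakens)")
    # Removing an unpleasant stimulus strengthens the behavior;
    # removing a pleasant one weakens it.
    return ("negative reinforcement (strengthens)" if not stimulus_pleasant
            else "negative punishment (weakens)")

print(classify_contingency(True, True))    # food after a press
print(classify_contingency(False, False))  # shock turned off by a press
print(classify_contingency(False, True))   # shock after a press
print(classify_contingency(True, False))   # time out: play taken away
```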
Now, with operant conditioning I can also produce very, very complex behaviors. You've seen animals do tricks at circuses; very complex behaviors can be controlled using operant conditioning. But sometimes these behaviors, like a rat doing a backflip, are not going to occur naturally, so I have to produce them by shaping the behavior, the method of successive approximations.
Let's say I want to make the animal turn around completely. I might first give the animal food only when it turns away from the lever. Then I reinforce it only when it turns further away from the lever, toward the back of the cage. Then I require it to turn around far enough that, by the time I give it food, it completes the circle. Eventually I'll have the animal circling the box to get the food. I'm shaping a complex behavior by reinforcing successive approximations of the behavior I want.
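Here's a rough sketch of shaping by successive approximations (all numbers are assumed, not from the lecture): the criterion for food starts loose and is raised whenever the simulated animal's typical turn reliably meets it, pulling the behavior toward a full circle.

```python
import random

typical_turn = 20.0   # degrees the animal tends to turn at first (assumed)
criterion = 45.0      # current turn required for food (assumed)

for trial in range(5000):
    turn = random.gauss(typical_turn, 30.0)  # behavior varies trial to trial
    if turn >= criterion:
        # Reinforced turns pull the animal's typical turn toward them.
        typical_turn += 0.1 * (turn - typical_turn)
        # Successive approximation: once the tendency exceeds the current
        # criterion, demand a little more, up to a full 360-degree circle.
        if typical_turn > criterion and criterion < 360.0:
            criterion += 5.0

print(round(typical_turn), round(criterion))  # both end up near 360
```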
Now, I'd like you to Google "rats playing basketball" on your computer, and you'll see some films of complex behavior: rats going down the court and shooting hoops, behavior shaped by classroom kids learning about operant conditioning. So Google "rats playing basketball" and watch those films. It's a lot of fun.
I just want to mention one other thing about operant conditioning. Olds found that you can put an electrode down into the hypothalamus of the brain, an area below the cerebral cortex. The hypothalamus is known to control motivation, feeding behavior, drinking behavior, and pleasurable kinds of things. If I stimulate the hypothalamus, the animal will engage in operant behavior to get that stimulation; in fact, it's a very powerful reinforcer. If the animal is hungry and I have one lever that delivers food and one lever that delivers stimulation of the hypothalamus, the animal will press the lever for the stimulation for long periods of time, thousands of trials. It's a very powerful reinforcing event. So Olds suggested he was stimulating something like the pleasure center of the brain, producing very strong operant behavior.
Now, I want to talk a little bit about schedules of reinforcement, because delivering reinforcement on a particular schedule produces a particular strength of behavior. I think I mentioned on the pretest that if you compare reinforcing the animal on every trial with reinforcing it on only some of the trials, partial reinforcement will lead to stronger behavior than continuous reinforcement.
And I can schedule reinforcement in different ways. I can have ratio schedules, where I reinforce on the basis of how many responses are made. A fixed ratio, for example, might be that I reinforce every fourth response: every fourth press of the lever, the animal gets food. Or I could use a variable ratio, where on average every tenth response is reinforced; it might take eight responses this time and twelve responses next time. Variable schedules lead to a higher rate of responding than fixed schedules, and I'll show you that in a minute.
We can also have interval schedules, where what counts is not the number of responses but a response made after some interval of time has passed. On a fixed interval 10-second schedule, the first response made after a 10-second interval is reinforced. Or it could be variable: on average, the first response after some interval has passed is reinforced.
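As a sketch of how the four schedules decide when a response earns reinforcement (my own illustration; the lecture's values of four, ten, and ten seconds are used where they fit, and the function names are assumed):

```python
import random

def make_fixed_ratio(n):
    """Reinforce every nth response."""
    count = 0
    def schedule(responded, now):
        nonlocal count
        if responded:
            count += 1
            if count == n:
                count = 0
                return True
        return False
    return schedule

def make_variable_ratio(mean_n):
    """Reinforce after a random number of responses averaging mean_n."""
    count, needed = 0, random.randint(1, 2 * mean_n - 1)
    def schedule(responded, now):
        nonlocal count, needed
        if responded:
            count += 1
            if count >= needed:
                count, needed = 0, random.randint(1, 2 * mean_n - 1)
                return True
        return False
    return schedule

def make_fixed_interval(seconds):
    """Reinforce the first response after `seconds` have elapsed."""
    ready_at = seconds
    def schedule(responded, now):
        nonlocal ready_at
        if responded and now >= ready_at:
            ready_at = now + seconds
            return True
        return False
    return schedule

def make_variable_interval(mean_seconds):
    """Reinforce the first response after a random interval averaging mean_seconds."""
    ready_at = random.uniform(0, 2 * mean_seconds)
    def schedule(responded, now):
        nonlocal ready_at
        if responded and now >= ready_at:
            ready_at = now + random.uniform(0, 2 * mean_seconds)
            return True
        return False
    return schedule

fr4 = make_fixed_ratio(4)
print([fr4(True, t) for t in range(8)])  # True on the 4th and 8th response
```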
And again, variable schedules lead to a higher rate of responding than fixed schedules. To show you this, I'm going to use a cumulative record. A cumulative recorder is a pen on a piece of graph paper running through a roller. Every time the animal makes a response, the pen blips up one notch: the pen goes up one for a response and stays flat otherwise. If the animal responds rapidly, the slope of the line gets steeper. And every time the animal gets a reinforcer, that's marked on the recording.
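And a small sketch of the recorder itself (my own illustration): a cumulative record is just a running count of responses over time, so a faster response rate shows up as a steeper slope.

```python
def cumulative_record(responses):
    """responses: 0/1 per time step; returns the running count (pen height)."""
    record, height = [], 0
    for r in responses:
        height += r            # the pen steps up one notch per response
        record.append(height)  # and stays flat when there is no response
    return record

slow = [1, 0, 0, 1, 0, 0, 1, 0, 0]  # low rate -> shallow slope
fast = [1, 1, 1, 1, 1, 1, 1, 1, 1]  # high rate -> steep slope
print(cumulative_record(slow))  # [1, 1, 1, 2, 2, 2, 3, 3, 3]
print(cumulative_record(fast))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```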
Let's look at that for the different schedules of reinforcement: variable ratio, fixed ratio, variable interval, fixed interval. As you can see, the fixed schedules do not produce a steady, rapid rate of responding. On the fixed ratio, for example, the animal will often pause after reinforcement and then start responding again to reach the required ratio. On the variable ratio, the animal does not show those pauses and has a much higher rate of responding.
It's the same with the interval schedules. On a fixed interval schedule of, say, ten seconds, the animals start out at a very slow rate of responding and then gradually increase their rate up to the point at which the reinforcement will occur. On a variable interval schedule, you don't see those changes in the rate of responding; you have a continuously high rate. So variable schedules lead to a higher rate of responding than fixed schedules, and all of these partial schedules lead to more responding than reinforcing every trial. Thank you.