0:00

Hello, and welcome back to Introduction to Genetics and Evolution. In the previous two videos we talked about the effects of genetic drift over single generations and over multiple generations. Just to stress some of the points I raised there: a single generation of genetic drift is about equally likely to make alleles increase as decrease in frequency. Okay? Over multiple generations, genetic drift can be somewhat predictable, in that we expect the probability of eventual fixation, eventually getting to 100%, for any variable allele to be equal to the frequency of that allele in the population.

What we're gonna do now is look at the effects of genetic drift on the rate of neutral molecular evolution. This is something that's fundamentally important, and it will tie in with our next set of videos on molecular evolution. And again, are the long-term effects of mutations and genetic drift predictable? We've already talked about mutations arising at some rate, that there's some predictable rate at which new mutations can occur.

Â 0:58

Some parts of the genome will get mutations that have no effect on fitness. Or some mutations will arise in a part of the genome that can have an effect on fitness, but the mutation itself doesn't affect fitness. In these cases the mutations are referred to as neutral. They have no effect on fitness. After they arise, they can spread by genetic drift, or they can be lost by genetic drift.

Now, the question is: can we predict the rate at which they both arise and spread to fixation? This is ultimately what will lead to differences between species, right? A new mutation arises, it spreads in one lineage, gets to 100%. That makes that lineage different from other lineages because it has this unique variant there. So can we determine this rate at which they arise and spread? Well, the tricky part is that the ancient population sizes are unknown. So that makes it seem like this would be very challenging. Let's break this up into pieces.

Â 1:53

So mutations are arising, and let's say they arise at a rate which we will refer to as mu. This is the Greek letter mu (μ), and it can be measured, perhaps, as mutations per year or mutations per generation. We'll focus primarily on the per-year side of this. So let's imagine that the mutation rate is 1 times 10 to the minus 9 mutations per year per base pair studied. That's not a crazy estimate; that's about what you'd expect to see.

In larger populations, you're more likely to get the mutation simply because there are more alleles present, right? Every chromosome out there has some probability of getting the mutation. The more chromosomes you have, the more chances that the mutation will arise. So the rate of getting a new mutation in a population would be 2N mu. All right, so 2N is the number of chromosomes, because N is the population size, and 2 because it's diploid: every individual has two copies. And mu is that rate per chromosome. So it's the rate per year per base pair studied on an individual chromosome.

Â 2:52

Now, the mutation must also fix by genetic drift. It has to go from this rare starting frequency all the way up to 100%. So what is the probability of fixation of a new mutation in a diploid population? We talked about the probability of fixation of alleles by genetic drift, right? Well, let's put these two pieces together.

Â 3:14

The probability of a new mutation arising: 2N mu. The probability of a new mutation fixing will be equal to its starting frequency. The starting frequency of a new mutation will always be 1 over 2N. This is very important: 1 over 2N. Because this new mutation has just arisen in the population, the population is diploid, and there's only one copy of the new mutation. So it's one mutation in this population of 2N chromosomes.

Â 3:43

Right, so this is its starting frequency and, as we said before, by genetic drift alone this is the probability that it'll eventually fix. Right, the probability of fixation is equal to the allele frequency. So what we're saying is the probability of a new mutation arising times the probability of a new mutation fixing. When we put these things together we have a mathematical convenience: 2N mu times 1 over 2N, where we can cancel the 2N out, is equal to mu.
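Written out as an equation, the cancellation looks roughly like this. The symbol k is introduced only for this sketch (it is not used in the lecture) and stands for the rate at which new neutral mutations arise and go on to fix; N is the diploid population size and mu is the per-chromosome mutation rate.

```latex
\[
k \;=\; \underbrace{2N\mu}_{\text{new mutations arising}}
\times
\underbrace{\frac{1}{2N}}_{\text{probability each one fixes}}
\;=\; \mu
\]
```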

[LAUGH] So this is really cool, because large populations have more chance a mutation will arise but a smaller chance that it'll fix by genetic drift, because the allele frequency at the start is so much lower. In contrast, smaller populations have a lower chance the mutation will arise but a higher chance it'll fix, because that starting allele frequency is higher. Because of this amazing cancelling out, the rate of neutral molecular evolution does not depend on population size.
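One way to check the cancelling out for yourself is a small simulation. This is a minimal sketch, not something from the lecture: it assumes a simple Wright-Fisher style model (each generation, 2N chromosomes are resampled at random from the previous one), and the function name, population sizes, and replicate count are all made up for illustration. The simulated fixation probability should land near 1 over 2N, so multiplying it by 2N mu should land near mu whatever N is.

```python
import random


def fixation_fraction(pop_size, replicates, seed=1):
    """Fraction of new neutral mutations that drift to fixation.

    Each replicate starts with one mutant copy among 2N chromosomes and
    resamples 2N chromosomes per generation until loss or fixation.
    """
    rng = random.Random(seed)
    chromosomes = 2 * pop_size
    fixed = 0
    for _ in range(replicates):
        copies = 1  # a brand-new mutation: frequency 1/(2N)
        while 0 < copies < chromosomes:
            freq = copies / chromosomes
            # Next generation: each chromosome is the mutant with probability freq.
            copies = sum(rng.random() < freq for _ in range(chromosomes))
        if copies == chromosomes:
            fixed += 1
    return fixed / replicates


if __name__ == "__main__":
    mu = 1e-9  # mutations per base pair per year, as in the lecture
    for n in (5, 10, 20):
        p_fix = fixation_fraction(n, replicates=5000)
        # Columns: N, expected 1/(2N), simulated fixation fraction,
        # and arising rate (2N * mu) times the simulated fraction.
        print(n, 1 / (2 * n), p_fix, 2 * n * mu * p_fix)
```

The last column should stay close to mu for every N, up to sampling noise; more replicates tighten the estimate at the cost of run time.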

This was first described by Motoo Kimura; his picture is shown here. So how can we use this calculation?

Â 4:45

Well, here's an application for it. Let's say that we know the mutation rate of a particular region. Let's say we know the mutation rate for human pseudogenes is roughly 1 times 10 to the minus 9 mutations per year per base pair. Okay? So let's say we want to know the divergence time between humans and mouse lemurs. There's an interesting picture of a mouse lemur over here. So what we do is we sequence a particular pseudogene. A pseudogene, by the way, is a gene that is no longer functional, so it's assumed that mutations that arise in it are going to be neutral. They're not going to have any effect on fitness.

You sequence the pseudogene and you find 150 base differences in 1,000 base pairs between the human and the mouse lemur, okay? This is not unusual. You expect several DNA sequence differences between humans and mouse lemurs. We're not that closely related, but we can use this to determine how far back humans and mouse lemurs shared a common ancestor, and we'll show you how we do this.

Â 5:39

So again we have this rate of 1 times 10 to the minus 9 mutations per base pair per year. Now in this case we said we're looking at a thousand base pairs, so our probability of getting mutations is higher. It'll be a thousand times more. So we can say 1 times 10 to the minus 6 mutations in a thousand base pairs per year; I just multiplied 10 to the minus 9 by 1,000. And what you can do is basically just invert this. Okay? So we can say that for every 1 mutation in 1,000 base pairs, it's been about 10 to the 6th years. I just inverted the numbers up here.

Â 6:17

So, we saw 150 mutations, so 150 mutations times 10 to the 6th years per mutation. That comes out to 1.5 times 10 to the 8th years total divergence. This seems like it should be the right answer, right? Because this is how long it should have taken for us to get these 150 mutations. The problem is there are two branches. Here's our common ancestor, with time running from today back to long ago.

Â 6:44

So we have this change over time. We have these 150 mutations that distinguish us. Now some of these mutations are on this lineage, but some of them are also on this lineage. So when we're saying this 1.5 times 10 to the 8th years, we're actually summing both of these things together. So what we need to do is divide by two. We take 1.5 times 10 to the 8th years divided by two, and that becomes 7.5 times 10 to the 7th years. So the time to the ancestor will be 75 million years ago. So "long ago," we can say, is roughly 75 million years ago. Okay. Take a second just to look that over, then I'll give you one to try on your own.

So you start with this mutations per base pair per year; that was a given. We then looked at how big a sequence we're looking at, 1,000 base pairs. We flipped this number around, basically from 1 times 10 to the minus 6 to 10 to the 6th, and basically just changed it so years was in the numerator and one mutation in the denominator; that made it 10 to the 6th. So for every one mutation, I have to wait 10 to the 6th years. We have 150 mutations, so we multiply that by 10 to the 6th, which is where we get 1.5 times 10 to the 8th. Okay, divide by two because mutations are arising along both lineages. We're not looking at base differences between humans and the ancestor; we're looking at base differences between humans and mouse lemurs. That's why we have to divide by two. And therefore we get 75 million years.
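The same steps can be sketched as a few lines of Python. This is just an illustration under the lecture's simplifications (neutral sites, a constant mutation rate, no repeated hits at the same site); the function and parameter names are made up for this sketch.

```python
def divergence_time_years(differences, sequence_length_bp, mu_per_bp_per_year):
    """Estimate years since two lineages shared a common ancestor.

    Assumes neutral sites, a constant mutation rate, and no repeated
    mutations at the same site (the lecture's simplifications).
    """
    # Expected neutral substitutions per year along ONE lineage, whole region.
    rate_per_region = mu_per_bp_per_year * sequence_length_bp
    # Years needed to accumulate the differences, summed over both lineages.
    total_branch_years = differences / rate_per_region
    # Differences accumulate on both lineages, so halve the total.
    return total_branch_years / 2


if __name__ == "__main__":
    # Human vs. mouse lemur numbers from the lecture.
    t = divergence_time_years(150, 1_000, 1e-9)
    print(f"about {t / 1e6:.0f} million years")
```

Keeping the divide-by-two as its own step mirrors the argument: the inverted rate gives the total time along both branches, and only half of that is the time back to the ancestor.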

Here's one for you to try. What is the time to the common ancestor for human and tamarin? Well, let's assume the same mutation rate there. Let's say in this case you screen 10,000 base pairs of sequence, okay, just so you're not using exactly the same numbers. Let's say you find 860 mutations. What would the divergence time be?

Â 8:32

So this is just filling in those same things. We said that we're looking at 10,000 base pairs; that's the same as 10 to the 4th base pairs. So, 10 to the 4th times 1 times 10 to the minus 9, that would be 1 times 10 to the minus 5 mutations in 10,000 base pairs per year. So, all you do is multiply 10 to the minus 9 by 10,000. Then I invert this whole thing so I have years in the numerator and mutations in the denominator, so I have to wait 10 to the 5th years for every 1 mutation in these 10,000 base pairs. Okay, so this is a longer stretch. That's why we don't have to wait as long for it. We have 860 mutations, so multiply these two together, and we get 8.6 times 10 to the 7th, or 86 million years. And, again, this 86 million is reflecting what's happened in this branch and what's happened in this branch, so we divide by two. So the time to our common ancestor would be half that, or 4.3 times 10 to the 7th, or 43 million years ago. 43 million years ago is when humans and tamarins may have shared a common ancestor.
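Here is the tamarin calculation as a single worked equation. The symbols t (time to the common ancestor), d (number of differences), and L (sequence length in base pairs) are shorthand introduced for this sketch, not notation from the lecture.

```latex
\[
t \;=\; \frac{1}{2}\cdot\frac{d}{\mu L}
  \;=\; \frac{1}{2}\cdot
        \frac{860}{\left(1\times10^{-9}\ \text{yr}^{-1}\,\text{bp}^{-1}\right)\left(10^{4}\ \text{bp}\right)}
  \;=\; \frac{8.6\times10^{7}\ \text{yr}}{2}
  \;=\; 4.3\times10^{7}\ \text{yr}
\]
```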

Â 9:34

Now, several people have told me they're very interested in these divergence time estimates and in calculating them from molecular data. They're interested in looking up some published divergence time estimates. I refer you to this website, timetree.org. They also have a free iPhone app where you can just type in your favorite two species and see what the estimated divergence time is between them. So take a look at that when you have the chance. And I'd like to do a little segue into what's going to be coming up in the next set of videos.

Â 10:00

Now, nucleotide variation exists within species and between species. So let's say, for example, you sequence a stretch from four individuals of species 1 and four individuals of species 2. So it's maybe human and tamarin, for example. There are some bases, like for example base number one here, where every individual from species 1 differs from every individual from species 2, right? All individuals from species 1 have C. All individuals from species 2 have G. So this is a fixed difference. Okay? We may have cases where one species is variable and the other is not. That's what we see with both bases two and three. In this case, this particular site labeled number two is variable in species 1, whereas it is invariant in species 2. This one right here is variable in species 2 but invariant in species 1. It's variable, but there's only one rare variant here, at least among these individuals.
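To make those three kinds of sites concrete, here is a small sketch that labels each column of an alignment. The sequences are hypothetical: only site 1 (all C in species 1, all G in species 2) uses bases named in the lecture, and the function name and labels are made up for illustration.

```python
def classify_site(species1_bases, species2_bases):
    """Label one alignment column by where its variation lies."""
    alleles1, alleles2 = set(species1_bases), set(species2_bases)
    if len(alleles1) == 1 and len(alleles2) == 1:
        return "fixed difference" if alleles1 != alleles2 else "invariant"
    if len(alleles1) > 1 and len(alleles2) > 1:
        return "variable in both species"
    return "variable in species 1" if len(alleles1) > 1 else "variable in species 2"


if __name__ == "__main__":
    # Hypothetical sequences, four individuals per species. Only site 1
    # (all C vs. all G) uses bases named in the lecture; the rest are made up.
    species1 = ["CAT", "CAT", "CGT", "CGT"]
    species2 = ["GAT", "GAT", "GAT", "GAA"]
    for site in range(3):
        column1 = [seq[site] for seq in species1]
        column2 = [seq[site] for seq in species2]
        print(site + 1, classify_site(column1, column2))
    # prints:
    # 1 fixed difference
    # 2 variable in species 1
    # 3 variable in species 2
```

Site 3 is the "one rare variant" case: only a single individual in species 2 carries the A, so with a different sample of four it could easily have looked invariant.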

Â 10:50

Now, a big question out there: we see this variation within species, and we see variation between species. So this one would be between species; these would be within species. The question is, where does this come from? Some mutations are advantageous, and we expect those to spread within species, and they could spread fairly quickly. Many mutations are bad. And even if they are bad, they may still be found in the population for a short period of time. We saw that genetic drift in particular can allow some bad mutations to stick around for quite a while.

So, the question is, how much of the genome actually evolves solely by mutation and genetic drift, in this purely neutral fashion? Right. How much of this is actually being affected by natural selection? There are two schools of thought, sort of, that have been around since the 1960s or so. One is the Neutralist school of thought, and that is that most of the nucleotide variation that you see present within species tends to be neutral. In contrast, Selectionists suggest that very little nucleotide variation is neutral. If you see multiple variants, it could be something like, for example, overdominance. Or it could be that particular variants are selected in this population and other variants are selected in that population, so both types stick around.

How much of the variation that's out there is actually selected? How much of it is neutral? This is a very big question, and it's not something to which there are very clear answers just yet. We'll come back to this when we start looking at patterns of molecular evolution. Hope you'll join us.
