And then if we deal with polyphonic signals, for

example this fragment of the Carnatic piece, let's listen to that again.

[MUSIC]

There are several sound sources but the voice is the most prominent one and

detecting the fundamental frequency of the voice in the time domain is

close to impossible.

In the spectrum, well, it's not easy either but we'll see that there are some

algorithms that attempt to identify this prominent voice,

this harmonic component in the frequency domain, and they do a pretty decent job.

All right, so to detect the fundamental frequency in the time domain,

we basically have to identify the length of its repeating periodic cycle.

And the autocorrelation function is a mathematical tool for

finding repeating patterns.

It is the cross-correlation of a signal with itself, and informally, we could say

that it's the similarity between samples as a function of a time lag between them.

So in this equation, we see a version of the autocorrelation function that has some

tapering, and what we do is we compute this function for every lag time.

So we try different lag times, where l is an integer value,

a number of samples, so we start with l equals zero.

And then we sum, over a particular period of time,

a fragment of the sound multiplied by the same fragment delayed by that lag time.

Of course if we delay by zero it's the same signal and

if we delay with different lags, the multiplication will be different.

So we get a function of l, and

therefore we'll be measuring how correlated a fragment of a sound is with

the same samples delayed by a certain l.
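
As a rough sketch of what we just described, and not necessarily the exact equation on the slide, one way to write it in Python is shown here. It assumes the tapered form r[l] = sum over n from 0 to N-1-l of x[n] times x[n+l], so fewer products are summed as the lag grows. The function name `autocorrelation`, the test sine wave, and the sampling rate are assumptions made only for this illustration.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Assumed form: r[l] = sum_{n=0}^{N-1-l} x[n] * x[n+l], for l = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    r = np.zeros(max_lag + 1)
    for l in range(max_lag + 1):
        # Multiply the fragment by itself delayed by l samples and sum.
        # The number of products shrinks with l, which gives the tapering.
        r[l] = np.dot(x[:N - l], x[l:])
    return r

# Illustrative input: a 500 Hz sine sampled at 8000 Hz (period = 16 samples).
fs = 8000
n = np.arange(400)
x = np.sin(2 * np.pi * 500 * n / fs)

r = autocorrelation(x, max_lag=40)
r_norm = r / r[0]  # normalize so that the value at lag zero is one
```

Dividing by the value at lag zero is what makes the function equal to one when there is no delay, since the signal is then completely correlated with itself.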

So let's look at a particular example.

So this is the oboe sound again, and below is the autocorrelation function, in

which we clearly see that, of course, at lag zero the value is one.

It's completely correlated.

Then the lag increases, and here we have expressed lag time in seconds instead

of samples to make it easier to relate to the top signal.

And clearly we see that at a lag corresponding to one period,

which is this 0.002 seconds, there is a local maximum.
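
To make this idea concrete, here is a minimal sketch of how that local maximum could be turned into a fundamental frequency estimate. It uses a synthetic harmonic tone with a period of 0.002 seconds, not the oboe recording, and the function name `estimate_f0_autocorr` and the search range given by `f_min` and `f_max` are assumptions for this illustration rather than part of any particular library.

```python
import numpy as np

def estimate_f0_autocorr(x, fs, f_min=100.0, f_max=1000.0):
    """Pick the lag with the largest autocorrelation inside an assumed f0 range."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    min_lag = int(fs / f_max)   # shortest period we are willing to accept
    max_lag = int(fs / f_min)   # longest period we are willing to accept
    lags = np.arange(min_lag, max_lag + 1)
    # Tapered autocorrelation, evaluated only at the candidate lags.
    r = np.array([np.dot(x[:N - l], x[l:]) for l in lags])
    best_lag = int(lags[np.argmax(r)])
    return fs / best_lag, best_lag

# Synthetic tone: 500 Hz fundamental plus two harmonics, sampled at 8000 Hz.
fs = 8000
t = np.arange(1024) / fs
x = sum(np.sin(2 * np.pi * 500 * k * t) / k for k in (1, 2, 3))

f0, best_lag = estimate_f0_autocorr(x, fs)
period_seconds = best_lag / fs  # lag expressed in seconds instead of samples
```

Excluding lag zero and restricting the search to a plausible range matters, because the largest value overall is always at lag zero and further maxima appear at multiples of the period.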