0:05

This week, we're talking about how to model sounds by combining the sinusoidal or harmonic approach that we talked about in the previous weeks with the idea of a residual or stochastic component. The core aspect of that is how to subtract the sinusoids of a sound from the actual sound, and this is what we are going to be talking about in this lecture, from a programming perspective. So we are going to be implementing the subtraction of sinusoids from a signal, and we're going to do it through the hprModel, the implementation of the model that we have within the sms-tools package.

So basically, we're going to be talking about this block diagram, in which we start from the input sound and all the analysis that we have already been talking about: the computing of the spectrum, the detection of the peaks and, in the case of the harmonic model, the detection of the fundamental frequency and the identification of the harmonics. Then we synthesize these harmonics in the spectral domain, so we're synthesizing the lobes of the harmonics.

This week, what we are focusing on is the subtraction, so how these sinusoids can be subtracted from the original signal. What we do is compute again the spectrum of the input signal with the same parameters, so with the same window that is implicit, or in fact quite explicit, in the spectral domain of these sinusoids, so that we can then subtract them. This residual spectrum can then be summed with the harmonic spectrum to obtain the time-domain synthesized signal by combining the two.

2:30

So first we import all the packages that we need for this analysis, okay? We have already seen all of these packages. Then we read a sound file, so we're going to start from the oboe sound on the A4, and we need to specify some parameters. In particular, we're going to be analyzing only one frame, so we choose the location, the pointer where we're going to be reading the sound from, which is sample 1,000.

3:09

And then, well, we need the window size M, and we take 801 samples. We need an FFT size bigger than that, so we take 2,048. We need the threshold for identifying the peaks and, since we're going to be doing harmonic analysis, we need the minimum and maximum frequencies between which to look for the fundamental frequency. Then we need an error threshold to bound the error accepted by the algorithm that we are using for f0 detection, the two-way mismatch error function. Then we decide the maximum number of harmonics that we are going to be identifying. And this is the deviation slope that we have chosen, so that the higher harmonics are allowed a higher degree of deviation than the lower ones, okay. So these are the parameters, and now we start computing things.
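The parameters described above could be gathered in one place as in the following sketch. Only the window size (801), the FFT size (2,048), and the frame location (1,000, as given in this transcript) come from the lecture. The class, the field names, and every other value are illustrative assumptions, not the settings of the lecture script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarmonicAnalysisParams:
    """Hypothetical container for the single-frame analysis settings."""
    pin: int = 1000              # frame location in samples (as given in the transcript)
    M: int = 801                 # analysis window size (from the lecture)
    N: int = 2048                # FFT size, at least M (from the lecture)
    t: float = -80.0             # peak threshold in dB (assumed value)
    minf0: float = 300.0         # lowest f0 candidate in Hz (assumed value)
    maxf0: float = 500.0         # highest f0 candidate in Hz (assumed value)
    f0et: float = 5.0            # error threshold for f0 detection (assumed value)
    nH: int = 60                 # maximum number of harmonics (assumed value)
    harmDevSlope: float = 0.001  # extra deviation allowed as frequency grows (assumed)

    def validate(self) -> None:
        # This sketch expects an odd window so that it has a single center sample.
        if self.M % 2 == 0:
            raise ValueError("this sketch expects an odd window size")
        if self.N < self.M:
            raise ValueError("FFT size must be at least the window size")
        if self.N & (self.N - 1):
            raise ValueError("FFT size should be a power of two")
        if not 0 < self.minf0 < self.maxf0:
            raise ValueError("need 0 < minf0 < maxf0")


params = HarmonicAnalysisParams()
params.validate()
```

Checking the sizes up front catches the two mistakes that matter most later on: an FFT smaller than the window, and an even window that has no center sample to align on.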

So first, we compute the analysis window, and we take the Blackman window.

4:19

Then, since we're going to be doing this subtraction, the phase information is fundamental. We need to do zero-phase windowing. This is where the idea of centering everything around zero is fundamental, so that the time-domain waveforms align perfectly, the phases match, and therefore the subtraction is possible. Here the odd-size window is again fundamental, and that's why we need to compute the center of the window in this way.
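The centering described above can be sketched in plain Python as follows. The function names and the tiny 5-sample frame are assumptions made for illustration, and the naive DFT stands in for a real FFT routine only to keep the example free of dependencies.

```python
import cmath
import math


def zero_phase_buffer(frame, N):
    """Place an odd-length frame in an N-point buffer with its center at index 0."""
    M = len(frame)
    if M % 2 == 0 or M > N:
        raise ValueError("expects an odd frame length no larger than N")
    hM1 = (M + 1) // 2            # center sample plus the right half
    hM2 = M // 2                  # left half
    buf = [0.0] * N
    buf[:hM1] = frame[hM2:]       # center and right half go to the start
    buf[N - hM2:] = frame[:hM2]   # left half wraps around to the end
    return buf


def dft(x):
    """Naive O(N^2) DFT, enough for a very short illustrative frame."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


frame = [1.0, 2.0, 3.0, 2.0, 1.0]     # symmetric, odd length, center value 3.0
buf = zero_phase_buffer(frame, 8)
X = dft(buf)
largest_imag = max(abs(v.imag) for v in X)
```

With the center sample at index 0, a symmetric frame gives a spectrum whose imaginary part is zero up to rounding, so the window itself adds no phase. That is the property the subtraction relies on.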

Okay, then we choose the fragment of the sound, and here is the pointer. So we choose the center of the window, and we take half of the samples from the left of this pointer and half of the samples from the right of this pointer. Okay, so this is going to be the sound that we analyze, and then we have seen all this before. We compute the DFT of this fragment of the sound, we find the peaks, we refine the peaks with parabolic interpolation, and here we convert the locations to hertz. And now we detect the fundamental frequency.
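The parabolic interpolation and the conversion to hertz mentioned above can be sketched like this. The function names are hypothetical, and the test spectrum is an exact parabola chosen so that the result is easy to check by hand.

```python
def interpolate_peak(mX, k):
    """Refine a local maximum at bin k of a dB magnitude spectrum with a parabola."""
    left, mid, right = mX[k - 1], mX[k], mX[k + 1]
    denom = left - 2.0 * mid + right
    if denom == 0.0:
        return float(k), mid                   # flat top: nothing to refine
    offset = 0.5 * (left - right) / denom      # vertex position relative to bin k
    loc = k + offset
    mag = mid - 0.25 * (left - right) * offset
    return loc, mag


def bin_to_hz(loc, fs, N):
    """Convert a (possibly fractional) bin location to a frequency in hertz."""
    return fs * loc / N


# Samples of a parabola whose true maximum is -10 dB at bin 3.3.
mX = [-2.0 * (x - 3.3) ** 2 - 10.0 for x in range(7)]
loc, mag = interpolate_peak(mX, 3)
freq = bin_to_hz(loc, 44100, 2048)
```

Because the magnitudes here lie exactly on a parabola, the refined location and magnitude recover the true maximum. On a real spectrum the result is an approximation whose quality depends on the window and on the zero padding.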

Okay, so we call the two-way mismatch algorithm, and it returns the best candidate for the fundamental frequency. And we identify all the harmonics by calling the harmonic detection function, which looks for the peaks that are closest to the multiples of the fundamental frequency, okay.
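A simplified version of that harmonic detection step might look like the sketch below. It is not the sms-tools function: the name, the signature, and the acceptance rule (a third of f0 plus a slope term that grows with frequency) are assumptions chosen to illustrate the idea of a deviation slope.

```python
def detect_harmonics(peak_freqs, f0, max_harmonics, dev_slope=0.01):
    """Pick, for each multiple of f0, the closest peak if it is close enough.

    Returns one entry per harmonic; 0.0 marks a harmonic that was not found.
    """
    harmonics = [0.0] * max_harmonics
    if f0 <= 0 or not peak_freqs:
        return harmonics
    for h in range(1, max_harmonics + 1):
        target = h * f0
        closest = min(peak_freqs, key=lambda f: abs(f - target))
        # The allowed deviation grows with frequency, so higher harmonics get more slack.
        allowed = f0 / 3.0 + dev_slope * closest
        if abs(closest - target) < allowed:
            harmonics[h - 1] = closest
    return harmonics


peaks = [443.0, 700.0, 887.5, 1330.0, 2500.0]   # made-up peak frequencies in Hz
hfreq_demo = detect_harmonics(peaks, f0=443.0, max_harmonics=4)
```

In this made-up example the peak at 700 Hz is never the closest to a multiple of 443 Hz, so it is ignored, and no peak lies near the fourth multiple, so that harmonic is left empty.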

And now we can synthesize these harmonics that we have identified, and we synthesize them with a different FFT size and window than the ones we used in the original analysis. So now we take an FFT size of 512, half of that is 256, and we're going to be synthesizing the spectral sines, so the main lobe of the Blackman-Harris window, okay. And this Yh is the complete spectrum of the harmonic component.

6:40

And now, this is what we are basically focusing on this week: we have to subtract these harmonics from the original signal. But to do that, we need to recompute the spectrum of the original signal, so that we use the Blackman-Harris window that is in this harmonic spectrum.

7:20

We compute the spectrum of the fragment of the sound, but only of the 512 samples around the pointer. So here we take another fragment of the original signal with these 512 samples and multiply it by the normalized Blackman-Harris window, so that it becomes easier to do the subtraction.
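The windowing of the 512-sample fragment could be sketched as follows. The window is the standard four-term Blackman-Harris window in its symmetric form. The function names and the exact way the fragment is sliced around the pointer are assumptions and may differ from the lecture script.

```python
import math


def blackman_harris(L):
    """Four-term Blackman-Harris window of length L (symmetric form)."""
    a = (0.35875, 0.48829, 0.14128, 0.01168)
    return [
        a[0]
        - a[1] * math.cos(2 * math.pi * n / (L - 1))
        + a[2] * math.cos(4 * math.pi * n / (L - 1))
        - a[3] * math.cos(6 * math.pi * n / (L - 1))
        for n in range(L)
    ]


def windowed_fragment(x, pin, Ns):
    """Take Ns samples around pin and weight them by a sum-normalized window."""
    hNs = Ns // 2
    if pin - hNs < 0 or pin + hNs > len(x):
        raise ValueError("fragment falls outside the signal")
    w = blackman_harris(Ns)
    total = sum(w)
    return [x[pin - hNs + n] * w[n] / total for n in range(Ns)]


x_demo = [1.0] * 2000                        # constant signal, just to check the scaling
frag = windowed_fragment(x_demo, pin=1000, Ns=512)
```

Dividing the window by its sum makes a constant signal of amplitude 1 produce a value of 1 at the zero-frequency bin, so the magnitudes of the recomputed spectrum are on a known scale when they are compared with the synthesized lobes.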

7:50

Okay, so this is our new input signal, and then we have to zero-phase window it. So we have to put it around zero, and we do that by defining the FFT buffer we're going to use and centering everything around zero, and then we can compute the spectrum of that. The subtraction of the sinusoids then becomes easy, it's just a complex subtraction. We subtract the harmonic spectrum from this new spectrum that we just computed.
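The complex subtraction itself can be illustrated with a small dependency-free sketch. Note the substitution: instead of synthesizing the window main lobe directly in the spectral domain, as the lecture does, the sinusoid here is generated in the time domain and passed through the same window, which yields the matching spectrum. A Hann window, a 64-point frame, and a naive DFT are used only for brevity, and all signal values are made up.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) DFT, enough for a short illustrative frame."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


N = 64
window = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]  # Hann
norm = sum(window)
window = [w / norm for w in window]

# "Original" frame: one strong sinusoid plus a weak component standing in for the residual.
sine = [0.8 * math.cos(2 * math.pi * 5.3 * n / N + 0.4) for n in range(N)]
other = [0.01 * math.cos(2 * math.pi * 20.0 * n / N) for n in range(N)]
x = [s + o for s, o in zip(sine, other)]

X = dft([v * w for v, w in zip(x, window)])       # spectrum of the input frame
Yh = dft([v * w for v, w in zip(sine, window)])   # sinusoid seen through the same window
Xr = [a - b for a, b in zip(X, Yh)]               # complex subtraction, bin by bin

peak_bin = max(range(N // 2), key=lambda k: abs(X[k]))
```

Because both spectra share the same window and the same alignment, the strong sinusoid cancels at its peak bin and only the weak component is left in Xr. If the sinusoid were synthesized with a different window or with a shifted frame, the magnitudes or the phases would no longer match and a large remainder would survive the subtraction.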

So okay, let's run this, and let's step through the different variables that we have been generating. This is the file called test three.

8:39

Okay, so this has computed it, and now let's keep looking at the different variables. For example, let's plot the x1 sound, okay. This is the fragment of the sound we are analyzing. Then we can plot the resulting spectrum that we computed from that, so we can plot mX, okay? So this is, in dB, the magnitude spectrum of that.

9:10

Then it has computed the peaks out of that, so we can just print the locations of the peaks. So iploc is the locations of the peaks that it has found within the array of the FFT. It's better to show them in hertz, so we print ipfreq, which is the frequencies of the peaks it has found, right? Then this has gone to the f0 detection, so it has identified the fundamental frequency. The fundamental frequency has been chosen to be 443 hertz, which makes sense, since we're analyzing an oboe sound, an A4. And then, out of this fundamental frequency, it has chosen the peaks that are harmonics of it, so hfreq is the set of harmonics that it has identified at this particular location. So it has identified all these harmonics, and the other peaks have not been considered harmonics. Of course, we have chosen a threshold and a given set of parameters that have limited the harmonics to these, up to around 6,000 hertz, so it becomes easier, and we have just analyzed up to this harmonic.

Okay, then we generate these harmonics as a spectrum, so it has generated Yh, and we plot the absolute value of Yh. We're going to see the magnitude of the complete spectrum, but let's plot it in dB, so we will just plot(20*log10(abs(Yh[:70]))). So let's take only, let's say, the first 70 samples, from the beginning to sample 70, so we focus on the first harmonics, which are basically the ones that we have generated, okay.
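The dB conversion used in that plot call can be sketched without a plotting library as follows. The function name, the floor value, and the four demo bins are assumptions. The floor is there because a synthesized spectrum is zero between the lobes, and the logarithm of zero is undefined.

```python
import math


def to_db(spectrum, floor_db=-200.0):
    """Magnitude of complex bins in dB, clamped so empty bins do not hit log10(0)."""
    eps = 10.0 ** (floor_db / 20.0)
    return [20.0 * math.log10(max(abs(v), eps)) for v in spectrum]


Yh_demo = [1.0 + 0.0j, 0.1j, 0.0j, -0.5 + 0.0j]   # made-up complex bins
mYh_demo = to_db(Yh_demo)[:3]                      # keep only the first bins, as in the plot
```

Slicing after the conversion, as here, or before it, as in the plot call above, gives the same values. The slice only limits what is shown.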

So these are the harmonics, so basically we have the synthesized spectrum. On top of that, we can plot the spectrum that we have recomputed from the original sound, so the X2 spectrum. Okay, so we plot on top of that the absolute value of X2.

12:04

Here we now see the green line, which is basically the original spectrum, and as you can see, it's very similar. That means that we have synthesized a spectrum very close to the original one. And then we can now plot the subtracted spectrum, so the subtracted one is the Xr spectrum, and it is this red line. Okay, so the red line is the subtraction of the two, and it clearly shows the residual that is left. Okay, so this works, and basically we do that at every frame, and then of course we can do the inverse to generate all these signals back.

12:51

Now let me show you the actual code in the sms-tools package that performs this harmonic plus residual modeling. So there is this file called hprModel.py within the models directory, and in it there is the analysis and synthesis of the HPR model. There is one function that does the analysis, another one that performs the synthesis, and then there is another one that does both the analysis and the synthesis one frame at a time.

13:27

In fact, we very much recommend doing the analysis and synthesis separately, so that we can take advantage of cleaning the trajectories and having some memory in the tracking, so that the harmonics are better. The analysis function basically calls two functions. One we have already seen: it does the harmonic analysis in the same way that we saw when we talked about the harmonic model. The new thing that it does is the subtraction. So there is this function, sineSubtraction, that has as input the harmonics identified by the harmonic model. It also has as input the input sound again, and it subtracts the harmonics from this input sound.

This function is in the utilFunctions file, and in there you will find the sineSubtraction function, and it basically does what I explained just a while ago. It goes through the sound, and at every frame it does the subtraction of the sinusoids, so the sinusoids have already been identified. So here it iterates over all the analyzed frames, and it reads again a fragment of the input sound. It recomputes the spectrum of the input sound with the Blackman-Harris window, and then it synthesizes the harmonics with the Blackman-Harris main lobe.

15:19

And then it's able to subtract the harmonics from the input sound, and that's all. Then it synthesizes the residual signal back and takes care of the windowing, so we can do the overlap-add correctly.
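The overlap-add step mentioned above rests on one property: the windows of successive frames have to sum to a constant. The sketch below shows only that principle, with a triangular window at a hop of half its length. The function names, the sizes, and the choice of window are assumptions, and the synthesis window used in sms-tools may well be different.

```python
def overlap_add(frames, hop):
    """Sum equally sized frames into one signal, each shifted by hop samples."""
    size = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + size)
    for i, frame in enumerate(frames):
        start = i * hop
        for n, v in enumerate(frame):
            out[start + n] += v
    return out


def triangular(size):
    """Triangular window whose shifted copies sum to 1 at a hop of size // 2."""
    half = size // 2
    return [n / half if n < half else (size - n) / half for n in range(size)]


size, hop = 8, 4
win = triangular(size)
frames = [[1.0 * w for w in win] for _ in range(5)]   # a constant signal, windowed
y_demo = overlap_add(frames, hop)
```

Away from the first and last half frame, the overlapped windows add up to exactly one, so a constant input comes back as a constant output. If the windows did not sum to a constant, the reconstructed residual would carry an amplitude modulation at the frame rate.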

And that's all, that's what the sinusoidal subtraction does, and of course it returns the residual signal. The output of the analysis is the harmonic frequencies, magnitudes, and phases, and this residual signal.

16:22

Then we have this other file, hprModel_function.py, which is the file that puts it all together, and it is in fact the file that is called from the interface that we have been using. It's simple: there is one function, main, that does the analysis and synthesis of a sound and plots the intermediate values. So it calls hprModelAnal, okay, and then it computes the spectrogram of the residual so that we can show it as a spectrogram. Then it performs the synthesis and generates the overall output sound, the sum of the residual plus the harmonics, and also just the harmonics, and it writes the files into the same directory.

17:21

Okay, so now we can execute this file. We run this hprModel_function file, and it will analyze a fragment of a saxophone sound, which is the default sound for the HPR model. So here we see the harmonics of the sound, and the background is the spectrogram of the residual. So we have computed the spectrogram of the residual, and here are the two together, and of course we can listen to them. In this directory it has saved the residual, the sines, and the sum of the two, so we can play, for example, first the residual.

[SOUND]

Okay. And then we can play, for example, the sines.

[MUSIC]

And we can play the sum of the two.

[MUSIC]

Okay, and that's all. Basically, we have gone through the harmonic plus residual model. We have used quite a bit of code from sms-tools and from Python to be able to subtract the harmonics from the sound and obtain the residual.

18:48

And hopefully that has given you a programming perspective on this idea of the harmonic plus residual model. It's not a very sophisticated thing from a signal processing point of view, but from a programming point of view it's quite delicate. We have to be very careful in how we place the signals and how we analyze them so that they align perfectly, so that the complex subtraction in the spectral domain is done correctly and we obtain a real residual. And that's all.

So in the next class we're going to take the next step, which will be to look at the stochastic component from a programming perspective: how we can convert this residual that we just computed into a stochastic component and then have the harmonic plus stochastic model. So see you next lecture, bye bye.