0:00

Hello, welcome to the course on Signal Processing for Music Applications.

This week we introduce the concept of stochastic signals and

a way to model this type of signal using spectral approximation.

So in this demonstration class, I want to use an implementation of this

model from the sms-tools package and see how it works.

0:43

very noisy, so that would be quite good for this type of approximation.

In terms of parameters, the defaults are okay: Hamming window,

window size 1024, hop size 512. Let's first listen to the sound.

[SOUND] And now let's compute the STFT.

Okay, so here we see the spectrum

of this ocean sound, and clearly we see

the kinds of things we expected,

in the sense that the magnitude spectrum is very granular and

there is very little repetitive structure.

The only kind of structure we can see is this overall shape

1:44

being shown here as this increase of this red area.

And also we see that the higher frequencies are softer than

the lower frequencies, so there is much more emphasis on the low frequencies.

And in the phase spectrum we basically see random numbers; there is nothing

particular here that we can see in terms of a given structure.

And if we zoom into the phase spectrum,

it definitely corroborates this idea that these are basically random numbers.

If we do it on the magnitude spectrogram, maybe in one of the areas

that is more stable, it's also very noisy, although

of course here there are some more overall trends that we can see.
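To make the "random phase" observation concrete, here is a minimal sketch using NumPy and SciPy rather than the sms-tools code. It assumes white noise as a stand-in for the ocean recording (so it will not show the low-frequency emphasis seen in the demo), a sampling rate of 44100 Hz, and the Hamming window of 1024 samples with hop 512 mentioned above.

```python
import numpy as np
from scipy.signal import stft

fs = 44100                               # assumed sampling rate
rng = np.random.default_rng(0)
x = rng.standard_normal(fs)              # one second of white noise (stand-in signal)

# Hamming window of 1024 samples, hop of 512 samples (noverlap = window - hop)
f, t, X = stft(x, fs=fs, window="hamming", nperseg=1024, noverlap=1024 - 512)

mX = 20 * np.log10(np.abs(X) + 1e-12)    # magnitude spectrogram in dB
pX = np.angle(X)                         # phase spectrogram in radians

# For noise, the phases of the interior bins spread roughly uniformly over
# (-pi, pi], whose standard deviation is pi / sqrt(3), about 1.81.
print(pX[1:-1].std(), np.pi / np.sqrt(3))
```

If the two printed numbers are close, the phases behave like uniformly distributed random values, which is what the zoomed-in phase plot suggests.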

2:38

But now let's see what we can do with the sinusoidal model

that we have been talking about and using.

So let's take the sine model and let's get the ocean sound.

Okay, in here we definitely will need a lot of sinusoids. In

terms of the window size, I don't think it matters too much, but

maybe let's take a smaller window, let's say 1,000.

Here, really, since the phases do not matter, we

can just say 1,024 and FFT size 1,024;

there is no need for doing anything different from that.

The hop size will be one fourth of that, so that will be fine.

Magnitude threshold minus 80. Well, the duration of the sinusoids: here

clearly we need to account for tiny sinusoids that will come in and out.

And in terms of the number of sinusoids, well, we need a lot, so let's put 200.
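As a rough illustration of why so many sinusoids are needed, the sketch below picks spectral peaks in a single frame of noise. It is not the sms-tools peak detection; it uses `scipy.signal.find_peaks`, and it assumes a Hamming window, white noise as the input, and the settings quoted above (window and FFT size 1024, threshold of -80 dB, at most 200 sinusoids).

```python
import numpy as np
from scipy.signal import find_peaks

N = 1024                                  # window size and FFT size
rng = np.random.default_rng(0)
frame = 0.1 * rng.standard_normal(N)      # one frame of white noise (stand-in signal)

w = np.hamming(N)                         # assumed window type
w = w / w.sum()                           # normalize so a full-scale sinusoid peaks near -6 dB
mX = 20 * np.log10(np.abs(np.fft.rfft(frame * w)) + 1e-12)

# Local maxima of the magnitude spectrum above the -80 dB threshold
peaks, _ = find_peaks(mX, height=-80.0)

# Keep at most the 200 loudest peaks, as in the demo settings
loudest = peaks[np.argsort(mX[peaks])[::-1][:200]]
print(len(peaks), "peaks above threshold;", len(loudest), "kept")
```

With a noisy input, a large share of the bins end up as local maxima, so the peaks are scattered across the whole frequency range instead of forming a few stable partials.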

4:00

Okay, so this is what we get, again maybe what we were expecting,

the sine waves, these are the frequencies, they are all over.

We just see that they are scattered all over, again

in a very granular, random way.

Let's listen to the sound that is synthesized from the sinusoidal model.

[SOUND] It doesn't sound too good;

it has this kind of tonal quality and

we hear pitches that were not really in the original sound.

This is why the sinusoidal model may not be that appropriate for

this type of sound.

You can push it more and make it sound closer to the ocean sound, but

clearly it's not an appropriate modeling approach for this sound.

So now let's go to the stochastic model and let's open the ocean sound, okay.

And in here there are not that many parameters to choose.

One is the hop size, and the FFT size will basically

be twice that, so there is no need to control it.

5:17

And then there is an important parameter which is this smoothing approximation

factor, which is basically how much smoothing we're going to perform.

For example, we can start with 0.1. That means that we're going to reduce

the size of the FFT by 90%, so the result is 10% of the overall size.

So that means that we're going to have only one of every ten bins, or

frequency samples, in the frequency domain.

And of course, the phase spectrum will be random numbers, so

let's listen while it computes that.
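The smoothing step for a single frame could be sketched as follows. This is an illustration under assumptions, not the sms-tools source: `stochastic_envelope` is a hypothetical helper name, the Hann window and the hop size of 256 are assumed, and `scipy.signal.resample` is used to downsample the dB magnitude spectrum to a fraction of its bins.

```python
import numpy as np
from scipy.signal import resample

def stochastic_envelope(frame, stocf):
    """Hypothetical helper: smoothed dB magnitude envelope of one frame.

    stocf is the approximation factor (0 < stocf <= 1): the fraction of
    frequency samples that is kept.
    """
    N = frame.size                        # FFT size (twice the hop size)
    hN = N // 2 + 1                       # number of positive-frequency bins
    X = np.fft.rfft(frame * np.hanning(N))
    mX = 20 * np.log10(np.maximum(np.abs(X), 1e-10))   # dB, floored at -200
    n_coef = max(1, int(stocf * hN))      # e.g. 0.1 keeps about one in ten bins
    return resample(mX, n_coef)           # downsampling smooths the spectrum

rng = np.random.default_rng(0)
H = 256                                   # assumed hop size
frame = rng.standard_normal(2 * H)        # FFT size = 2 * hop size = 512
env = stochastic_envelope(frame, 0.1)
print(env.shape)                          # 257 bins reduced to 25 coefficients
```

The phases are simply discarded at this stage, since they will be replaced by random numbers at synthesis time.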

5:56

Okay, that's quite fast and here, of course, what we are seeing and

maybe we can compare it with the original STFT.

We have a magnitude spectrum which is much coarser,

so there are far fewer horizontal lines because

6:17

the spectrum was downsampled, so basically it was smoothed out.

Okay, but let's listen to the output sound.

[SOUND] Well, it doesn't sound like the original;

let's listen again to the original.

[SOUND] But it definitely sounds like some water,

so the quality is very much the same. It sounds

like it has that high-pass kind of quality.

So let's try to get a little better.

Let's have a hop size smaller than this one, maybe 64,

because these time changes are important, and maybe let's not

reduce this that much, like 0.5, and let's see what happens.

Okay, now we have more information. We have a finer grain both in the horizontal and

vertical axes, so let's hear what the result is.

[SOUND] Yeah, this sounds a little bit better, and

we can play around with these parameters

to get different types of approximations.

Now to finish this, let's try how this approximation works

with a sound that is not really completely stochastic.

For example, let's open the speech sound, this male speech sound, okay.

And let's not do so much. Maybe let's do 256 and let's

do maybe 0.2 as an approximation. Let's first listen to the sound.

>> Do you hear me?

They don't lie at all.

>> Okay, so now we're going to attempt to approximate this. We're going to get

rid of the phase spectrum, we're going to make it random numbers.

And we're going to smooth out the magnitude spectrogram, so

let's see how it sounds.
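The two steps just described (smooth the magnitude spectrum, replace the phases with random numbers) can be put together in a small analysis and resynthesis loop. This is a minimal sketch under stated assumptions, not the sms-tools implementation: `stochastic_resynth` is a hypothetical function name, Hann windows are assumed for both analysis and overlap-add synthesis, no amplitude normalization is applied (so output levels are only approximate), and white noise stands in for the input sound.

```python
import numpy as np
from scipy.signal import resample

def stochastic_resynth(x, H, stocf, seed=0):
    """Hypothetical sketch: stochastic approximation of x with hop size H."""
    N = 2 * H                                  # FFT size is twice the hop size
    hN = N // 2 + 1                            # positive-frequency bins
    w = np.hanning(N)                          # assumed analysis/synthesis window
    rng = np.random.default_rng(seed)
    n_coef = max(4, int(stocf * hN))           # envelope coefficients per frame
    xp = np.concatenate([np.zeros(H), x, np.zeros(N)])   # pad both ends
    y = np.zeros(xp.size)
    for start in range(0, xp.size - N + 1, H):
        X = np.fft.rfft(xp[start:start + N] * w)
        mX = 20 * np.log10(np.maximum(np.abs(X), 1e-10))
        env = resample(mX, n_coef)             # smoothed (downsampled) envelope
        mY = resample(env, hN)                 # back to the full number of bins
        pY = rng.uniform(0, 2 * np.pi, hN)     # random phases replace the originals
        Y = 10 ** (mY / 20) * np.exp(1j * pY)
        y[start:start + N] += w * np.fft.irfft(Y, N)   # overlap-add (levels approximate)
    return y[H:H + x.size]

rng = np.random.default_rng(1)
x = 0.1 * rng.standard_normal(4096)            # stand-in input signal
y = stochastic_resynth(x, H=64, stocf=0.5)
print(x.shape, y.shape)
```

Because the phases are drawn at random, two runs with different seeds give different waveforms that share the same smoothed spectral shape over time.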

8:26

Okay, so this is the approximation, so this was the original sound.

This is the stochastic approximation. We see a very coarse type of approximation

to the magnitude spectrum, and this is the resynthesized sound.

Let's listen to that.

>> Do you hear me?

They don't lie at all.

>> Okay, that's very interesting. In fact, it sounds like a whisper type of sound.

We have lost all the pitch information, because a lot of this pitch

information is in the phase spectrum, and we have basically got rid of that.

And since the magnitude spectrum is quite smooth, we also got rid of quite a bit of

the pitch information that was present in the magnitude spectrum.

9:09

Okay, so that's all, that's what I wanted to show.

So, we have looked at one implementation

of this stochastic approximation that we have talked about within sms-tools.

We have this code that approximates the sound using this model,

and, well, we have used some sounds from Freesound.

So hopefully that has given you a flavor of what it means to

approximate a sound with the stochastic modeling approach.

And of course we're going to be using that for the residual of some signals.

And so in the next demonstration class, we're going to put it together with

the other models that we have been talking about.

So we'll see you next class, bye bye.