0:00

Welcome again to the course on audio signal processing for music applications.

This week we're talking about applications.

We're talking about how to use the models we have been studying

throughout the course for the application of sound transformation.

So we aim at manipulating sounds and changing different aspects of them.

0:25

In the first demonstration class we exemplified the idea of morphing

using the short-time Fourier transform.

In the last class, we talked about time scaling,

how to change the duration of the sound using the sinusoidal model.

And in this class, I want to talk about pitch changes,

how to change the frequencies of a sound.

And we will use the harmonic plus stochastic model.

So we'll basically be changing pitch-related information of harmonic sounds.

0:58

In order to use the harmonic model we need to understand this sound a little bit.

So, for example, we will start with this saxophone sound.

Let's listen to this.

[MUSIC]

Okay, in order to define the parameters, especially the window size,

we need to know the range of fundamental frequencies that are present here.

So a good way to do that is to look at the spectrogram of the sound

and basically zoom in to

the first harmonic, so that we see the fundamental

frequency, which is the first line of this harmonic series,

and see what the highest and lowest values are here.

2:16

And then the highest is this note here, which is around 600-and-something hertz.

Okay, so this is good information for

now defining the parameters of the harmonic plus stochastic model.

So let's go to the sms-tools models GUI and

let's go directly to the harmonic plus stochastic model.

2:59

And now in order to decide the window size,

well, it's good to basically go to the terminal, and

from Python we can just quickly do calculations.

So for example we can just say, okay, the Blackman window has

a main lobe that is six bins wide, so we multiply 6 by 44,100.

And we said that the lowest frequency was around 400-something hertz,

so in order to be safe, let's say, okay, 400 hertz.

So we divide by 400, and this is the window

size that is appropriate for a frequency of 400, the lowest, which is

the meaningful one because it's the one that needs the longest window.

Okay, so we will put as window size, let's say, 661 as our size.
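
As a rough sketch of the calculation just described, the window size is the main-lobe width in bins times the sampling rate, divided by the lowest fundamental frequency. The function and variable names below are illustrative and are not part of sms-tools; truncating the result to 661 simply follows the value chosen in the demonstration.

```python
def window_size_for_f0(fs, min_f0, main_lobe_bins):
    """Window length in samples whose main lobe is narrow enough
    to resolve harmonics that are min_f0 hertz apart."""
    return main_lobe_bins * fs / min_f0


fs = 44100                 # sampling rate in Hz, as used in the lecture
min_f0 = 400.0             # lowest fundamental, rounded down to be safe
blackman_lobe_bins = 6     # main-lobe width of the Blackman window, in bins

exact_size = window_size_for_f0(fs, min_f0, blackman_lobe_bins)
window_size = int(exact_size)  # truncated, matching the value typed into the GUI

print(exact_size, window_size)
```

A lower fundamental would give a longer window, which is why the lowest expected pitch is the one that matters here.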

FFT size, let's make a big one so we have zero padding, let's say 2048.

The threshold really doesn't need to be that low, but let's leave it,

so we have a lot of harmonics there.

The minimum duration of sinusoidal tracks, 0.1, that's fine.

The maximum number of harmonics.

The maximum number of harmonics that there could be is 44,100 divided by 400;

okay, that would be,

if it had all the harmonics, 110, but of course this is for the lowest frequency, and

this is really only if we had harmonics all the way through.

So 100 would be fine. Then we need to define the range of the fundamental

frequency, so we can put the one we said.

It was around 400, and the other was around 600-and-something,

so to be safe, let's say 650.
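
The harmonic-count estimate above can be checked with a couple of lines of Python. The first figure reproduces the division done in the lecture. The second is an addition that is not in the lecture: it counts only the harmonics that fit below the Nyquist frequency (half the sampling rate), which is the highest frequency a sampled signal can contain. The variable names are illustrative.

```python
fs = 44100        # sampling rate in Hz
min_f0 = 400.0    # lowest fundamental considered in the lecture
max_harmonics_setting = 100  # value typed into the GUI

# Estimate used in the lecture: sampling rate divided by the lowest fundamental.
count_up_to_fs = int(fs / min_f0)

# Added check (assumption of this sketch): only harmonics below fs / 2 can exist.
count_up_to_nyquist = int((fs / 2) / min_f0)

print(count_up_to_fs, count_up_to_nyquist)
print(max_harmonics_setting >= count_up_to_nyquist)
```

Either way, a setting of 100 leaves room for every harmonic the analysis could find for this sound.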

This is the error threshold to identify the fundamental frequency.

Maybe let's be a bit more flexible and put seven.

And this deviation, that's fine like this,

and the stochastic approximation for the residual, we

5:18

Okay, so this is the result: we have the original signal,

the analysis, the harmonics plus the stochastic component, and the synthesized sound.

Let's listen to the different components of it, the sinusoidal component.

[MUSIC]

It clearly captures most of the sound.

Then let's listen to the stochastic component.

[SOUND] Well, it's very soft, but it's there, so it's a relevant component.

And of course, the sum of the two.

[MUSIC]

Okay, so this is a good starting point to now run the transformation.

So let's quit this.

6:19

Okay, so this is the GUI for the transformations.

And let's go directly to the HPS model with the transformations.

And well, the sax phrase is already here by default.

So let's use the parameters that we used.

If I remember, it was 661, and we did an FFT of 2048.

The threshold was minus 100, minimum sine

duration was that, number of harmonics 100,

the minimum frequency was 400 and the maximum was 650.

For F0 detection, the F0 error threshold was seven, and

for the stochastic factor we put 0.4.

Okay, now we can analyze. And

this will definitely do the same thing that we did before.

So we can check that the analysis is correct.

[MUSIC]

And that's exactly the same sound that we heard before.

So now we can start playing around with the transformations.

And we have two possibilities for

changing the frequencies and one for changing the time.

So for the time, we're not interested in changing the time, so

let's set the time as 0, 0, 1, 1.

So that means that it's not changing anything.
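
One way to see why 0, 0, 1, 1 leaves the time untouched is to read the list as alternating (time, value) breakpoints and interpolate linearly between them. That reading is an assumption of this sketch, and the helper `envelope_value` is a hypothetical name, not a function from sms-tools.

```python
def envelope_value(flat_pairs, t):
    """Linearly interpolate a breakpoint envelope given as a flat list
    [t0, v0, t1, v1, ...] with strictly increasing times."""
    if len(flat_pairs) < 2 or len(flat_pairs) % 2 != 0:
        raise ValueError("expected a non-empty list of (time, value) pairs")
    points = list(zip(flat_pairs[0::2], flat_pairs[1::2]))
    if t <= points[0][0]:
        return points[0][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return points[-1][1]  # hold the last value after the final breakpoint


identity = [0, 0, 1, 1]   # setting from the lecture: output time equals input time
slower = [0, 0, 1, 2]     # hypothetical setting: the output lasts twice as long

for t in (0.0, 0.25, 0.5, 1.0):
    print(t, envelope_value(identity, t), envelope_value(slower, t))
```

With the identity envelope every input time maps to the same output time, so the duration is unchanged.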

Okay, now in frequency scaling, we have two frequency transformations

7:52

given that we are working with a harmonic sound.

We know where the harmonics are, and

that's a great advantage compared with the sinusoidal model.

In fact, this type of change could be done with the sinusoidal model, but

of course then we are restricted to some transformations.

And for example, frequency stretching is not possible with the sinusoidal

model because we don't know which sine corresponds to which harmonic.

Okay, maybe let's just use the scaling first, so

let's have the stretching here again without any transformation.

So if you put 0, 1, 1, 1, that means that there is a frequency stretching of one, so

it means no stretching at the beginning and at the end.

And then in the frequency scaling, let's start by lowering, or

sort of decreasing, the pitch of this sound.

For example, 0.8, so at time zero we will have 0.8, and

at time one we'll also have 0.8, okay?

And a very important parameter is this timbre preservation.

What this timbre preservation does is it

tries to preserve the shape of the spectrum of the harmonics.

If we put one, it preserves the harmonic shape.

So it should sound more natural than if we put zero, in which case it would just

transpose everything, and so the magnitudes would be affected.
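
The difference between the two timbre settings can be illustrated on a single analysis frame. This is only a sketch under stated assumptions: the spectral envelope is taken to be a straight-line interpolation through the harmonic peaks (in dB), the harmonic values are made up, and the function names are hypothetical rather than taken from sms-tools.

```python
def spectral_envelope(freqs, mags_db, f):
    """Magnitude (dB) at frequency f, read from straight lines drawn
    through the harmonic peaks; clamped outside the measured range."""
    if f <= freqs[0]:
        return mags_db[0]
    for i in range(1, len(freqs)):
        if f <= freqs[i]:
            w = (f - freqs[i - 1]) / (freqs[i] - freqs[i - 1])
            return mags_db[i - 1] + w * (mags_db[i] - mags_db[i - 1])
    return mags_db[-1]


def scale_harmonics(freqs, mags_db, factor, preserve_timbre):
    """Multiply every harmonic frequency by factor. With preserve_timbre,
    re-read the magnitudes from the original envelope at the new positions;
    otherwise each harmonic carries its old magnitude along."""
    new_freqs = [f * factor for f in freqs]
    if preserve_timbre:
        new_mags = [spectral_envelope(freqs, mags_db, f) for f in new_freqs]
    else:
        new_mags = list(mags_db)
    return new_freqs, new_mags


# Made-up frame: four harmonics of a 400 Hz tone with their magnitudes in dB.
freqs = [400.0, 800.0, 1200.0, 1600.0]
mags_db = [-20.0, -30.0, -26.0, -40.0]

kept_freqs, kept_mags = scale_harmonics(freqs, mags_db, 0.8, preserve_timbre=True)
moved_freqs, moved_mags = scale_harmonics(freqs, mags_db, 0.8, preserve_timbre=False)
print(kept_freqs, kept_mags)
print(moved_freqs, moved_mags)
```

In both cases the pitch drops by the same factor; only the magnitudes differ, because with preservation the envelope stays where it was while the harmonics slide underneath it.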

So let's apply it like this.

9:36

So, let's listen to the result.

[MUSIC]

So it sounds quite natural even though we have transposed it.

Mainly because of this timbre preservation,

we have maintained quite a bit of the quality of the saxophone.

9:54

And then just to finish, let's do some frequency stretching.

So frequency stretching kind of converts a sound into

an inharmonic type of spectrum, in which we are applying

an exponential factor to the harmonic values, let's say.

So at time 0, let's say, let's start with 1.

And then at the end, let's stretch everything to, let's say, 1.1.

Okay, so we will have a stretching factor at the end, and not at the beginning, so

the stretching will increase progressively.
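
One plausible reading of the "exponential factor" is that the harmonic at index h (with the fundamental at index 0) is multiplied by the stretching factor raised to the power h. That formula is an assumption of this sketch, as are the made-up harmonic values and the function names; it is meant only to show how the spacing between harmonics stops being uniform.

```python
def stretch_harmonics(freqs, stretch):
    """Multiply the harmonic at index h (fundamental = index 0) by stretch ** h,
    so the fundamental stays put and higher harmonics move progressively more."""
    return [f * stretch ** h for h, f in enumerate(freqs)]


def spacings(freqs):
    """Distances in Hz between consecutive partials."""
    return [b - a for a, b in zip(freqs, freqs[1:])]


# Made-up harmonic series of a 400 Hz tone.
harmonics = [400.0 * k for k in range(1, 6)]

# The stretching factor goes linearly from 1.0 at the start to 1.1 at the end.
for t in (0.0, 0.5, 1.0):
    stretch = 1.0 + (1.1 - 1.0) * t
    stretched = stretch_harmonics(harmonics, stretch)
    print(t, [round(f, 1) for f in stretched], [round(d, 1) for d in spacings(stretched)])
```

With a factor of 1 the partials stay equally spaced; as the factor grows, each gap is wider than the one below it, which is what makes the spectrum inharmonic.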

So let's see what that does.

10:52

keep getting further apart from each other as time goes on.

And clearly at the end they are not equally spaced,

so that's an inharmonic spectrum.

Let's listen to that.

[MUSIC]

Okay, so clearly the low frequency is the same, but

as time progresses the sound becomes more inharmonic,

kind of more metallic, because the harmonics have been stretched.

Of course we can do a lot of things.

So feel free to play around with these parameters and

of course with time scaling.

Time scaling is also very powerful once we have been able to analyze the sound

with the harmonic plus stochastic model.

11:46

the idea of changing the pitch or the frequencies of a sound.

First, we used Sonic Visualiser to understand the sound.

And then we used the sms-tools GUI with the harmonic plus stochastic model

to change the pitch or the frequencies of a sound.

12:03

And so we have been talking about pitch change.

Of course, pitch change can be done with the sinusoidal model, with

the harmonic plus stochastic model, or with the sinusoidal plus stochastic model,

or with quite a few of the models we have been talking about.

And in Audacity there are also some implementations of that.

So anyway, we just presented a little bit of that, an example using the harmonic

plus stochastic model, and the potential of this type of transformation.

So I hope you got an idea of that. We'll still have another demonstration

class, and we'll again be talking about the harmonic plus stochastic model,

but with another possibility for transforming sounds, which will

be morphing two sounds, interpolating the representations of two sounds.

So I hope to see you all in the next class.

Bye-bye.