Friday, 21 October 2011

Limits to the Madness

Hmm, looking over what I've done, I'll have to cover a tiny bit more theory before I can get stuck into what I've been working on right now.  I promise the next post won't be theory.  Don't worry, I'll put up some pictures to make it bearable.

Samples are the free things given out at supermarkets.  But they're also very important to audio programming.  In audio programming, a sample is like a snapshot in time.  Film works on the same principle, but is probably easier to visualise.  A piece of film is made up of about 25 frames per second, and because the pictures are played fast enough, we perceive a continuous image.  If we use this as an analogy, every picture is like a sample.

Due to the nature of sound, for a single second of audio, we need a bit more information.  A sample is a snapshot in time, but specifically refers to the amplitude of the signal (the level that ends up going through the amplifier) at a specific point in time.  A few weeks ago I foolishly believed samples were some magical entity not to be questioned, but now I've realised that all a sample is, is a number.  And audio is just a lot of numbers.

Let me clarify "a lot".  It's fairly widespread knowledge that the typical (CD-quality) audio file is sampled at 44,100Hz - translation: it has 44,100 samples (values) for every second of sound.  (Apparently DVD audio can be sampled at up to 192,000Hz but God knows what for...)  44,100Hz at first seems like a fairly random number, but on closer inspection, there is a good reason behind it.
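To make "a lot of numbers" a bit more concrete, here's a minimal Python sketch that builds one second of a plain sine tone.  The helper name sine_samples and the 440Hz pitch are just my choices for illustration; nothing here comes from a particular audio library.

```python
import math

SAMPLE_RATE = 44100  # samples per second

def sine_samples(frequency, duration, sample_rate=SAMPLE_RATE):
    """Return one float in [-1, 1] per sample (hypothetical helper)."""
    count = int(duration * sample_rate)
    return [math.sin(2 * math.pi * frequency * n / sample_rate)
            for n in range(count)]

one_second = sine_samples(440, 1.0)
print(len(one_second))   # 44100 numbers for a single second of sound
print(one_second[:4])    # the first few "snapshots"
```

That list really is the whole sound: 44,100 numbers, one after the other.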

The Nyquist frequency is a limitation that you have to be aware of when working with digital audio.  It is simply half the sampling frequency (rate).  If you'll let me go back to the film analogy, you'll probably have seen in films that a very fast rotating object can appear to be moving slowly backwards.  This is really noticeable with things like propellers.  The same effect can actually occur in audio, except it's really undesirable (because it sounds bad).  It happens when we have frequencies that are at, or over, half the sample rate, because this is when the sampled wave starts becoming ambiguous, and actually comes out as a different frequency, mirrored symmetrically below the half-way point.  Here are some pictures to clarify.

Here we have a wave, which we've sampled - every black dot represents two values - time (x axis) and amplitude (y axis).  The level of detail sampled is very good - if the original wave were to be taken away we'd still be able to accurately redraw it.

If we imagine every dot to be at equal time intervals for every picture, then this diagram would show a wave with a higher frequency (higher pitched).  Notice how there's a bit less detail for this wave, but still plenty to recreate it fairly accurately.  If you're wondering (or even if you're not) how your computer turns a stream of numbers into sound waves, it's done through a device called the Digital-to-Analog converter (DAC).  Go to this Wikipedia page if you want to find out more.

This is a wave at a higher frequency still - at the Nyquist limit.  The grey wave shows how our wave has now become ambiguous.  In reality, because all the dots (samples) are at 0, there would simply be no sound at all.  This means our wave can no longer be reproduced as intended.
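If you want to check the "all the dots are at 0" claim with actual numbers, here's a small Python sketch.  It assumes a sine wave that starts exactly at zero, like the one in the picture; sample_sine is a made-up helper name, not a library function.

```python
import math

def sample_sine(frequency, sample_rate, count):
    """Sample a sine wave that starts at zero phase (hypothetical helper)."""
    return [math.sin(2 * math.pi * frequency * n / sample_rate)
            for n in range(count)]

sample_rate = 44100
nyquist = sample_rate / 2  # 22050.0

samples = sample_sine(nyquist, sample_rate, 8)
biggest = max(abs(s) for s in samples)
print(biggest)  # not exactly 0, but only floating-point noise
```

Every sample lands on a zero crossing, so the DAC has nothing to work with.  A wave shifted slightly in time would land on non-zero values instead, but its true size would still be anyone's guess, which is the ambiguity the grey wave is showing.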

Here is a frequency even higher than the Nyquist limit.  The grey wave is the wave that will be produced by the DAC from the samples, but it's a lower frequency than the one we wanted (the red one).  The grey wave is called an alias, and the effect is called aliasing - a frequency we didn't want, which we got because we were naughty and made a wave with a frequency higher than the Nyquist limit.
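The mirroring can be worked out with a couple of lines of arithmetic.  Below is a rough Python sketch, assuming a plain sine wave; alias_frequency is a name I've made up, and the 30,000Hz tone is just an example value above the limit.

```python
import math

def alias_frequency(frequency, sample_rate):
    """Frequency that actually comes out when `frequency` is sampled
    at `sample_rate` (folded back around the Nyquist limit)."""
    folded = frequency % sample_rate
    return min(folded, sample_rate - folded)

sample_rate = 44100
wanted = 30000                                 # above the 22,050Hz limit
heard = alias_frequency(wanted, sample_rate)
print(heard)                                   # 14100

# The samples of the two waves are the same numbers (just flipped in sign),
# so nothing downstream can tell them apart.
for n in range(5):
    a = math.sin(2 * math.pi * wanted * n / sample_rate)
    b = math.sin(2 * math.pi * heard * n / sample_rate)
    print(round(a, 6), round(-b, 6))
```

30,000Hz is 7,950Hz above the Nyquist limit, and it comes back out 7,950Hz below it, at 14,100Hz.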

So, if we sample at 44,100Hz, then we're actually bound to frequencies below 22,050Hz.  Is this a limitation?  Another significant frequency is the upper limit of human hearing, which is usually quoted as about 20kHz or 20,000Hz, just under our 22,050Hz ceiling.  Coincidence... I think not.  Obviously, there's no need to store audio information we can't even hear.  (Although no doubt, some audiophiles will claim it sounds "better" if we store it anyway).

Anyway, as mentioned before, if you do go beyond the Nyquist frequency, you'll generate what's called aliasing.  Below is a square wave sweep with some major aliasing, just so you can get a feel for it.  Be careful with how loud you play these sounds - they are harsh (I've made sure they're relatively quiet though).

Aliasing by Boyley

Both little clips are square waves, but the aliasing should still be very clear.  In the sweep, listen out for a pulsing noise which becomes slower as the sweep goes higher.  In the constant tones, after hearing the sweep, you might be able to pick out the aliasing - you can hear the wave isn't quite "pure".
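One way to see why a square wave is so prone to this: an ideal square wave is made up of its base frequency plus the odd multiples of it (3x, 5x, 7x and so on), and those quickly climb past the Nyquist limit.  The Python sketch below lists where the first few would fold back to.  The function names and the 5,000Hz example are my own assumptions for illustration; they are not how the clips above were made.

```python
def alias_frequency(frequency, sample_rate):
    """Fold a frequency back around the Nyquist limit."""
    folded = frequency % sample_rate
    return min(folded, sample_rate - folded)

def square_wave_aliases(fundamental, sample_rate, harmonics=5):
    """Pair each of the first odd harmonics with the frequency heard."""
    result = []
    for k in range(1, 2 * harmonics, 2):   # 1, 3, 5, ...
        partial = k * fundamental
        result.append((partial, alias_frequency(partial, sample_rate)))
    return result

for partial, heard in square_wave_aliases(5000, 44100):
    print(partial, "->", heard)
# 5000 -> 5000
# 15000 -> 15000
# 25000 -> 19100
# 35000 -> 9100
# 45000 -> 900
```

The first two come out where they should, but the rest land on frequencies that have nothing to do with the note, which is why the tone doesn't sound "pure".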

That concludes this little explanation of some key audio terms and concepts.  Hopefully you've found it at least a tiny bit interesting.
