Thursday, 27 October 2011

Applying Envelopes

Due to the lack of school I suddenly have a lot of time to burn, so I've been able to do a load of stuff in a short space of time.  As a result, there's loads I want to write about, but I guess I should keep it chronological, so I'll just start off with one of my first little projects.

The title sounds a little misleading - the application of envelopes? Surely to send letters...  But of course, we're referring to musical terminology: an (amplitude) envelope describes how a wave's amplitude varies over time.  The following envelope might describe a sine tone that dies down to nothing.
A very simple envelope going from loud to quiet
When we say we're "applying" an envelope to a sound, we're basically fitting the waveform's overall amplitude to the envelope.  Obviously, the wave is still free to go about oscillating, as a good sound wave should.  In other words, the envelope will scale the wave - if the sample at 1 second has a value of 1.0 and the envelope has an amplitude of 0.5, the resulting sample will be 1.0 * 0.5 = 0.5.  If the 1.4 second sample has a value of 0 and the envelope has an amplitude of 0.8, the resulting sample is simply 0 * 0.8 = 0.  If you're still finding it hard to picture, here's an actual picture:
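The scaling described above can be sketched in a few lines of C.  This is only an illustration: the function name apply_envelope and the idea of having one envelope value per sample in an array are assumptions made for the example, not how the program later in this post stores its envelope.

```c
#include <stddef.h>

/* Scale each sample by the envelope value at the same position.
   'samples' and 'envelope' are hypothetical arrays of equal length. */
void apply_envelope(float *samples, const float *envelope, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        samples[i] = samples[i] * envelope[i];
}
```

Feeding it the two samples from the paragraph above (1.0 with an envelope of 0.5, and 0 with an envelope of 0.8) gives 0.5 and 0.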

The envelope applied to an audio signal
The most common envelope for synth users is the ADSR envelope, which is applied whenever you play a note.  When you press a key on a synth, most likely the amplitude won't go from 0 to 1 instantly; instead there will be a gradual build up to the peak.  This is the "attack" part of the envelope, and when I say gradual, it may be really quick, like 0.1 seconds.  On a slow synth pad, this effect may be more obvious.  Similar to being attacked in the street, this is the stage that will leave an impression (on the victim or listener), which is followed by a decay (reduction in loudness) and then a sustain (fight).  A much better analogy would be the plucking of a guitar string - the initial noise, or attack, is louder and more distinctive than the constant sustained note that you get by holding the note.  After you let go of the string, or take your finger off the synth's key, the noise will gradually die down - this is called the "release" stage.  Together, these attack, decay, sustain and release phases combine into a single envelope that is applied to the amplitude of any note played.
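To make the four stages concrete, here's a minimal sketch of an ADSR envelope in C.  The Adsr struct, the adsr_amplitude name and the straight-line segments are all assumptions for the sake of the example; real synths often use curved (for instance exponential) segments instead.

```c
/* Hypothetical ADSR settings: times in seconds, sustain as a level 0..1. */
typedef struct {
    double attack;   /* time to rise from 0 to the peak of 1 */
    double decay;    /* time to fall from the peak to the sustain level */
    double sustain;  /* level held while the key stays down */
    double release;  /* time to fall from the sustain level to 0 */
} Adsr;

/* Amplitude at time t for a key pressed at 0 and let go at key_up.
   Assumes key_up >= attack + decay and attack, decay, release > 0. */
double adsr_amplitude(const Adsr *env, double t, double key_up)
{
    if (t < 0.0)
        return 0.0;
    if (t < env->attack)                       /* attack: ramp up */
        return t / env->attack;
    if (t < env->attack + env->decay)          /* decay: fall to sustain */
        return 1.0 - (1.0 - env->sustain) * (t - env->attack) / env->decay;
    if (t < key_up)                            /* sustain: hold */
        return env->sustain;
    if (t < key_up + env->release)             /* release: fade out */
        return env->sustain * (1.0 - (t - key_up) / env->release);
    return 0.0;
}
```

With an attack of 0.1 seconds, the amplitude halfway through the attack (t = 0.05) is about 0.5, and it reaches the peak of 1 at t = 0.1 before the decay begins.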

Finally, I'm going to move away from theory, and onto something I actually did.  This stuff is actually almost a month behind me, and since then I've done some much more interesting things, but anyway, got to start somewhere.  One of the first little programs I wrote was one that applied an envelope to a sound file, and saved the resulting sound file.  First for some design considerations, and possible problems.

1) The audio file had to be supported by a little audio library that came with my textbook (called portsf - port soundfile).  This means pretty much just variants of the lossless formats WAV and AIFF are supported.
2) The envelope is simply a table of time values with a corresponding amplitude.  If we join the dots we can create an envelope.  This means the envelope can pretty much just be kept as a text file.  The only rule for this file format is the time values must never decrease as you go down the file. 
3) What happens if the audio is longer than the envelope or vice versa?  This is a design decision I (actually the textbook) had to make.  The answer is, if the envelope is longer, then we just cut the envelope short.  If the audio is longer, just use the final envelope value.
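As a rough sketch of point 2, here is one way the envelope text could be read and checked in C.  The exact layout (one whitespace-separated "time value" pair per point) and the names Breakpoint and parse_breakpoints are assumptions for illustration; this is not the textbook's code.

```c
#include <stdio.h>

/* One point of the envelope: a time in seconds and an amplitude. */
typedef struct {
    double time;
    double value;
} Breakpoint;

/* Parse up to max_points "time value" pairs from a string.
   Returns the number of points read, or -1 if a time value
   is smaller than the one before it (the one rule of the format). */
int parse_breakpoints(const char *text, Breakpoint *points, int max_points)
{
    int count = 0;
    int used = 0;
    double t, v;

    /* %n records how many characters this pair consumed */
    while (count < max_points &&
           sscanf(text, "%lf %lf%n", &t, &v, &used) == 2) {
        if (count > 0 && t < points[count - 1].time)
            return -1;
        points[count].time = t;
        points[count].value = v;
        count++;
        text += used;
    }
    return count;
}
```

Two points sharing the same time are accepted, since the rule only forbids times that go backwards.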

The processing of the audio is about as simple as it gets - we read a block of frames from the input file (samples, but taking into account the possibility of multiple channels - think stereo), multiply the samples by the envelope value, then write the frames to the output file.  The tedious work of reading and writing an arbitrary WAVE / AIFF file is however not simple, mainly because they come in loads of different permutations of formats.  This is why I used the textbook's supplied library, called portsf.  In my code snippets, if functions come from nowhere, they've probably come from the nifty library.  In terms of audio programming, libraries are our friends - think the VST SDK or ASIO SDK, for libraries that have widespread use in the music industry.
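To show the difference between frames and samples, here's a small sketch that scales a block of frames held in memory.  It assumes the usual interleaved layout (for stereo: left, right, left, right...) and one envelope value per frame; the name scale_frames and its parameters are made up for the example, and no file reading or writing is shown.

```c
#include <stddef.h>

/* Scale every sample of every frame by that frame's envelope value.
   The sample for frame f, channel c sits at buffer[f * channels + c]. */
void scale_frames(float *buffer, size_t frames, size_t channels,
                  const float *amp_per_frame)
{
    size_t f, c;
    for (f = 0; f < frames; f++)
        for (c = 0; c < channels; c++)
            buffer[f * channels + c] *= amp_per_frame[f];
}
```

All channels in a frame get the same envelope value, because they all belong to the same moment in time.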

As you will now be aware, some programming jargon is creeping in.  I'm not going to neglect the programming side, so if you're more into the theory behind it all rather than programming / C, C++ specifics, there are bits you can just skip out.

Here is some real life code I used to make my "Envelope Follower", taken from the main bulk of the processing function:

/* read the first block of frames */
frames_read = psf_sndReadFloatFrames(ifd, in_frame, BUFFER_SIZE);
printf("Frames processed: \n");
while(frames_read > 0)
{
    total_read += frames_read;
    for(i = 0; i < frames_read; i++)
    {
        /* if there are more breakpoints to process then
        calculate the amplitude, either from a point or by
        interpolation; if not, keep the last amplitude */
            /* (chunk cut out here) take the next envelope value */;
        in_frame[i] = (float)(in_frame[i] * curr_amp);
    }
    /* write the block of frames to the output file */
    if(psf_sndWriteFloatFrames(ofd, in_frame, frames_read) <= 0)
    {
        printf("Error: problem writing to output file.\n");
        total_read = -1;
        break; /* stop processing after a write error */
    }
    printf("\r%d", total_read);
    /* read the next block of frames */
    frames_read = psf_sndReadFloatFrames(ifd, in_frame, BUFFER_SIZE);
}

The multiplication by curr_amp, the call to psf_sndWriteFloatFrames() and the call to psf_sndReadFloatFrames() at the bottom of the loop are the really key parts - psf_sndWriteFloatFrames() does what it says on the tin, but is defined in the portsf library.  I've cut out quite a big chunk just before the multiplication (where the next envelope value is taken), one because it's not so important in seeing the structure of the code and two because I don't want to infringe my textbook's copyright.

Instead I'll explain what goes on there - in our envelope file we could have two points with the same time, indicating a sudden jump.  Otherwise, we could have a set of points wide apart, and we may want to take the envelope value in between these two points.  To do this we use linear interpolation to guess where this value should be.  If linear interpolation isn't English to you, try this link.
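Here is a minimal sketch of that idea, written from the description above rather than copied from the textbook, so treat the names (Breakpoint, envelope_value_at) and the details as assumptions.  It looks up the envelope value at a given time, interpolates linearly between the two surrounding points, and holds the final value once the envelope has run out.

```c
/* One point of the envelope: a time in seconds and an amplitude. */
typedef struct {
    double time;
    double value;
} Breakpoint;

/* Envelope value at time t.  Times in 'points' must never decrease.
   Before the first point the first value is used (an assumption);
   after the last point the final value is held. */
double envelope_value_at(const Breakpoint *points, int count, double t)
{
    int i;

    if (count <= 0)
        return 1.0;  /* assumption: no envelope means no change */
    if (t <= points[0].time)
        return points[0].value;

    for (i = 1; i < count; i++) {
        if (t < points[i].time) {
            /* width is never 0 here: two points with the same time
               (a sudden jump) are skipped by the test above */
            double width = points[i].time - points[i - 1].time;
            double frac = (t - points[i - 1].time) / width;
            return points[i - 1].value
                 + frac * (points[i].value - points[i - 1].value);
        }
    }
    return points[count - 1].value;
}
```

In this sketch, when two points share the same time, the value at exactly that time comes from the later point, which gives the sudden jump.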

Obviously this code on its own won't be any use in terms of reconstructing the program, but hopefully it's given you an idea of the inner workings of a program like this.  Next up will be the reverse of this program - extracting envelopes - and I'll put all the audio stuff in one go.
