Thursday, 30 August 2012

Robust and General or Bloated and Overcomplex?

After extensive holidaying I've settled down and returned to "The Audio Programming Book" for more intense digital signal processing.  I'm currently in the spectral domain, working on a phase vocoder.  Since convolution I've only covered the STFT and phase vocoders, but that alone has opened the door to cross synthesis, spectral filtering, time stretching and pitch shifting.  These are the kinds of things that got me interested in audio programming in the first place, so it was ridiculously satisfying when I produced my first output crossing some spoken word and music.  The first working output of my programs often takes me by surprise, when I realise what can be achieved with a smallish amount of code and a clever algorithm.  I've chosen to upload different output, involving non-copyrighted sounds.

The sound below is created by a process called "cross synthesis".  It works by taking the Short-Time Fourier Transform (STFT) of two signals (basically a sliding FFT), then combining the spectral magnitudes from one signal with the phases from the other.  The inverse STFT brings the result back to the time domain, i.e. produces an audio signal.  The effect is kind of like applying the pitches (phase related) of one to the sounds of the other.
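To make the magnitude/phase swap concrete, here is a minimal sketch of the per-frame step only.  It assumes both signals have already been through an STFT, so each frame is just a vector of complex bins; the names `Spectrum` and `crossSynthFrame` are made up for illustration and are not from the book or from my actual program.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <vector>

// One STFT frame: a complex value per frequency bin (hypothetical alias).
using Spectrum = std::vector<std::complex<double>>;

// Build an output frame whose magnitudes come from magSource and whose
// phases come from phaseSource. If the frames differ in length, only the
// bins they have in common are processed.
Spectrum crossSynthFrame(const Spectrum& magSource, const Spectrum& phaseSource)
{
    const std::size_t bins = std::min(magSource.size(), phaseSource.size());
    Spectrum out(bins);
    for (std::size_t k = 0; k < bins; ++k) {
        const double magnitude = std::abs(magSource[k]);   // |X1[k]|
        const double phase     = std::arg(phaseSource[k]); // angle of X2[k]
        out[k] = std::polar(magnitude, phase);             // back to rectangular form
    }
    return out;
}
```

Running this on every pair of frames and passing the results to the inverse STFT (with overlap-add) would give the crossed signal; the windowing, FFT and resynthesis stages are left out here.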

I've been coding for a while I guess, and now I've read some really good books on recommended style and approach to programming ("The Practice of Programming", "Effective C++"), so hopefully this will help me to improve.  The thing with only self-teaching a programming language is that, while a reliably working program is definitely positive feedback, I haven't really had anyone evaluate my code and constructively criticise it.  This means I spend a lot of time trying to make informed stylistic decisions, basically asking "What would Bjarne Stroustrup or Dennis Ritchie do?".  I still like to keep a style personal to me; programming is unusual in that although it's very technical, it's also very creative and open to interpretation / abuse.  While limitations, hard rules, and standards can be great, especially to an overwhelmed novice, they can also lead to unnecessary workarounds and unnatural, bloated implementations.  So everything is really about finding a balance, which leads to annoying stylistic ambiguities.

A satisfying thing about C++ as opposed to C is that it invites you to create classes, templates and namespaces, neatly organising everything into independent compartments.  It's kind of like starting (almost) from scratch, creating all the components and then assembling a crazy code machine.  Sometimes it feels like all the interfaces and extra work create a large amount of extra code, compared to the quick and dirty way of doing things.  It's possible to create structures and code way more complex than needed, partly because it's pretty fun creating a complicated and cryptic machine.

As I code the phase vocoder I can't stop questioning whether I've gone a bit overkill with the classes, and whether I've lost some efficiency by hiding buffers away in a struct, but I know if I stop and think too hard I'll just go in circles.  Maybe it would be a good experiment to implement something like a phase vocoder in as few lines as possible, then as hidden and encapsulated as possible, then maybe an implementation in between, and compare them for clarity and efficiency.
