A common element in audio processing is the delay. Recording studios used to have long pipes in the basement with speakers on one end and microphones on the other. These pipes provided a way to delay sound by taking advantage of the speed of sound in the air. However, these delay lines were prone to distortions due to the speaker and microphone qualities, ambient noise that snuck into the system, and multiple passes of audio through the system.
We can do better. A delay in a digital audio system is simply a queue. To delay by one second, we use a queue that holds one second of audio, that is, as many samples as the sample rate. Now, you could use an STL list to implement this:
// Initialize with one second of silence
std::list<short> queue;
for(int i=0; i<SAMPLERATE; i++)
    queue.push_back(0);

// For each audio sample:
short sample = ReadSample();
queue.push_back(sample);
WriteSample(queue.front()); // The oldest sample is one second old
queue.pop_front();
Now, I don't like this choice for several reasons:
The power tool for digital audio delays is the circular queue. I generally implement these using an STL vector:
std::vector<short> queue;
queue.resize(QUEUESIZE); // Queue size is in samples and is the
// maximum possible delay.
// Resize fills with zeros for us
int wrloc = 0; // Where we are writing in the queue
Now, you are probably used to the idea of also creating a read location counter. This is, after all, how you would create a circular queue, right? Well, we're not going to create one. We'll deal with that another way.
Here's how to push into the queue:
wrloc = (wrloc + 1) % QUEUESIZE; // Keep wrloc inside the queue
queue[wrloc] = sample; // Put into queue
Let delay be an arbitrary delay amount not exceeding our queue length. It will be a floating point value in seconds. Here's how to read from the queue at a specific delay location:
int delaylength = int( (delay * SAMPLERATE) + 0.5 ); // How long in sample frames?
int rdloc = (wrloc + QUEUESIZE - delaylength) % QUEUESIZE; // Where to read
delayedsample = queue[rdloc];
Note that the delay can change because it's just a parameter.
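The push and read steps above can be gathered into one small class. This is only a minimal sketch under my own assumptions: the class name MonoDelay and its member functions are hypothetical and do not come from any library or from the AudioProcess environment. The queue is sized one slot larger than the longest delay so that a read at the maximum delay never lands on the slot just written.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper class; the name and interface are illustrative only.
class MonoDelay {
public:
    // maxDelaySamples is the longest delay we will ever ask for.
    explicit MonoDelay(int maxDelaySamples)
        : queue_(maxDelaySamples + 1), wrloc_(0) {}   // vector starts zero-filled

    // Push one sample into the circular queue.
    void Push(short sample) {
        wrloc_ = (wrloc_ + 1) % Size();   // Keep wrloc inside the queue
        queue_[wrloc_] = sample;
    }

    // Read the sample pushed delaySamples pushes ago (0 is the newest).
    short Read(int delaySamples) const {
        assert(delaySamples >= 0 && delaySamples < Size());
        int rdloc = (wrloc_ + Size() - delaySamples) % Size();
        return queue_[rdloc];
    }

    // Convert a delay in seconds to whole samples, rounding to nearest.
    static int SecondsToSamples(double delay, int sampleRate) {
        return int(delay * sampleRate + 0.5);
    }

private:
    int Size() const { return int(queue_.size()); }

    std::vector<short> queue_;
    int wrloc_;
};
```

A caller would Push each incoming sample and then Read at whatever delay it wants for that sample. There is no stored read location; it is recomputed from the write location every time.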
So, how do we handle stereo? Just make the queue twice as large, be sure it's an even size, and store each frame's left and right samples in adjacent slots. Here's an example:
std::vector<short> queue;
queue.resize(QUEUESIZE); // Queue size is in samples and is the
// maximum possible delay * 2.
// Resize fills with zeros for us
int wrloc = 0; // Where we are writing in the queue
Here's how to push into the queue:
wrloc = (wrloc + 2) % QUEUESIZE; // Keep wrloc inside the queue
queue[wrloc] = sampleL; // Put into queue
queue[wrloc+1] = sampleR; // Put into queue
Let delay be an arbitrary delay amount not exceeding our queue length. It will be a floating point value in seconds. Here's how to read from the queue at a specific delay location:
int delaylength = int( (delay * SAMPLERATE) + 0.5 ) * 2; // How long in queue entries (2 per frame)?
int rdloc = (wrloc + QUEUESIZE - delaylength) % QUEUESIZE; // Where to read
delayedsampleL = queue[rdloc++];
delayedsampleR = queue[rdloc];
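The stereo push and read can be sketched the same way as a self-contained class. The name StereoDelay and its interface are hypothetical, chosen only for illustration. Because the queue size is always even and wrloc always moves in steps of two, wrloc + 1 and rdloc + 1 can never run off the end of the queue.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper class; the name and interface are illustrative only.
// Left and right samples are interleaved: L at even slots, R at odd slots.
class StereoDelay {
public:
    // maxDelayFrames is the longest delay, in stereo frames.
    explicit StereoDelay(int maxDelayFrames)
        : queue_(2 * (maxDelayFrames + 1)), wrloc_(0) {}   // always an even size

    // Push one stereo frame into the circular queue.
    void Push(short sampleL, short sampleR) {
        wrloc_ = (wrloc_ + 2) % Size();   // Keep wrloc inside the queue
        queue_[wrloc_] = sampleL;
        queue_[wrloc_ + 1] = sampleR;
    }

    // Read the frame pushed delayFrames pushes ago (0 is the newest).
    void Read(int delayFrames, short& delayedL, short& delayedR) const {
        int delaylength = delayFrames * 2;   // two queue entries per frame
        assert(delayFrames >= 0 && delaylength < Size());
        int rdloc = (wrloc_ + Size() - delaylength) % Size();
        delayedL = queue_[rdloc];
        delayedR = queue_[rdloc + 1];
    }

private:
    int Size() const { return int(queue_.size()); }

    std::vector<short> queue_;
    int wrloc_;
};
```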
Here is a complete example of a one second delay implemented in the AudioProcess environment.
void CAudioProcessDoc::OnProcessDelay()
{
    // Call to open the processing output
    if(!ProcessBegin())
        return;

    short audio[2];

    const int QUEUESIZE = 200000;
    const double DELAY = 1.0;

    std::vector<short> queue;
    queue.resize(QUEUESIZE);

    int wrloc = 0;

    double time = 0;
    for(int i=0; i<SampleFrames(); i++, time += 1./SampleRate())
    {
        ProcessReadFrame(audio);

        wrloc = (wrloc + 2) % QUEUESIZE;
        queue[wrloc] = audio[0];
        queue[wrloc+1] = audio[1];

        int delaylength = int( (DELAY * SampleRate() + 0.5)) * 2;
        int rdloc = (wrloc + QUEUESIZE - delaylength) % QUEUESIZE;

        audio[0] = audio[0]/2 + queue[rdloc++]/2;
        audio[1] = audio[1]/2 + queue[rdloc]/2;

        ProcessWriteFrame(audio);

        // The progress control
        if(!ProcessProgress(double(i) / SampleFrames()))
            break;
    }

    // Call to close the generator output
    ProcessEnd();
}
I was able to modify this code to do the flange and the chorus effects in stereo by adding only ONE line of code for each.
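The one-line changes themselves are not shown here, so the following is only a guess at the general idea, not the original modification. Both effects mix the input with a short delay whose length sweeps slowly over time, so one plausible change is to compute the delay from the time variable instead of using a constant. The function names, the sine sweep, and the parameter values in the comments are all my assumptions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: a delay in seconds that sweeps sinusoidally around
// `center` by +/- `depth`, completing `rateHz` sweeps per second.
double SweptDelay(double time, double center, double depth, double rateHz) {
    const double kTwoPi = 6.283185307179586;
    return center + depth * std::sin(kTwoPi * rateHz * time);
}

// Delay length in entries of an interleaved stereo queue (2 per frame),
// rounded to the nearest whole frame as in the examples above.
int StereoDelayLength(double delay, int sampleRate) {
    assert(delay >= 0.0);
    return int(delay * sampleRate + 0.5) * 2;
}

// Inside the processing loop, the constant DELAY could then be replaced by
// something like this (values are assumed, not taken from the original):
//   double delay = SweptDelay(time, 0.006, 0.004, 0.25);
//   int delaylength = StereoDelayLength(delay, SampleRate());
```

A sweep of a few milliseconds is in the usual flange range, and a somewhat longer center delay moves toward a chorus. Keep center minus depth at or above zero so the delay never goes negative.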
Clearly, any delay using this system must be a multiple of the sample period. This is about 22.7 µs for CD audio (1/44,100 of a second), so the granularity is not too bad. If you need finer granularity, you can linearly interpolate between samples, or use more complicated methods involving something called an all-pass filter.
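Linear interpolation between samples can be sketched as follows for a mono queue. The function name ReadInterpolated and its parameters are hypothetical. It takes the delay as a fractional number of samples, reads the two stored samples on either side of that position, and blends them by the fractional part.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: read a mono circular queue at a fractional delay.
// wrloc is the slot holding the newest sample; delaySamples may be
// fractional, for example 1.5 for one and a half samples.
double ReadInterpolated(const std::vector<short>& queue, int wrloc,
                        double delaySamples) {
    const int size = int(queue.size());
    assert(delaySamples >= 0.0 && delaySamples < size - 1);

    int whole = int(delaySamples);           // integer part of the delay
    double frac = delaySamples - whole;      // fractional part, 0 <= frac < 1

    int rd0 = (wrloc + size - whole) % size; // sample at the integer delay
    int rd1 = (rd0 + size - 1) % size;       // the sample one step older

    // Weighted blend of the two neighboring samples
    return (1.0 - frac) * queue[rd0] + frac * queue[rd1];
}
```

The result is a double, so the caller decides how to round it back to a short.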
A problem to watch out for is ending your generation early. In the example above, generation of audio ends when the input file ends. If I am playing a 10 second audio file, the output duration will be 10 seconds. However, if we have a 1 second delay, we really want the output to run 1 second longer to accommodate the audio left in the queue. So, we might add a second loop after the main loop that keeps pushing silence and writing output for another second, so as to play out any audio remaining in the queue. This is called flushing the delay or flushing the queue. If you don't do it, you'll often have audio that seems to end abruptly, cutting off the ending delays.
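Here is a minimal mono sketch of flushing, written against plain vectors rather than the AudioProcess calls so that it stands on its own. The function name DelayWithFlush is hypothetical. Instead of a separate second loop, it runs one loop for the input length plus the delay length and feeds silence once the input runs out, which has the same effect.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mix the input with a delayed copy of itself and keep
// producing output until the delay queue has been flushed.
std::vector<short> DelayWithFlush(const std::vector<short>& input,
                                  int delaySamples) {
    assert(delaySamples >= 0);

    const int size = delaySamples + 1;
    std::vector<short> queue(size);          // zero-filled circular queue
    std::vector<short> output;
    int wrloc = 0;

    // Run past the end of the input by the length of the delay
    const std::size_t total = input.size() + delaySamples;
    for (std::size_t i = 0; i < total; i++) {
        // Feed silence once the input is exhausted
        short sample = (i < input.size()) ? input[i] : short(0);

        wrloc = (wrloc + 1) % size;
        queue[wrloc] = sample;

        int rdloc = (wrloc + size - delaySamples) % size;
        output.push_back(short(sample / 2 + queue[rdloc] / 2));
    }
    return output;
}
```

With a two-sample input and a delay of two samples, the output is four samples long, and the last two contain only the delayed copy.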