Step 3: Audio Processing

In this step you will work with a Visual Studio program that supports basic audio processing. The program is designed around two basic types of audio manipulation: generation and processing. Generation is the creation of audio from scratch.  Processing is the manipulation of existing audio.  We'll start with generation and then move to processing.

**Required Step Result**

See the What you hand in section at the end of this document.

Getting the program running

You need to get your own local copy of the AudioProcess program that you'll be working with during the step. The AudioProcess project has everything we'll be using.

Once you have the files copied, open the AudioProcess directory and double-click on the AudioProcess solution file. When Visual Studio 2008 is open, compile the program by selecting Build/Build Solution. You must have the Microsoft Windows SDK and DirectX SDK installed for this program to compile.  Note that this program is compiled with the UNICODE switch enabled, so all strings are wide character strings.  Instead of char, use wchar_t or WCHAR, and instead of string, use wstring.
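
As a small illustration of the wide-character convention, here is a sketch of a helper that builds a file name. MakeWaveName is a made-up name for this example and is not part of AudioProcess.

```cpp
#include <string>

// Hypothetical helper (not part of AudioProcess): appends ".wav" to a base
// name. The L prefix on the literal makes it a wide string.
std::wstring MakeWaveName(const std::wstring &base)
{
    return base + L".wav";
}
```

Calling MakeWaveName(L"234") would give the wide string L"234.wav".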

This is a multiple document interface program with multiple document types.  CGenerateDoc is the document for generating audio (synthesis) and CProcessDoc is the document for processing audio (where a file is loaded in and modified). 

Generating Your Own Audio

Execute the program. This program uses multiple document interface windows of two types: generators and processors. If you select File/New, the initial window is a generator. You'll see that it has several default parameters, some of which are displayed on the screen.   The Generate menu has options to perform specific generation operations.  Selecting Sine Wave will generate a sine wave at 1000Hz for 10 seconds.  After doing so, you should see a screen that looks like this:

The bottom part of the window displays the waveforms for any audio you work with.  The left slider is the scale. Moving the slider up shows more audio on the screen at a time.  The lowest setting is 1 pixel per sample.  The lower scroll bar scrolls the waveform on the display.  Only 10 seconds of generated audio is saved for display using this program.  Should you need more, see the member variable m_waveformBuffer and the SetCapacity member function.

The program defaults to 1000 Hz tone generation on each of two channels at 44,100 Hz sample rate. The duration is in seconds. The amplitude is from 0 to 32767, the range of a 16 bit integer. AudioProcess only supports 16 bit samples. Select Generate:Sine Wave.  By default, the tone is played on the speakers. You'll see options under the Generate menu for generation to a file or directly to audio.  Both can be selected.  Experiment with these to see what they do.  Note that the audio generation assumes your program can generate at full speed.  That may not always be the case.  If your program cannot keep up with the audio playback, you may notice gaps in the playback.  Consider switching to release mode when this happens.

To change the parameters for the generator, select Generate:Parameters and change values in the Dialog box. What these are:

Try some various sine wave combinations including some low frequencies and some high frequencies. Let me know what you break in this program. :) Get used to using it.

Aliasing Experiments

Set the parameters to 8000 Hz sampling and 1 channel. You'll only need to change Frequency 1. Now generate and audition the following tones: 1000, 2000, 3000, 3500, 4000, 4500, 5000, 7000, 7500, 8500. What happens as the tones exceed the Nyquist frequency? Based on your knowledge of a sine wave, write a brief description of your observations and your theory as to the cause.

Adding Your Own Generators

I have provided one example generator.  Take a look at the code for the sine generator in AudioGenerateDoc.cpp:

//
// Name :        CAudioGenerateDoc::OnGenerateSinewave() 
// Description : Example procedure that generates a sine wave.
//               The sine wave frequency is set by m_freq1
//

void CAudioGenerateDoc::OnGenerateSinewave() 
{
   // Call to open the generator output
   if(!GenerateBegin())
      return;

   short audio[2];

   for(double time=0.;  time < m_duration;  time += 1. / m_sampleRate)
   {                 
      audio[0] = short(m_amplitude * sin(time * 2 * M_PI * m_freq1));
      audio[1] = short(m_amplitude * sin(time * 2 * M_PI * m_freq1));

      GenerateWriteFrame(audio);

      // The progress control
      GenerateProgress(time / m_duration);
   }

   
   // Call to close the generator output
   GenerateEnd();
}

What you see in this code are the following operations: 

You can add a new menu option to the Generate menu and create your own generation functions like this one.  Be sure to examine the existing procedures so you will know what they do.
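
If you want to experiment with the generation loop outside the MFC program, the following is a minimal sketch under some assumptions: GenerateSine is a hypothetical free function, it collects interleaved stereo samples in a std::vector instead of calling GenerateWriteFrame, and it computes the time from an integer frame counter rather than adding 1/m_sampleRate on each pass. That last point is a choice made for this sketch so that rounding error does not build up over a long duration; it is not how the supplied code works.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-alone version of the sine generator loop.
// Returns interleaved stereo samples: left, right, left, right, ...
std::vector<short> GenerateSine(double freq, double amplitude,
                                double duration, int sampleRate)
{
    const double kTwoPi = 6.283185307179586;
    const int frames = static_cast<int>(duration * sampleRate);

    std::vector<short> out;
    out.reserve(static_cast<std::size_t>(frames) * 2);

    for (int i = 0; i < frames; i++)
    {
        // Time of this frame, derived from the frame index
        const double time = static_cast<double>(i) / sampleRate;
        const short s = static_cast<short>(amplitude * std::sin(time * kTwoPi * freq));

        out.push_back(s);   // channel 0
        out.push_back(s);   // channel 1
    }

    return out;
}
```

For example, GenerateSine(1000.0, 32767.0, 1.0, 8000) would produce 8000 frames, or 16000 samples in total.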

Adding harmonics

Notice: you should be able to do the following two options by copying the default generation loop (about 10 lines) and adding 5 lines (including braces). If you find you are doing a lot more, you're doing too much...

Create a new menu option "234" that adds to the basic sine wave generator for each channel the second, third, and fourth harmonics at amplitudes 1/2, 1/3, and 1/4 of the fundamental amplitude. Create a file called 234.wav.

Create a new menu option "357" that adds to the basic sine wave generator for each channel the third, fifth, and seventh harmonics at amplitudes 1/3, 1/5, and 1/7 of the fundamental amplitude. Create a file called 357.wav. Compare the sound of these two files.

All harmonics

Create a generate procedure that generates ALL harmonics up to the Nyquist frequency.  The amplitude of the harmonics should be a/h, where a is the amplitude of the fundamental and h is the harmonic number.  Call the output file all.wav. 

Create a generate procedure that generates all ODD harmonics up to the Nyquist frequency.  The amplitude of the harmonics should be a/h, where a is the amplitude of the fundamental and h is the harmonic number.  Call the output file allodd.wav.
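
One way to decide how many harmonics fit is to count how many multiples of the fundamental lie below half the sample rate. The sketch below assumes a hypothetical helper named HighestHarmonic and treats "up to the Nyquist frequency" as strictly below it; both are assumptions made for illustration.

```cpp
// Hypothetical helper: returns the largest harmonic number h such that
// h * fundamental is strictly below the Nyquist frequency (sampleRate / 2).
// Returns 0 if even the fundamental does not fit.
int HighestHarmonic(double fundamental, double sampleRate)
{
    if (fundamental <= 0.0)
        return 0;

    const double nyquist = sampleRate / 2.0;

    int h = 0;
    while ((h + 1) * fundamental < nyquist)
        h++;

    return h;
}
```

With a 1000 Hz fundamental at 44,100 Hz sampling, this returns 22, since 22,000 Hz is below the 22,050 Hz Nyquist frequency and 23,000 Hz is not. Keep in mind that summing harmonics can push the peak sample value above the fundamental amplitude, so the amplitude may need to be reduced to stay within 16 bits.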

Describe the differences you hear in the sound when you add in the additional harmonics beyond those added in 234 and 357 above. 

Processing Existing Audio

Execute the program. Do File/Open to open a file in the process application.  I've put some sample audio files at  \\samba\cse471\Media\Audio.  You can use other files you find.  The Process menu has options to perform specific processing operations.  Selecting Copy will copy the audio file you have opened.  Note that my program will read WAV and MP3 files (most anything that MediaPlayer can read should be readable with this program). 

To change the parameters for the processor, select Process:Parameters and change values in the Dialog box. What these are:

Try some various amplitude values.  Note that my simple solution does not range check the audio.  Be sure to try an amplitude greater than 1.0.  If you keep increasing the amplitude, you'll find a point where it overflows.
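
For comparison, a range check could look like the sketch below. ClampToShort is a hypothetical helper, not something in AudioProcess; it assumes the scaled sample is computed as a double before being converted back to 16 bits.

```cpp
// Hypothetical helper: limits a scaled sample to the 16-bit range
// instead of letting the conversion to short overflow.
short ClampToShort(double sample)
{
    if (sample > 32767.0)
        return 32767;
    if (sample < -32768.0)
        return -32768;

    return static_cast<short>(sample);
}
```

In the Copy loop this would replace the plain cast, for example ClampToShort(audio[0] * m_amplitude). Clamping avoids wraparound, but a clipped waveform is still distorted.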

Take a look at the code for the Copy function:

void CAudioProcessDoc::OnProcessCopy() 
{
   // Call to open the processing output
   if(!ProcessBegin())
      return;

   short audio[2];

   for(int i=0;  i<SampleFrames();  i++)
   {                 
      ProcessReadFrame(audio);

      audio[0] = short(audio[0] * m_amplitude);
      audio[1] = short(audio[1] * m_amplitude);

      ProcessWriteFrame(audio);

      // The progress control
      if(!ProcessProgress(double(i) / SampleFrames()))
         break;
   }

   
   // Call to close the processing output
   ProcessEnd();
}

What you see in this code are the following operations: 

You can add a new menu option to the Process menu and create your own processing functions like this one.  Be sure to examine the existing procedure so you will know what it does.

Amplitude, Ramps, and Time

Amplitude control is performed by multiplication by a constant. This works just like the volume control on your stereo. If you multiply the samples by 0.5, you are turning the volume down by 1/2. Note that you have to be careful, since the samples are all fixed point right now. Write a new menu option: "Ramp" that fades in the audio you are processing for 2 seconds at the beginning and fades it out for 2 seconds at the end of the selection. To determine the duration of the audio, use the SampleFrames() function to tell how many audio frames are available and divide by the sample rate (available using the SampleRate() function).  Be sure to see the OnProcessCopy() function for an example of processing audio.
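
The fade can be thought of as a gain between 0 and 1 that depends on where the current frame falls in the file. Here is a minimal sketch of that idea, assuming a hypothetical helper named RampGain that is handed the frame index, the total frame count (as returned by SampleFrames()), and the sample rate (as returned by SampleRate()). It is one possible formulation, not the required solution.

```cpp
#include <algorithm>

// Hypothetical helper: gain for a linear fade-in over the first rampSeconds
// and a linear fade-out over the last rampSeconds of the audio.
double RampGain(int frame, int totalFrames, double sampleRate, double rampSeconds)
{
    const double t = frame / sampleRate;              // time of this frame
    const double duration = totalFrames / sampleRate; // length of the audio

    double gain = 1.0;

    if (t < rampSeconds)
        gain = t / rampSeconds;                       // fading in

    if (duration - t < rampSeconds)
        gain = std::min(gain, (duration - t) / rampSeconds);  // fading out

    return gain;
}
```

Each sample in the frame would then be multiplied by this gain before it is written. Taking the minimum of the two ramps keeps the result sensible even if the file is shorter than 4 seconds.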

Tremolo

Tremolo is a change in the amplitude of audio so the amplitude increases and decreases rapidly.  A 5 Hz tremolo would increase and decrease the amplitude 5 times per second.  The depth of the tremolo is the amount of change in amplitude as a percentage.  A 10% tremolo depth would increase the amplitude by 10% at some times and decrease the amplitude by 10% at other times.  This can be written as an equation:

a = 1 + d * sin(f * 2 * pi * t)   

In this equation, a is the new amplitude. A value of 1.1 means the audio should be 1.1 times as loud.  d is the tremolo depth.  A value of 0.1 is a 10% depth.  f is the tremolo frequency in hertz, and t is the time in seconds.  Implement a tremolo effect that adds tremolo with a depth of 20% and a frequency of 4 Hz to the audio being processed.
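
The equation translates almost directly into code. The sketch below assumes a hypothetical helper named TremoloGain and takes t to be the time in seconds of the frame being processed (frame index divided by sample rate); the constant kPi is defined locally because M_PI is not guaranteed by the C++ standard.

```cpp
#include <cmath>

// Hypothetical helper: amplitude multiplier a = 1 + d * sin(f * 2 * pi * t)
double TremoloGain(double depth, double freq, double t)
{
    const double kPi = 3.14159265358979323846;
    return 1.0 + depth * std::sin(freq * 2.0 * kPi * t);
}
```

With a depth of 0.2 and a frequency of 4 Hz, the gain starts at 1.0, rises to about 1.2 a sixteenth of a second later, and falls to about 0.8 at three sixteenths of a second.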

Playing at the wrong speed

Add a menu option Fast that drops every other sample, effectively playing the file at double speed. Likewise, add an option called Slow that plays every sample twice. Suggest a method for speeding up by 10%.  Note:  Dropping every 10th sample will not work; it actually is equivalent to speeding the playback up by 11.1%. I want a method that speeds up by exactly 10%. You cannot change the sample rate.
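
To see where the 11.1% figure comes from: dropping one sample in every n leaves n - 1 samples to play in the time that n used to take, so the speed ratio is n / (n - 1). A tiny sketch, with SpeedupFromDropping as a hypothetical helper name:

```cpp
// Hypothetical helper: playback speed ratio that results from dropping
// one sample out of every n (n must be at least 2).
double SpeedupFromDropping(int n)
{
    return static_cast<double>(n) / (n - 1);
}
```

For n = 10 this is 10/9, about 1.111, and for n = 2 (the Fast option) it is exactly 2.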

Backwards Playback

There has been a lot of silliness over the years about "backwards" messages.  See Reverse Speech for examples and a lot of stupid information.  If you dig around a bit, you'll find my favorites: hidden messages from the unconscious of Osama Bin Laden.  Note that these backmasked messages are in English, even if the speaker doesn't know English (but, their subconscious does, right?)  Unfortunately, the audio links seem to be down, now.

Try this experiment with any of the examples on the Reverse Speech site:  Listen to the audio without reading what you are supposed to hear.  In all likelihood you won't hear any messages at all. Note that they make it rather hard to do this on their site for some reason :) Then, listen after reading what you are supposed to hear.  Do you hear it now? 

Write a new option Backwards that will play your audio forward, then in reverse.  Note that you can't read an audio file backwards, so you will have to allocate memory to hold the audio.  Be sure your solution does not leave any memory allocated when it is done.
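
One possible way to hold the audio is a std::vector, which releases its memory automatically when it goes out of scope. The sketch below is an illustration under stated assumptions, not the required solution: ForwardThenReverse is a hypothetical function that takes interleaved stereo samples already read into memory (for example, by calling ProcessReadFrame in a loop) and returns them followed by the same frames in reverse order. Frames are reversed as pairs so the left and right channels stay in place.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: input is interleaved stereo (left, right, ...).
// Output is the input frames followed by the same frames in reverse order.
std::vector<short> ForwardThenReverse(const std::vector<short> &audio)
{
    const std::size_t frames = audio.size() / 2;   // ignore a trailing odd sample

    std::vector<short> out;
    out.reserve(frames * 4);

    // Forward pass
    out.insert(out.end(), audio.begin(), audio.begin() + frames * 2);

    // Reverse pass, one whole frame at a time
    for (std::size_t i = frames; i > 0; i--)
    {
        out.push_back(audio[(i - 1) * 2]);       // left
        out.push_back(audio[(i - 1) * 2 + 1]);   // right
    }

    return out;
}
```

Given the three frames (1,2), (3,4), (5,6), the result is (1,2), (3,4), (5,6), (5,6), (3,4), (1,2).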

What you hand in

Programming Problems - Due 10-1-09 at 11:59pm

Submit a .zip file with your program solution in it.

Written Problems - Due 10-5-09 at 11:59pm

There are three questions above in blue.  Please submit a written answer to each of these questions.

CSE 471