Making sounds with analogue electronics – Part 5: Using analogue synthesis
[Part 1 briefly reviews the differences between analogue and digital synthesis, and discusses "one of the major innovations in the development of the synthesizer" – voltage control. Part 2 begins a look at subtractive synthesis with a discussion of VCOs, waveforms, harmonic content, and filters. Part 3 discusses envelopes – the overall ‘shape’ of the volume of a sound, plotted against time. Part 4 looks at amplifiers as well as other modifiers, including LFOs, envelope followers, waveshapers, and modulation.]
3.3.9 Using analogue synthesis
Learning how to make the best use of the available facilities provided by an analogue synthesizer requires time and effort. Although there are a number of ‘standard’ configurations of VCO, VCF, VCA and envelopes, the key to making the most of an analogue synthesizer is understanding how the separate parts work: both in isolation and in combination. If copies can be located, then Roland (1978, 1979) and De Furia (1986) are excellent references for further reading on this subject.
As a brief introduction to some of the techniques of using an analogue synthesizer, the remainder of this section shows how a subtractive analogue synthesizer can be an excellent learning tool for exploring some of the principles of audio and acoustics. Here are some of the demonstrations which can be carried out using a subtractive synthesizer.
Harmonic content of waveforms
The harmonic content of different waveshapes can be audibly demonstrated by using a low-pass VCF with high resonance (set just below self-oscillation) or a narrow band-pass filter. Each VCO waveform is connected to the filter input, and the filter cut-off frequency is slowly increased from zero to maximum (Figure 3.3.37).
FIGURE 3.3.37 By varying the cut-off frequency of a resonant low-pass filter, the harmonic content of a waveform can be heard. As each harmonic present in the spectrum passes through the peak of the filter, it will be clearly heard. The frequency of the harmonic can be determined by noting the frequency of the filter when the harmonic is heard.
As the resonant peak passes the fundamental, the filter output will be a sine wave at that frequency. As the cut-off frequency is increased further, the fundamental sine wave will disappear, and the next harmonic will be heard as the cut-off frequency matches the frequency of the harmonic. The audible result is a series of sine waves whose frequencies match those of the harmonics.
If noise is passed through the filter, then the output will be sine waves whose frequencies will be within the pass-band of the resonant peak, and whose levels will change randomly. The audible result is rather like whistling.
Harmonic content of pulses
The harmonic content of different pulse widths of pulse waveforms can be demonstrated by listening to the pulse waveform and changing the pulse width manually (Figure 3.3.38). At a pulse width of 50%, the sound will be noticeably hollow in timbre: this is a square wave. The square wave position can be heard because the second harmonic, which is one octave above the fundamental, will disappear.
FIGURE 3.3.38 The harmonic content of a square wave and a rectangular wave is different, especially the even harmonics. The second harmonic is not present in a square wave and yet can be clearly heard in a rectangular waveform. This can be used to produce square waves from a VCO which provides control over the width of the pulse. By adjusting the pulse width control and listening for the disappearance of the second harmonic, a square wave can be produced.
Using the resonant filter technique described in the previous example, individual harmonics can be examined – tuning the filter to the harmonic which disappears for a square wave can be used to emphasize this effect. As the pulse width is reduced, the timbre will then become brighter and brighter, and with very small pulse widths, the sound may disappear entirely. (This is a consequence of the design of the VCO circuitry, and not an acoustic effect!) Conversely, increasing the pulse width from 50% produces the same changes in the timbre and, again at very large pulse widths, may result in the loss of the sound.
Filtering
Many resonance and ringing filter effects can be demonstrated by connecting a percussive envelope to a VCF CV input and turning up the resonance. Just below self-oscillation, the filter can be made to oscillate for a short time by using the envelope to trigger the oscillation (Figure 3.3.39).
FIGURE 3.3.39 If a strongly resonant filter is ‘triggered’ by a brief pulse of noise or an envelope pulse, then it can ‘ring’ producing a decaying oscillation at the cut-off or peak frequency.
This ‘ringing oscillator’ is the basis of the designs for many drum machine sounds in the 1970s (see Section 3.3.5 and Figure 3.3.7).
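The ringing behaviour can be imitated numerically. The following is a minimal sketch, assuming that a two-pole digital resonator is an acceptable stand-in for an analogue VCF set just below self-oscillation; the function name, the sample rate and the decay value are illustrative choices rather than anything taken from a real circuit.

```python
import math

def ringing_response(freq_hz, decay_per_sample, sample_rate=48000, n_samples=2000):
    """Response of a two-pole digital resonator to a single trigger pulse.

    Assumption: this digital filter only stands in for an analogue VCF set
    just below self-oscillation; it does not model any real circuit.
    """
    omega = 2.0 * math.pi * freq_hz / sample_rate
    r = decay_per_sample              # pole radius: closer to 1.0 rings longer
    a1 = 2.0 * r * math.cos(omega)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0    # a one-sample 'trigger' pulse
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

ring = ringing_response(freq_hz=200.0, decay_per_sample=0.998)
early_peak = max(abs(v) for v in ring[:500])
late_peak = max(abs(v) for v in ring[1500:])
print(early_peak, late_peak)          # the later peak is much smaller
```

Moving decay_per_sample closer to 1.0 corresponds roughly to turning the resonance up: the ring lasts longer, and at exactly 1.0 the oscillation no longer dies away.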
White noise filtered by a resonant low-pass filter changes from a hiss to a rumble as the cut-off frequency is reduced, because the filter is acting as a narrow bandwidth band-pass filter. With very narrow bandwidths, the noise then begins to produce a sense of pitch; and by connecting the keyboard voltage to the VCF so that it tracks the keyboard, these ‘pitched noise’ sounds can then be played with the keyboard. Keen experimenters might like to compare this with an alternative approach with audibly similar results: modulating the frequency of a VCO with noise.
Beats
Beats occur when two VCOs or audio signals are detuned relative to each other. The interference between the two signals produces a cyclic variation in the overall level as they combine or cancel each other out repeatedly (Figure 3.3.40). The time between the cancellations is related to the difference in frequency between the two audio signals or VCOs. Using two VCOs with a beat frequency of 1 Hz or less produces a ‘lively’, ‘rich’ and interesting sound.
FIGURE 3.3.40 Beats can be demonstrated by mixing together the outputs of two VCOs which have slightly different frequencies. The two waveforms will cyclically add together or subtract, and so produce an output that varies in level. The audible effect is an interesting ‘chorus’ type of sound for frequency differences of less than 2 Hz and vibrato for 2–20 Hz.
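The relationship between detuning and beat rate can be checked with a few lines of code. This sketch assumes two equal-level sine waves (real VCO waveforms are richer, but the fundamentals behave the same way); the function names are illustrative.

```python
import math

def beat_frequency(f1_hz, f2_hz):
    """Rate at which the combined level rises and falls, in hertz."""
    return abs(f1_hz - f2_hz)

def mixed(f1_hz, f2_hz, t):
    """Sum of two equal-level sine waves at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)

def beat_envelope(f1_hz, f2_hz, t):
    """Slowly varying level of the mix: |2 cos(pi (f1 - f2) t)|."""
    return abs(2.0 * math.cos(math.pi * (f1_hz - f2_hz) * t))

f1, f2 = 440.0, 441.0
print(beat_frequency(f1, f2))        # difference between the two frequencies
print(beat_envelope(f1, f2, 0.0))    # the waves reinforce each other
print(beat_envelope(f1, f2, 0.5))    # half a beat later they cancel
```

The envelope follows from the identity sin a + sin b = 2 sin((a + b)/2) cos((a - b)/2): the sine term is the audible pitch, and the cosine term is the slow rise and fall in level.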
PWM uses an LFO to cyclically change the width of a pulse waveform from a single VCO. The result has many of the audible characteristics of two VCOs beating together.
Vibrato versus tremolo
- Vibrato is FM: The frequency of the audio signal is changed. Using an LFO to modulate the frequency of a VCO produces vibrato.
- Tremolo is AM: The level of the audio signal is changed. Using an LFO to modulate the level of an audio signal using a VCA produces tremolo.
Modulation summary and the cyclic variations of vibrato and tremolo are shown in Table 3.3.2 and Figure 3.3.41, respectively.
Table 3.3.2 Modulation Summary
FIGURE 3.3.41 Vibrato is a cyclic variation in the frequency of a sound, whilst tremolo is a cyclic variation in the level of a sound.
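The difference between the two effects can be sketched in code. This is a minimal illustration, assuming a sine-wave carrier and a sine-wave LFO; the sample rate, depth and deviation values are arbitrary choices for the example.

```python
import math

SAMPLE_RATE = 8000  # assumed sample rate for this sketch

def tremolo(carrier_hz, lfo_hz, depth, n_samples):
    """AM: an LFO varies the level, as it would via a VCA."""
    out = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        gain = 1.0 + depth * math.sin(2 * math.pi * lfo_hz * t)
        out.append(gain * math.sin(2 * math.pi * carrier_hz * t))
    return out

def vibrato(carrier_hz, lfo_hz, deviation_hz, n_samples):
    """FM: an LFO varies the frequency, as it would via a VCO's CV input."""
    out = []
    phase = 0.0
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        freq = carrier_hz + deviation_hz * math.sin(2 * math.pi * lfo_hz * t)
        phase += 2 * math.pi * freq / SAMPLE_RATE   # accumulate phase
        out.append(math.sin(phase))
    return out

trem = tremolo(440.0, 5.0, 0.5, SAMPLE_RATE)
vib = vibrato(440.0, 5.0, 10.0, SAMPLE_RATE)
print(max(abs(v) for v in trem))   # rises above 1: the level is modulated
print(max(abs(v) for v in vib))    # stays within 1: only the pitch moves
```

In the vibrato case the frequency is applied by accumulating phase sample by sample, which keeps the waveform continuous as the pitch moves.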
3.4 Additive synthesis
Subtractive synthesis starts out with a harmonically rich sound and ‘subtracts’ some of the harmonics, whereas additive synthesis does almost the exact opposite. It adds together sine waves of different frequencies to produce the final sound. Because large numbers of parameters need to be controlled simultaneously, the user interface is usually much more complex than that of a subtractive synthesizer.
3.4.1 Theory: additive synthesis
Additive synthesis is based on the work produced by Fourier, a French mathematician from the nineteenth century. In 1807, Fourier showed that the shape of any repetitive waveform could be reproduced by adding together simpler waveforms, or alternatively, that any periodic waveform could be described by specifying the frequency and amplitude of a series of sine waves.
The restriction that the waveshape must repeat is imposed to keep the mathematics manageable. Without the restriction it is still possible to convert any waveform into a series of sine waves, but since the waveform is not constant, the sine waves that make it up are not constant either.
One useful analogy is to think of trying to describe writing to someone, who has never seen it, over the telephone. You might start out by describing how the words are broken up into letters and these letters are made up out of lines, dots and curves. This works perfectly well as long as the words you might try to describe stay fixed, but if they change, then you would have to keep updating your description. You could still convey the information about the shape of the letters that make up the words, but you would have to provide lots more detailed description as the letters change.
The simplest example of synthesizing a waveform using Fourier synthesis is a sine wave. A sine wave is made up of just one sine wave, at the same frequency! In terms of harmonics, a sine wave contains just one frequency component, at the repetition rate of the fundamental.
More complicated waveshapes can be made by adding additional sine waves. The simplest method involves using simple integer multiples of the fundamental frequency. So, if the fundamental is denoted by f, then the additional frequencies will be 2f, 3f, 4f, etc. These are the frequencies that occur in some of the basic waveshapes – sawtooth, square, etc., and are known as harmonics. Harmonics are numbered by their position relative to the fundamental, which counts as the first harmonic at a frequency of f, so the second harmonic has a frequency of 2f. The second harmonic is also sometimes called the first overtone (Table 3.4.1).
Table 3.4.1 Harmonics, Frequencies and Overtones
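The numbering convention described above is easy to express in code. This short sketch is not a reproduction of Table 3.4.1; it only applies the two rules stated in the text (harmonic n is at n times f, and harmonic n is overtone n - 1), with a 55-Hz fundamental chosen as an example.

```python
def harmonic_table(fundamental_hz, count):
    """Return (harmonic number, frequency in Hz, overtone label) rows."""
    rows = []
    for n in range(1, count + 1):
        # The first harmonic is the fundamental; overtones start above it.
        overtone = "fundamental" if n == 1 else f"overtone {n - 1}"
        rows.append((n, n * fundamental_hz, overtone))
    return rows

for row in harmonic_table(55.0, 4):
    print(row)
```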
3.4.2 Harmonic synthesis
So far, additive synthesis seems to be based around producing a specific waveform from a series of sine waves. In practice, the ‘shape’ of a waveform is not a good guide to its harmonic content, since minor changes to the shape can produce large changes in the harmonic content.
Conversely, simple changes of phase for the harmonics can produce major changes in the shape of the waveform. In fact, although the human ear is mainly concerned with the harmonic content, the relative phase of the harmonics can be very important at low frequencies. For frequencies above 440 Hz, you can change the phase of a harmonic and thus alter the resulting shape of the waveform, but the basic timbre will sound the same. Control over phase is thus useful under some circumstances and is found in some additive synthesizers.
The harmonic content of waveshapes is a useful starting point for examining this relationship between shape and perception. Mathematically and harmonically, the ‘simplest’ waveshape is the sine wave. Sine waves sound clean and pure, and perhaps even a little bit boring. Adding in small amounts of odd-numbered harmonics produces a triangular waveshape, which has enough harmonic content to stop it sounding quite as pure as the sine wave (Figure 3.4.1).
FIGURE 3.4.1 (i) A triangle waveform constructed from six sine wave harmonics is very different from a sine wave, even though the fundamental is by far the strongest component. (ii) A combination of equal amounts of the first 12 harmonics produces a waveform which looks (and sounds) like a type of pulse waveshape.
A square wave contains only odd harmonics. It has a characteristic ‘hollow’ sound, and the absence of the second harmonic is particularly noticeable if a square wave is compared with a sawtooth wave (Figure 3.4.2).
FIGURE 3.4.2 (i) A square waveform constructed from six sine wave harmonics has a close approximation to the ideal waveshape. (ii) Changing the phase of the third harmonic radically alters the shape of the waveform.
A square wave that has been produced with a phase change in the third harmonic no longer looks like a ‘square’ wave, and yet the harmonic content is the same (Figure 3.4.2).
A sawtooth wave contains both odd and even harmonics. It sounds bright, although many pulse and ‘super-sawtooth’ waveshapes can contain greater levels of harmonics. Again, a sawtooth wave with a phase change in the second harmonic does not look like a sawtooth, although it still sounds like one to the ear (Figure 3.4.3).
FIGURE 3.4.3 (i) A sawtooth waveform constructed from 12 sine wave harmonics has a close approximation to the ideal waveshape. (ii) Changing the phase of the second harmonic radically alters the shape of the waveform.
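The effect of a phase change can be checked numerically. The sketch below builds a 12-harmonic sawtooth from sine waves, inverts the second harmonic, and then compares both the waveform shapes and the measured harmonic levels. The size of the phase change (180 degrees), the 1/n levels and the 256-sample cycle are assumptions made for the example.

```python
import cmath
import math

N = 256  # samples per cycle (an assumed resolution for this sketch)

def additive_cycle(levels, phases):
    """One cycle built from sine harmonics; index 0 is the fundamental."""
    cycle = []
    for i in range(N):
        x = 2 * math.pi * i / N
        cycle.append(sum(a * math.sin((k + 1) * x + p)
                         for k, (a, p) in enumerate(zip(levels, phases))))
    return cycle

def harmonic_level(cycle, harmonic):
    """Magnitude of one harmonic, measured with a single-bin DFT."""
    acc = sum(v * cmath.exp(-2j * math.pi * harmonic * i / N)
              for i, v in enumerate(cycle))
    return 2 * abs(acc) / N

levels = [1.0 / n for n in range(1, 13)]    # sawtooth-like: level 1/n
in_phase = [0.0] * 12
shifted = [0.0, math.pi] + [0.0] * 10       # invert the second harmonic only

saw = additive_cycle(levels, in_phase)
saw_shifted = additive_cycle(levels, shifted)

shape_change = max(abs(a - b) for a, b in zip(saw, saw_shifted))
level_change = max(abs(harmonic_level(saw, h) - harmonic_level(saw_shifted, h))
                   for h in range(1, 13))
print(shape_change, level_change)
```

The shape differs substantially while the measured levels agree to within rounding error, which is the numerical counterpart of two waveforms that look different but share a spectrum.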
Pulse waves contain more and more harmonics as the pulse width narrows (or widens) from square. A 10% pulse has the same spectrum as a 90% pulse and it also sounds the same to the ear. One special case is the square wave, where the even harmonics are missing completely. Pulse widths of anything other than 50% include the second harmonic, and this can usually be clearly heard as the pulse width is varied away from the 50% value.
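These statements about pulse widths can be verified from the Fourier series of an ideal pulse wave, in which harmonic n has a level proportional to |sin(n × pi × d)| / n for a pulse width d expressed as a fraction of the cycle. The sketch below assumes a mathematically ideal pulse swinging between 0 and 1; a real VCO will only approximate this.

```python
import math

def pulse_harmonic_level(n, duty):
    """Level of harmonic n for an ideal pulse wave with a 0-to-1 swing.

    duty is the pulse width as a fraction of the cycle (0.5 = square).
    """
    return abs(2.0 / (n * math.pi) * math.sin(n * math.pi * duty))

for duty in (0.5, 0.25, 0.10, 0.90):
    levels = [round(pulse_harmonic_level(n, duty), 3) for n in range(1, 7)]
    print(f"{duty:.0%}: {levels}")
```

At 50% the even-numbered entries fall to zero, and the 10% and 90% rows match, because sin(n × pi × (1 - d)) differs from sin(n × pi × d) only in sign.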
Finally, there is the ‘even harmonic’ wave. If a sawtooth contains both odd and even harmonics and a square wave contains just the odd harmonics, then what does a wave containing just the even harmonics look like? Actually, it is just another sawtooth wave, but one octave higher in pitch, and with a fundamental frequency of 2f! The even harmonics at 2f, 4f, 6f, etc., are simply all of the integer multiples of 2f.
In practice, adding together sine waves produces waveforms that have some of the characteristics of the mathematically perfect ideal waveforms, but not all. Producing square edges on a square wave would require large numbers of harmonics – an infinite number for a ‘perfect’ square wave. Using just a few harmonics can produce waveforms that have enough of the harmonic content to produce the correct type of timbre, even though the shape of the waveform may not be exactly as expected.
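The trade-off between the number of harmonics and the accuracy of the shape can be explored with a partial sum. This sketch uses the standard Fourier series for a square wave (odd harmonics k at a level of 4 / (pi × k)); the sampling grid used to find the peak is an arbitrary choice.

```python
import math

def square_partial_sum(t, n_harmonics):
    """Sum the first n_harmonics odd harmonics of a square wave.

    t is time in cycles (0.0 to 1.0); harmonic k has level 4 / (pi * k).
    """
    total = 0.0
    for i in range(n_harmonics):
        k = 2 * i + 1                       # 1, 3, 5, ...
        total += 4.0 / (math.pi * k) * math.sin(2 * math.pi * k * t)
    return total

for count in (1, 6, 100):
    mid = square_partial_sum(0.25, count)   # middle of the flat top
    peak = max(square_partial_sum(i / 2000, count) for i in range(1, 1000))
    print(count, round(mid, 3), round(peak, 3))
```

The middle of the flat top settles towards 1 as harmonics are added, but the ripple next to each edge overshoots however many are used, which is one reason why the summed waveform never quite matches the ideal shape.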
3.4.3 Harmonic analysis
In order to produce useful timbres, an additive synthesizer user really needs to know about the harmonic content of real instruments, rather than mathematically derived waveforms. The main method of determining this information is Fourier analysis, which reverses the concept of making any waveform out of sine waves and uses the idea that any waveform can be split into a series of sine waves.
The basic concept behind Fourier analysis is quite simple, although the practical implementation is usually very complicated. If an audio signal is passed through a very narrow band-pass filter that sweeps through the audio range, then the output of the filter will indicate the level of each band of frequencies present in the signal (Figure 3.4.4). The width of this band-pass filter determines how accurate the analysis of the frequency content will be: if it is 100 Hz wide, then the output can only be used to a resolution of 100 Hz, whereas if the band-pass filter has a 1-Hz bandwidth, then it will be able to indicate individual frequencies to a resolution of 1 Hz.
FIGURE 3.4.4 Sweeping the center frequency of a narrow band-pass filter can convert an audio signal into a spectrum: from the time domain to the frequency domain.
For simple musical sounds that contain mostly harmonics of the fundamental frequency, the resolution required for Fourier analysis is not very high. The more complex the sound, the higher the required resolution. For sounds that have a simple structure consisting of a fundamental and harmonics, a rough ‘rule of thumb’ is to make the bandwidth of the filter less than the fundamental frequency, since the harmonics will be spaced at frequency intervals of the fundamental frequency.
Having 1-Hz resolution in order to discover that there are five harmonics spaced at 1-kHz intervals is extravagant. Smaller bandwidths require more complicated filters, and this can increase the cost, size and processing time, depending on how the filters are implemented. Fourier analysis can be achieved using analogue filters, but it is frequently carried out by using digital technology (see Section 5.8).
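A digital version of the swept filter can be sketched by correlating the signal with a sine and a cosine at each frequency of interest. This is a simplified illustration rather than a description of any particular analyser: the sample rate, the test signal and the function name are assumptions, and the analysis length of 0.1 s gives a resolution of roughly 10 Hz, which is comfortably less than the 100-Hz fundamental, in line with the rule of thumb above.

```python
import math

def level_at(samples, sample_rate, freq_hz):
    """Estimate the level of one frequency by correlating with sine and cosine.

    This behaves like a band-pass filter centred on freq_hz with a bandwidth
    of roughly sample_rate / len(samples) hertz.
    """
    re = im = 0.0
    for n, v in enumerate(samples):
        angle = 2 * math.pi * freq_hz * n / sample_rate
        re += v * math.cos(angle)
        im += v * math.sin(angle)
    return 2 * math.hypot(re, im) / len(samples)

SAMPLE_RATE = 8000        # assumed for this sketch
FUNDAMENTAL = 100.0
# Test signal: three harmonics at known levels (1.0, 0.5 and 0.25).
signal = [sum(a * math.sin(2 * math.pi * FUNDAMENTAL * h * n / SAMPLE_RATE)
              for h, a in ((1, 1.0), (2, 0.5), (3, 0.25)))
          for n in range(SAMPLE_RATE // 10)]    # 0.1 s of signal

spectrum = {h: round(level_at(signal, SAMPLE_RATE, FUNDAMENTAL * h), 3)
            for h in range(1, 6)}
print(spectrum)
```

Stepping the probe frequency in increments of the fundamental, rather than 1 Hz at a time, is the economy that the text describes: only the frequencies where harmonics are expected need to be examined.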
Numbers of harmonics
How many separate sine waves are needed in an additive synthesizer? Supposing that the lowest fundamental frequency which will be required to be produced is a low A at 55 Hz, then the harmonics will be at 110, 165, 220, 275, 330, 385, 440 Hz,… The 32nd harmonic will be at 1760 Hz and the 64th harmonic at 3520 Hz.
An A at 440 Hz has a 45th harmonic of 19,800 Hz. Most additive synthesizers seem to use between 32 and 64 harmonics (Table 3.4.2).
Table 3.4.2 Additive Frequencies and Harmonics
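The arithmetic behind these figures can be reproduced as follows. The 20-kHz ceiling is an assumption introduced here as a nominal upper limit of hearing; it is not stated in the text.

```python
UPPER_LIMIT_HZ = 20_000   # assumed nominal upper limit of hearing

def harmonic_frequency(fundamental_hz, n):
    """Frequency of the nth harmonic (n = 1 is the fundamental)."""
    return fundamental_hz * n

def harmonics_below_limit(fundamental_hz, limit_hz=UPPER_LIMIT_HZ):
    """How many harmonics fit at or below limit_hz."""
    return int(limit_hz // fundamental_hz)

print(harmonic_frequency(55, 32), harmonic_frequency(55, 64))
print(harmonic_frequency(440, 45))
print(harmonics_below_limit(440))
print(harmonics_below_limit(55))
```

Under that assumption, a note at 440 Hz runs out of audible harmonics at about the 45th, whereas a 55-Hz note could in principle use several hundred, far more than the 32 to 64 that most instruments provide.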
Harmonic and inharmonic content
Real-world sounds are not usually deterministic: they do not contain just simple harmonics of the fundamental frequency. Instead, they also have additional frequencies that are not simple integer multiples of the fundamental frequency.
The following are several types of these unpredictable ‘inharmonic’ frequencies:
- noise
- beat frequencies
- sidebands
- inharmonics.
Noise has, by definition, no harmonic structure, although it may be present only in specific parts of the spectrum: colored noise. So any noise which is present in a sound will appear as random additional frequencies within those bands, with levels and phases that are also random.
Beat frequencies arise when the harmonics in a sound are not perfectly in tune with each other. ‘Perfect’ waveshapes are always assumed to have harmonics at exact multiples of the fundamental, whereas this is not always the case in real-world sounds. If a harmonic is slightly detuned from its mathematically ‘correct’ position, then additional harmonics may be produced at the beat frequency, so if a harmonic is 1 Hz too high in pitch relative to the fundamental, then a frequency of 1 Hz will be present in the spectrum.
Sidebands occur when the frequency stability of a harmonic is imperfect, or when the sound itself is frequency modulated. Both cases result in pairs of frequencies which mirror around the ‘ideal’ frequency. So a 1-kHz sine wave which is frequency modulated with a few hertz will have a spectrum that contains frequencies on either side of 1 kHz, and the exact content will depend on the depth of modulation and its frequency. See Section 3.5.1 for more details.
Inharmonics are additional frequencies that are structured in some way, and so are not noise, but which do not have the simple integer multiple relationship with the fundamental frequency. Timbres that contain inharmonics typically sound like a ‘bell’ or ‘gong’.
Many additive synthesizers only attempt to produce the harmonic frequencies, with perhaps a simple noise generator, as well. This deterministic approach limits the range of sounds which are possible, since it ignores many stochastic, probabilistic or random elements which make up real-world sounds.
3.4.4 Envelopes
The control of the level of each harmonic over time uses EGs and VCAs. Ideally, one EG and one VCA should be provided for each harmonic. This would mean that the overall envelope of the final sound was the result of adding together the individual envelopes for each of the harmonics, and so there would be no overall control over the envelope of the complete sound. Adding an overall EG and VCA to the sum of the individual harmonics allows quick modifications to be made to the final output (Figure 3.4.5).
FIGURE 3.4.5 Individual envelopes are used to control the harmonics, but an overall envelope allows easy control over the whole sound which is produced.
In order to minimize the number of controls and the complexity, the EGs need to be as simple as possible without compromising the flexibility. Delayed ADR (DADR) envelopes are amongst the easiest of EGs to implement in discrete analogue circuitry, since the gate signal can be used to control a simple capacitor charge and discharge circuit to produce the ADR envelope voltage. DADR envelopes also require only four controls (delay time, attack time, decay time and release time), whereas a DADSR would require five controls and more complex circuitry. If integrated circuit (IC) EGs are used, then the ADSR envelope would probably be used, since most custom synthesizer chips provide ADSR functionality.
Control grouping and ganging
With large numbers of harmonics, having separate envelopes for each harmonic can become very unwieldy and awkward to control. The ability to assign a smaller number of envelopes to harmonics can reduce the complexity of an additive synthesizer considerably. This is only effective if the envelopes of groups of harmonics are similar enough to allow a ‘common’ envelope to be determined. Similarly, ganging together controls for the level of groups of harmonics can make it easy to make rapid changes to timbres – altering individual harmonics can be very time consuming. Simple groupings such as ‘all of the odd’ or ‘all of the even’ harmonics, can be useful starting points for this technique.
A more advanced use for grouping involves using keyboard voltages to give pitch-dependent envelope controls. This can be used to create the effect of fixed resonances or ‘formants’ at specific frequencies.
Filter simulation/emulation
Filters modify the harmonic content of a sound. In the case of an additive synthesizer, there are two ways that this can be carried out: with a filter or with a filter emulation. As with the overall envelope control mentioned earlier, there are advantages to having a single control for the combined harmonics, and a VCF could be added just before the VCA. Such a filter would only provide crude filtering of the sound, in exactly the same way as in subtractive synthesis.
Filter emulation uses the individual EGs for the harmonics to ‘synthesize’ a filter by altering the envelopes. For example, if the envelopes of higher harmonics are set to have progressively shorter decay times, then when a note is played, the high harmonics will decay first (Figure 3.4.6). This has an audible effect which is very similar to a low-pass filter being controlled by a decaying envelope. The difference is that the ‘filter’ is the result of the action of all the envelopes, rather than one envelope. Consequently, individual envelopes can be changed, allowing control over individual harmonics that would not be possible using a single VCF.
FIGURE 3.4.6 By using different envelopes for each harmonic, a filter can be ‘synthesized’. This example shows the equivalent of a low-pass filter being produced by a number of different decaying envelopes.
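The idea can be sketched with a simple model in which each harmonic has its own exponential decay. The rule used here, where the decay time of harmonic n is the base decay time divided by n, is only one possible choice made for illustration, as is the starting mix of sawtooth-like 1/n levels.

```python
import math

def harmonic_envelope(n, t, base_decay_s=1.0):
    """Envelope value of harmonic n at time t (seconds after the note starts).

    Assumption: the decay time shrinks as base_decay_s / n, which is just
    one way of making the higher harmonics decay faster.
    """
    decay_s = base_decay_s / n
    return math.exp(-t / decay_s)

def spectrum_at(t, n_harmonics=8):
    """Harmonic levels at time t, starting from a sawtooth-like 1/n mix."""
    return [harmonic_envelope(n, t) / n for n in range(1, n_harmonics + 1)]

for t in (0.0, 0.25, 1.0):
    print(t, [round(level, 3) for level in spectrum_at(t)])
```

Reading down the printed rows, the upper harmonics fade well before the fundamental, much as they would with a low-pass filter whose cut-off frequency is being swept downwards. Changing the rule for a single harmonic would alter just that harmonic, which a single VCF cannot do.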
As with the envelope control ganging and grouping, similar facilities can be used to make filter emulation easier to use, although the implementation of this is much easier in a fully digital additive instrument.
3.4.5 Practical problems
Analogue additive synthesis suffers from a number of design difficulties. Generating a large number of stable, high-purity sine waves simultaneously can be very complex, especially if they are not harmonically related. Providing sufficient controls for the large number of available parameters is also a problem.
Depending on the complexity of the design, an additive synthesizer might have the following parameters repeated for each harmonic:
- frequency (fixed harmonic or variable inharmonic)
- phase
- level
- envelope (DADR, DADSR or multi-segment – four or more controls).
For a 32-harmonic additive synthesizer, eight parameters per harmonic (for example, frequency, phase and level, plus the five controls of a DADSR envelope) give a total of just over 250 separate controls (32 × 8 = 256), ignoring any additional controls for ganging and filter emulation. Although it is possible to assemble an additive synthesizer using analogue design techniques, practical realizations of additive synthesizers have tended to be digital in nature, where the generation and control problems are much more easily solved.
Spectrum plots
The subtractive and additive sections in this chapter have both shown plots of the harmonic content of waveforms, with level plotted against a frequency axis. This ‘harmonic content’ graph is called a spectrum, and it shows the relative levels of the frequencies in an audio signal. Whereas a waveform plot shows how the value of a signal changes with time, a spectrum shows the harmonic content of a sound. The shape of a waveform is not a very good indication of the harmonic content of a sound, whereas a spectrum is – by definition.
Spectra (the plural of the Latin-derived word ‘spectrum’) are not very good at showing any changes in the harmonic content of a sound – in much the same way that a single cycle of a PWM waveform does not convey the way that the width of the pulse is changing over time. To show changes in spectra, a ‘waterfall’ or ‘mountain’ graph is used, which effectively ‘stacks’ several spectra together. The resulting 3D-like representation can be used to show how the frequency content changes with time (Figure 3.4.7).
FIGURE 3.4.7 A spectrum is a plot of frequency against level. It thus shows the harmonic content of an audio signal. In most of the examples in this book, the horizontal axis is normally shown with harmonic numbers instead of frequencies – the 55-Hz sine wave spectrum shows the correspondence with frequency. When a spectrum changes with time, then a ‘mountain’ graph may be used to show the changes in the shape.
Coming up in Part 6: Other methods of analogue synthesis.
Printed with permission from Focal Press, a division of Elsevier. Copyright 2009. "Sound Synthesis and Sampling" by Martin Russ. For more information about this title and other similar books, please visit www.elsevierdirect.com.