Digital Signal Processing for Audio Applications: Volume 1 - Formulae
About this ebook
In the summer of 2003 we began designing multi-track recording and mixing software – Orinj at RecordingBlogs.com – a software application that will take digitally recorded audio tracks and will mix them into a complete song with all the needed audio production effects. Manipulating digital sound, as it turned out, was not easy.
Digital Signal Processing for Audio Applications - Anton R Kamenov
Chapter 1. Introduction
In practice, sound is a complex multitude of waves with various properties. When this sound is recorded digitally, however, it becomes simply a collection of numbers. The properties of the many waves remain hidden within the digitized data. In the world of audio then, the task of digital signal processing – the processing of digitized audio signals – is to manipulate these digital data with little knowledge of their precise properties.
To be sure, contemporary music is likely digital. Since a digital audio signal is simply a set of numbers, there may be operations that are mathematical in nature and that allow us to modify these numbers to the needed end. Contemporary music mixing and mastering relies heavily on such mathematical manipulations, whether for equalizing, reverberating, introducing distortion, or otherwise producing the recorded sound.
The purpose of this book is to present a simple, structured approach to understanding how digitally recorded sound can be manipulated. While the theory of digital signal processing may seem complex, the corresponding mathematical computations are not. Digital signal processing can be presented in a simple and transparent manner, one that is easy for hobbyists to understand and implement, and one that requires little mathematical or engineering background. After all, when properly explained, many of the practical applications of this mathematics reduce to simple algebra and some trigonometric identities.
Digital signal processing, or DSP, is the science of manipulating digital signals. It is not specific to music, but is used wherever digital signals are present, from cell phones and television transmissions, to digital images and digital blood pressure monitors. The fundamental task of DSP is to manipulate a digitally recorded signal using nothing more than the information contained in the digital representation of that signal.
DSP for audio is of specific interest for two reasons. First, manipulating digital audio data is relatively simple. Mathematically, these data are a one-dimensional array of information – a function of time. Second, audio production uses a wide variety of manipulations, including simple ones such as combining two audio signals during mixing, more complex ones such as changing the frequency content of audio data during equalizing, or esoteric ones such as purposefully introducing errors in the data during dithering. Audio data is the ideal candidate for an intuitive description of the full power of DSP.
Chapter 2. Simple waves in continuous time
To manipulate signals mathematically, we must first give them a useful mathematical representation. The purpose of this and the following several chapters is to show that complex practical signals, including sound, can be represented as collections of simple sine and cosine waves with various properties.
We begin by examining the properties of simple sine and cosine waves. Consider the following function of time t.
Equation 2.1
$$x(t) = \cos(t)$$
The cosine wave x(t) is a continuous function that oscillates up and down as the time t increases in the realm of real numbers as in figure 1.
As simple as this function is, it has several important properties. It peaks at 1. Its value at its trough is -1. It completes one of its cycles in a period of time equal to 2π. It starts with cos(0) = 1 at t = 0, is positive for 0 < t < π/2, zero at t = π/2, negative for π/2 < t < 3π/2, and 1 again at t = 2π, since cos(2π) = 1.
Figure 1. A simple cosine wave
The function cos(t) starts with 1 at t = 0, has peak value of 1, and completes a cycle in a period of 2π.
2.1. Initial phase
Not all waves peak at t = 0. Suppose that instead of starting a cycle at t = 0, the wave begins with a peak at t = τ. Such a wave would be described by the formula
Equation 2.2
$$x(t) = \cos(t - \tau)$$
When t = τ, we have x(τ) = cos(τ – τ) = cos(0) = 1 and so this wave begins its cycle with a peak at t = τ.
The quantity τ is the wave's initial phase. The waves cos(t) and cos(t – τ) have the same peak values and cycles of the same length, but have different initial phase. The wave cos(t) has initial phase of 0.
Figure 2 compares three waves – ones with τ = 0, τ = -1.2, and τ = 3. Larger values of τ shift the wave to the right and smaller (or larger negative) values of τ shift the wave to the left by the amount of τ.
Figure 2. Simple waves with different initial phase
These three simple waves have the same cycle lengths and peak values, but have different initial phase: 0, -1.2, and 3.
Note that a wave with initial phase of τ = 2π would be indistinguishable from the original wave cos(t). While the initial phase of those two waves is different – 0 and 2π – the change in the initial phase just happens to be the same as the length of the wave's cycle and the two waves coincide. Note also that a wave with the phase τ = π is the same as the wave cos(t), but inverted as in figure 3. This happens because the phase τ = π is exactly one-half of the length of the wave's cycle. These two waves are said to have an inverted phase or inverted polarity, although it would be proper to say simply that the initial phases of the two waves differ by half of their cycle.
Figure 3. Two waves with inverted phase
When the initial phase of two waves with the same cycle lengths differs by ½ of their cycle, these waves are said to have inverted phase.
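The statements in this section can be checked numerically. The following Python sketch evaluates cos(t – τ) at a few points; the function name shifted_wave and the test time t = 1.0 are illustrative choices, not part of the original text.

```python
import math

def shifted_wave(t, tau):
    """Value at time t of the wave cos(t - tau), whose initial phase is tau."""
    return math.cos(t - tau)

# A wave with initial phase tau peaks at t = tau.
tau = 3.0
peak_value = shifted_wave(tau, tau)

# A phase of 2*pi (one full cycle) reproduces the original wave cos(t).
t = 1.0  # arbitrary test time (an assumption for illustration)
same_as_original = shifted_wave(t, 2 * math.pi)

# A phase of pi (half a cycle) inverts the original wave.
inverted = shifted_wave(t, math.pi)

print(peak_value)
print(same_as_original, math.cos(t))
print(inverted, -math.cos(t))
```

Each printed pair should agree up to floating-point rounding.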
2.2. Peak amplitude
If, in addition to changing the initial phase, we wanted to change the value of the wave at its peak, we could use the formula that follows.
Equation 2.3
$$x(t) = A \cos(t - \tau)$$
The quantity A is the value of the peak of the wave and is called the wave's peak amplitude. Larger values for A result in waves with larger peaks and troughs and smaller values for A result in smaller peaks and troughs. Figure 4 compares the waves cos(t), cos(t + 1.2), and 0.6 cos(t + 1.2).
The waves cos(t) and cos(t + 1.2) have the same peak amplitude, but different initial phase. The waves cos(t + 1.2) and 0.6 cos(t + 1.2) have the same initial phase, but different peak amplitude. The peak amplitude of 0.6 cos(t + 1.2) is exactly 0.6.
Figure 4. Waves with different phase and amplitude
cos(t) and cos(t + 1.2) have the same peak amplitude, but different initial phase. cos(t + 1.2) and 0.6 cos(t + 1.2) have the same initial phase, but different peak amplitude.
2.3. Frequency
Since all waves discussed above complete a cycle in a period of time equal to 2π, these same waves will complete 1/(2π) cycles in a period of time equal to 1. This means that these waves have the frequency 1/(2π). If we want a wave with a cycle of 1, then we should traverse time 2π times faster. If we want a wave with a cycle of ½, then we should traverse time 2π · 2 times faster. In general, if we want a cycle of 1/f, then we should traverse time 2π f times faster. The formula
Equation 2.4
$$x(t) = A \cos(2\pi f (t - \tau))$$
describes a wave that has initial phase of τ, peak amplitude of A, and a cycle of 1/f or, equivalently, a frequency of f. For example, when f = 3 cycles per unit of time the wave above will complete a cycle in a period of time equal to 1/3. At t = 1/3 the wave will become A cos(2π f τ) and will have the same value as at t = 0.
Figure 5 shows three waves with different frequencies – cos(t), cos(0.5 t), and 0.4 cos(2π t). The frequency of cos(t) is 1/(2π) cycles per unit of time. The wave cos(0.5 t) has a frequency of 1/(4π) cycles per unit of time. The wave 0.4 cos(2π t) has a frequency of 1 cycle per unit of time. A larger f implies cycles of smaller length and a smaller f implies cycles of larger length.
Figure 5. Waves with different amplitude and frequency
cos(t) and cos(0.5 t) have the same amplitude, but different frequencies. cos(t) and 0.4 cos(2π t) have different amplitude and different frequency.
The wave
Equation 2.5
$$x(t) = A \cos(2\pi f (t - \tau))$$
has initial phase equal to τ units of time, peak amplitude equal to A, frequency equal to f cycles per unit of time, and length of cycle equal to 1/f units of time. The three properties – amplitude, phase, and frequency (or length of cycle) – fully define a simple cosine wave as a function of time.
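As a minimal sketch of the simple wave A cos(2π f (t – τ)) described above, the following Python function evaluates the wave and checks that it peaks at t = τ and repeats after one cycle of 1/f. The function name simple_wave and the parameter values are assumptions chosen for illustration.

```python
import math

def simple_wave(t, amplitude, frequency, tau):
    """Value at time t of the wave A cos(2*pi*f*(t - tau))."""
    return amplitude * math.cos(2 * math.pi * frequency * (t - tau))

# Illustrative parameters (not from the text): A = 0.6, f = 3, tau = 0.1.
A, f, tau = 0.6, 3.0, 0.1

value_at_start = simple_wave(0.0, A, f, tau)
value_after_cycle = simple_wave(1.0 / f, A, f, tau)  # one cycle later
value_at_peak = simple_wave(tau, A, f, tau)          # the peak, equal to A

print(value_at_start, value_after_cycle, value_at_peak)
```

The first two printed values should match up to rounding, and the third should equal the peak amplitude A.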
It would occasionally be beneficial to consider the initial phase of the wave not in terms of units of time, but as a portion of the length of the wave's cycle. We would then use the following formula
Equation 2.6
$$x(t) = A \cos(2\pi f t - \theta)$$
This formula normalizes the initial phase with respect to the frequency. Since this wave has a cycle of 1/f units of time, when θ = 2π the wave is offset by exactly 1/f, or one cycle, and that is independent of the frequency. Similarly, when θ = π, the wave is offset by 1/(2 f), or one-half of its cycle, independent of the frequency.
The wave
Equation 2.7
$$x(t) = A \cos(2\pi f t - \theta)$$
has initial phase of θ / (2π f) units of time or θ / (2π) of its cycle, peak amplitude equal to A, frequency equal to f cycles per unit of time, and length of the cycle equal to 1/f units of time.
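The two ways of writing the phase are related by θ = 2π f τ. The Python sketch below, with hypothetical function names and illustrative values, converts a phase in units of time to a phase angle and checks that both forms give the same wave, and that θ = π inverts the wave at any frequency.

```python
import math

def phase_angle_from_time(tau, frequency):
    """Convert a phase tau in units of time to the angle theta = 2*pi*f*tau."""
    return 2 * math.pi * frequency * tau

def wave_with_time_phase(t, amplitude, frequency, tau):
    return amplitude * math.cos(2 * math.pi * frequency * (t - tau))

def wave_with_angle_phase(t, amplitude, frequency, theta):
    return amplitude * math.cos(2 * math.pi * frequency * t - theta)

# Illustrative values (assumptions, not from the text).
A, f, tau, t = 1.0, 5.0, 0.03, 0.2
theta = phase_angle_from_time(tau, f)
difference = abs(wave_with_time_phase(t, A, f, tau)
                 - wave_with_angle_phase(t, A, f, theta))

# theta = pi inverts the wave whatever the frequency is.
inversion_errors = [
    abs(wave_with_angle_phase(t, A, freq, math.pi)
        + wave_with_angle_phase(t, A, freq, 0.0))
    for freq in (1.0, 5.0, 440.0)
]

print(difference, inversion_errors)
```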
Frequencies above are measured in cycles per unit of time. Typically, time is measured in seconds. Frequencies then are measured in cycles per second or Hertz (Hz). 1 Hz is 1 cycle per second.
Chapter 3. Simple waves in discrete time
Simple waves, when working in discrete time, are not defined for the whole realm of real numbers, but only at specific points in time.
3.1. Sampling
The process of taking the values of a continuous function at specific points in time is called sampling. If we sample the wave cos(t) at t = 0, 0.1, 0.2, and so on, we would record the values cos(0) = 1, cos(0.1) = 0.995, cos(0.2) = 0.980, etc. These values are shown on figure 6 for a short period of time.
Rather than counting time t, we can count samples, which we will denote with k, k = 0, 1, 2, and so on. If we sample at every 0.1 seconds, the value of cos(t) at sample k and time tk = 0.1 k would be cos(0.1 k).
Figure 6. A sampled simple wave
Sampling of the wave cos(t) once every tenth of a second.
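The sampling described above can be sketched in a few lines of Python. The variable names are illustrative, and only the first five samples are computed.

```python
import math

sample_time = 0.1   # seconds between adjacent samples
sample_count = 5    # number of samples to take (an arbitrary choice)

# Sample k is taken at time t_k = 0.1 k and has the value cos(0.1 k).
samples = [math.cos(sample_time * k) for k in range(sample_count)]

for k, value in enumerate(samples):
    print(k, round(sample_time * k, 1), round(value, 3))
```

The first three values, rounded to three decimals, are the 1, 0.995, and 0.980 quoted in the text.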
We note two important properties of this specific type of sampling. First, the sampling above is done at uniform intervals. That is, the time distance between any two adjacent samples is the same. In the example above, this distance is 0.1 seconds. Second, sampling is recorded with comparable amplitudes. In this example, we record the value of the function as it is at every sample and do not scale or otherwise modify the amplitude at each sample by some different scale. This specific type of sampling is called pulse code modulation or simply PCM.
Suppose that we sample at periods of time equal to T seconds. The quantity T is called the sample time. We would then sample 1/T times during one second. The number of times that we sample during one second (or, in general, one unit of time) is called the sampling rate or the sampling frequency. We will denote the sampling frequency by fs, fs = 1/T. In the example above, the sampling frequency is 10 Hz, as we take ten samples in every second. Finally, the set of values that the amplitude can take is called the sampling resolution. The sampling resolution in the example above is the realm of real numbers.
Fact: The sampling frequency in CD audio is 44,100 Hz or 44.1 kHz. This means that music in CD audio is sampled and recorded digitally with 44,100 samples per second and the sample time is 1/44,100 seconds. Sampled values are recorded using 16-bit signed integers. Since 1 bit is in effect used to represent the sign of the value (positive or negative), the remaining 15 bits are used to represent the amplitude. The amplitude can then take a finite number of values, namely 2¹⁵ = 32,768 non-negative values (0 to 32,767) and as many negative values (-32,768 to -1). CD audio is thus said to have 16-bit sampling resolution. The term bit resolution is also used to mean sampling resolution. Thus, CD audio is said to have a sampling rate of 44.1 kHz and a bit resolution of 16.
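To make the idea of a finite sampling resolution concrete, the Python sketch below samples a cosine wave at 44,100 Hz and stores each value as a 16-bit signed integer. The scaling by 32,767, the rounding, the clamping, and the 441 Hz test frequency are assumptions made for illustration; they are one common convention, not something the text prescribes.

```python
import math

SAMPLE_RATE = 44_100        # CD audio sampling frequency in Hz
MAX_INT16 = 2**15 - 1       # 32767, the largest 16-bit signed value
MIN_INT16 = -(2**15)        # -32768, the smallest 16-bit signed value

def quantize_16_bit(value):
    """Map a real amplitude in [-1, 1] to a 16-bit signed integer.

    Scaling by 32767 and clamping out-of-range input are assumptions.
    """
    scaled = round(value * MAX_INT16)
    return max(MIN_INT16, min(MAX_INT16, scaled))

# 10 milliseconds of a 441 Hz cosine wave: 441 samples, 100 samples per cycle.
frequency = 441.0
pcm = [
    quantize_16_bit(math.cos(2 * math.pi * frequency * k / SAMPLE_RATE))
    for k in range(SAMPLE_RATE // 100)
]

print(len(pcm), pcm[0], pcm[50])
```

Here pcm[0] is the peak of the wave and pcm[50], half a cycle later, is its trough.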
3.2. Discrete simple waves
Suppose that we sample the simple wave A cos(2π f (t – τ)) with the sampling frequency fs. The sample time is T = 1/fs. The times at which a sample is taken are t = 0, 1/fs, 2/fs, 3/fs, and so on. If these samples are numbered as above with k = 0, 1, 2, …, then the value of the wave at each sample would be
Equation 3.1
$$x(k) = A \cos\left(2\pi f \left(\frac{k}{f_s} - \tau\right)\right)$$
In a period of time equal to τ seconds, there would be m ≈ τ / T = τ fs samples taken and so we can, with some approximation, rewrite the above formula as follows.
Given the sampling frequency fs, the wave
Equation 3.2
$$x(k) = A \cos\left(2\pi \frac{f}{f_s} (k - m)\right)$$
has initial phase of m samples or m / fs units of time, peak amplitude of A, and frequency of f. Its cycle is 1/f units of time or fs / f samples.
The quantity f / fs is known as the normalized frequency of the sampled wave. If, for example, the sampling frequency is 2000 Hz (samples per second) and the frequency of the wave is 500 Hz (cycles per second), then the normalized frequency is ¼. We can say that the frequency of the wave is 1/4th of the sampling frequency. Most data manipulation in DSP uses not the actual frequency of the wave, but the normalized frequency of the wave. This said, for clarity and to the extent possible, we will continue to write f / fs and will not rely on the normalized frequency for computations.
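The discrete simple wave and its normalized frequency can be sketched as follows, using the 500 Hz wave and 2000 Hz sampling frequency from the example above. The function name discrete_wave and the phase of 1.5 milliseconds are illustrative assumptions.

```python
import math

def discrete_wave(k, amplitude, f, fs, m):
    """Value at sample k of A cos(2*pi*(f/fs)*(k - m)), with a phase of m samples."""
    return amplitude * math.cos(2 * math.pi * (f / fs) * (k - m))

fs = 2000.0   # sampling frequency in Hz
f = 500.0     # frequency of the wave in Hz

normalized_frequency = f / fs   # one quarter of the sampling frequency
samples_per_cycle = fs / f      # length of the cycle in samples

tau = 0.0015                    # phase in seconds (an assumed value)
m = round(tau * fs)             # phase in samples, m is approximately tau * fs

values = [discrete_wave(k, 1.0, f, fs, m) for k in range(8)]
print(normalized_frequency, samples_per_cycle, m)
print([round(v, 3) for v in values])
```

The wave peaks at sample k = m and again one cycle of fs / f samples later.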
3.3. The Nyquist-Shannon sampling theorem
The Nyquist-Shannon sampling theorem states that the frequency content of a signal is fully represented by sampling at a certain frequency if the signal does not contain frequencies higher than one-half of the sampling rate.
The Nyquist-Shannon sampling theorem, sometimes described as the Shannon-Kotelnikov theorem or the Nyquist-Kotelnikov theorem, is fundamental to any digital sampling of a signal.
The theorem has many variations and the names Nyquist, Shannon, and Kotelnikov are used in various combinations. The following, for example, is the Nyquist-Kotelnikov theorem: "Every signal, which can be integrated in time and has a finite frequency spectrum, can be sampled at intervals of time that are smaller or equal to 1 / (2 fs), where fs is the maximum of the frequency spectrum."
All versions of the theorem say one and the same thing, namely that if we are to properly sample some frequency, we must sample at a frequency which is at least twice that frequency. To represent the frequency 1000 Hz properly, we must use at least 2000 samples per second. If we sample with 2000 samples per second, we can catch only frequencies up to 1000 Hz.
Fact: It is generally accepted that the human ear can perceive frequencies between 20 Hz and 22 kHz. Thus, a frequency of at least 2 * 22 kHz = 44 kHz is needed to properly represent the frequency spectrum that is humanly audible. This is the reason CD audio uses 44.1 kHz to sample audio data.
A good example of how the Nyquist-Shannon sampling theorem works is to consider sampling at the sampling frequency fs of a simple wave with zero phase and frequency fs. At every sample k, the value of this simple wave would be
Equation 3.3
$$x(k) = \cos\left(2\pi f_s \frac{k}{f_s}\right) = \cos(2\pi k) = 1$$
This simple wave cannot be sampled properly with the given sampling frequency. Consider also the example in figure 7. The two simple waves with frequencies 1500 Hz and 500 Hz and with different phases are sampled with the frequency fs = 2000 Hz. These two simple waves produce the same value at every sample. Even if these samples do represent the frequency 1500 Hz, any equipment or software would interpret them as the frequency 500 Hz. Such confusion of higher frequencies with lower frequencies due to low sampling rates is called aliasing.
Figure 7. An example of aliasing
Even though this is the sampling of the wave at 1500 Hz (dashed) with the sampling frequency of 2000 Hz, the sampled values will be interpreted as the wave at 500 Hz (solid). The confusion of high frequencies for low ones due to low sampling rates is called aliasing.
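Aliasing is easy to reproduce numerically. The Python sketch below samples cosine waves at 1500 Hz and 500 Hz with a sampling frequency of 2000 Hz and compares the samples. For simplicity it assumes zero initial phase for both waves, whereas the figure uses waves with different phases. It also samples a wave whose frequency equals the sampling frequency.

```python
import math

def sample_cosine(f, fs, count):
    """First `count` samples of cos(2*pi*f*t) taken at the sampling frequency fs."""
    return [math.cos(2 * math.pi * f * k / fs) for k in range(count)]

fs = 2000.0
high = sample_cosine(1500.0, fs, 16)   # above one-half of the sampling rate
low = sample_cosine(500.0, fs, 16)     # below one-half of the sampling rate

# The largest difference between corresponding samples of the two waves.
max_difference = max(abs(h - l) for h, l in zip(high, low))

# A wave at the sampling frequency itself yields the same value at every sample.
at_sampling_rate = sample_cosine(fs, fs, 16)

print(max_difference)
print([round(v, 6) for v in at_sampling_rate])
```

The difference is zero up to rounding, so the two sets of samples cannot be told apart.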
Since at any frequency f we have
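Equation 3.4
$$\cos\left(2\pi (f_s - f) \frac{k}{f_s}\right) = \cos\left(2\pi k - 2\pi f \frac{k}{f_s}\right) = \cos\left(2\pi f \frac{k}{f_s}\right)$$
aliasing occurs when the frequency fs – f is confused for the frequency f or vice versa.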
Chapter 4. Complex signals and simple DSP operations
Practical signals likely consist not of one, but of many simple waves. In fact, a signal that consists of a single simple wave or even resembles a single simple wave is probably artificially created. Figure 8 shows the amplitude of the many simple waves that are contained within a one-second recording of an acoustic guitar. Note just how many frequencies are contained in this actual example signal.
Figure 8. Frequencies in an acoustic guitar recording
This graph shows the amplitude of frequencies contained in a one-second recording of an acoustic guitar solo (some information is lost when editing into black and white). The vertical axis shows magnitudes of frequencies with respect to the maximum magnitude allowed in the recording – from 0 dB (top) to -20 dB (bottom). The horizontal axis spans the frequency spectrum between 1 Hz and 1000 Hz. The picture was produced with Orinj – a multitrack recording software (www.recordingblogs.com).
When a complex signal consists of many simple waves, it is difficult, if not impossible, to decipher what waves are present in the signal by simply looking at a plot of it over time. Even when the signal consists of three or four simple waves, which is still rare, we still may not be able to know the precise parameters of such waves.
Figure 9. A sampled simple wave
This complex signal consists of three different frequencies. We can tell that the signal contains at least a higher frequency with smaller amplitude and some lower frequencies. It is impossible to tell, however, that there are exactly three frequencies and that they are exactly the simple waves cos(t), cos(0.5 t), and 0.4 cos(2π t).
Fact: Sound is usually very complex. Even when an instrument plays the frequency of a single note – the fundamental frequency – that frequency will be surrounded with multiple other frequencies, usually at lower amplitude, with frequencies higher than the fundamental frequency (overtones or partial waves) or lower than the fundamental frequency (undertones or also overtones). Some of those fit nicely with the fundamental frequency, as they are its integer multiples (harmonics). Others do not (inharmonic overtones or partial harmonics). Some instruments produce the same frequency in addition to the main frequency, independently of what note is played (formants). Flutes, for example, have formants around 8 kHz. These additional frequencies define the timbre of a sound. When instruments have different timbre, they sound different, even when they play the same note.
4.1. Mathematical representation of the complex signal
In theory, we represent a complex signal as the combination of multiple simple waves. A continuous complex signal x(t) can be written as follows.
Equation 4.1
$$x(t) = \sum_{n=1}^{N} A_n \cos(2\pi f_n (t - \tau_n))$$
An, fn, and τn are the peak amplitude, frequency, and initial phase of each of the N simple waves in the signal. Representing complex signals as collections of simple waves is an assumption – a heavy one. Whether complex signals in fact resemble collections of simple waves is discussed in chapter 5.
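A complex signal built as a sum of simple waves can be sketched in Python as follows. The example uses the three waves named in the caption of figure 9, cos(t), cos(0.5 t), and 0.4 cos(2π t), written as (amplitude, frequency, initial phase) triples. The function name complex_signal and the tuple layout are illustrative assumptions.

```python
import math

def complex_signal(t, components):
    """Sum of simple waves A cos(2*pi*f*(t - tau)) over (A, f, tau) triples."""
    return sum(
        A * math.cos(2 * math.pi * f * (t - tau))
        for A, f, tau in components
    )

# cos(t) has frequency 1/(2*pi), cos(0.5 t) has 1/(4*pi), 0.4 cos(2*pi*t) has 1.
components = [
    (1.0, 1.0 / (2 * math.pi), 0.0),
    (1.0, 0.5 / (2 * math.pi), 0.0),
    (0.4, 1.0, 0.0),
]

value_at_zero = complex_signal(0.0, components)   # all three waves at their peaks
value_at_one = complex_signal(1.0, components)
direct_at_one = math.cos(1.0) + math.cos(0.5) + 0.4 * math.cos(2 * math.pi)

print(value_at_zero, value_at_one, direct_at_one)
```

Given only the summed values, nothing in the output reveals how many components there are or what their parameters were, which is the difficulty the text describes.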
A complex signal x(t), sampled with the sampling frequency fs, would be
Equation 4.2
$$x(k) = \sum_{n=1}^{N} A_n \cos\left(2\pi \frac{f_n}{f_s} (k - m_n)\right)$$
When processing digital signals, we will work with signals for which we do not know the precise values for N, An, fn, and mn