
Basics of acoustics 1
Recording sound
sound | recording sound | oscillogram | frequency | amplitude | energy | phase | Praat

microphone | analog recording | digital recording | exercises | recording sound on your computer

binary number system | sampling rate | quantization | DAT recording | audio formats

Quantization
We have seen that by using an appropriate sampling rate during digitization, the frequency of the
signal can be stored and reproduced. In order to reproduce the amplitude of the signal, a second
criterion is involved in the digitization process: amplitude resolution, or quantization. This
parameter refers to the number of separate amplitude steps by means of which the amplitude of
the signal can be described (and thus reproduced with more or less accuracy). This number of
amplitude steps is indicated in terms of bits: 2 bits correspond to 4 amplitude steps (2^2), 4
bits to 16 amplitude steps (2^4), 8 bits to 256 amplitude steps (2^8), and so on.

The recorded amplitude values are discrete and approximated (as a computer has to handle them),
and because of the "jump" from one discrete amplitude step to the next, a certain degree of noise
(quantization noise) arises in the digital signal. The more bits of amplitude resolution, the
smaller the quantization noise: each additional bit improves the quality, as the signal to noise
ratio gets higher the more amplitude steps there are. For example, with an eight bit sample we get
256 amplitude steps, with a signal to noise ratio of 256:1. With a 16 bit sample we get 65536
amplitude steps, with a much higher signal to noise ratio.

For any number of steps, half of the steps describe positive displacements and half negative
ones, as amplitude is measured from the zero reference value (which divides the pattern of
oscillation of the particles in two). So, for 8 bits (256 amplitude steps), 128 amplitude values are
available on either side of the zero line.

Take a look at figure 1.4: in the upper picture, a 2 bit amplitude resolution is applied (4 steps, 2
amplitude values). The smaller the amplitude resolution, the higher the noise that arises in the
digitized signal. In the lower picture a 4 bit amplitude resolution is applied (16 steps, 8
amplitude values). A better signal to noise ratio is achieved, compared to the 2 bit amplitude
resolution.

Figure 1.4
Oscillogram of a signal (sine). Two different amplitude resolutions (2 bits and 4 bits) are applied
to the same signal.

In phonetics, the most common amplitude resolution is 16 bits (2^16 = 65536 amplitude steps).
65536 amplitude steps correspond to 32768 amplitude values, which reasonably covers the
amplitude range of human hearing:
20 * log (amplitude value / reference value) = 20 * log (32768 / 1) = 20 * 4.515 = 90.3 dB
Praat, too, saves sound files with a 16 bit resolution per sample per channel (it can also play
sound files recorded with a higher number of bits per sample).
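
To make the relation between bits, steps and quantization noise concrete, here is a small sketch (assuming NumPy is available); the mid-rise quantizer and the 440 Hz test tone are illustrative choices, not the procedure any particular recorder uses:

import numpy as np

fs = 44100                                   # sampling rate (Hz)
t = np.arange(0, 0.05, 1 / fs)               # 50 ms of time axis
signal = np.sin(2 * np.pi * 440 * t)         # sine between -1 and +1

for bits in (2, 4, 8, 16):
    steps = 2 ** bits                        # number of amplitude steps
    half = steps / 2
    # mid-rise quantizer: map [-1, +1] onto 2**bits integer levels and back
    levels = np.clip(np.floor(signal * half), -half, half - 1)
    quantized = (levels + 0.5) / half
    noise = signal - quantized               # quantization noise
    snr_db = 10 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))
    print(f"{bits} bits: {steps} steps, measured SNR = {snr_db:.1f} dB, "
          f"20*log10(2^{bits - 1}) = {20 * np.log10(half):.1f} dB")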

DAT recording
DAT (Digital Audio Tape) recording is a technique which, at present, provides the best quality
digital signals. The input signal is directly sampled by the DAT recorder and converted into
digital values that are written on a tape. As you can see in figure 1.5, the DAT tape looks like a
compact audio cassette, but it is smaller. DAT uses a sampling frequency of 48 kHz and an
amplitude resolution of 16 bits, without audio data compression (see audio formats).
For more information see the website of the [INSERT LINK WEBSITE LABORATORIUM].

Figure 1.5
Picture of a DAT tape

Audio formats
Audio samples can be digitally stored and saved in a computer file in different audio file
formats. As we have seen in digital recording, the signal is digitized according to the parameters
sampling rate (number of samples per time unit) and resolution (number of bits per sample). The
data which is obtained can then be compressed (or remain uncompressed) in order to regulate its
size. Compressed audio formats are also called lossy, as some information about the original
signal gets lost in the compression. Conversely, uncompressed audio formats are called lossless,
as no loss of information occurs.

Some of the most commonly used audio file formats are:


- .wav Microsoft standard format, uncompressed (lossless)
- .aiff Apple format, uncompressed (lossless)
- .mp3 compressed format (lossy)

Lossless and lossy formats meet different application requirements. A compressed format is
easier for a computer to handle, but it is not suited for all applications. For instance, the mp3
audio format is the main format for music recording: it allows the sound signal to be reproduced
in a way that the human ear does not perceive as different from the original signal, but it is not
suited for phonetic analysis (as some of the original information is approximated or is no longer
reliably present at all). In this sense, compression techniques take advantage of the
idiosyncrasies of human perception, by removing information that would not be perceived
anyway.

Below we will see how to play (and how to record) audio files on your computer using Praat.
However, several other programs can be used to play back audio and video files and pictures,
such as Windows Media Player (with Windows, go to Start > Programs > Accessories >
Entertainment (or Multimedia) > Windows Media Player), WinAmp, or iTunes (Apple).

Basics of acoustics 1
Questions and exercises A.1:

A.1.1)
What is a sound wave, physically speaking?
Answer:
A sound wave is a cycle of pressure fluctuations (compression, rarefaction) brought about by an
oscillating (sound) source. The extent to which the particles of the acoustic medium involved in
the pressure fluctuation deviate from their rest position is measured in Pascal.

A.1.2)
Explain why a sound wave loses energy the further it travels from the oscillation source.

Answer:
The further a sound wave moves away from the source (note: the sound wave travels, but the
particles themselves do not move along with it), the more air (medium) particles are involved in
the wave. The initial amount of energy (imparted by the source oscillation) is spread over a
larger surface and is involved in the displacement of more particles. As a consequence, the
overall amount of energy remains the same, but the effect (in terms of displacement) of this
energy on a single medium particle or on a single portion of the sound wave is smaller. Thus, the
sound wave slowly fades as it moves away from the source.

A.1.3)
What parameters influence the speed of a sound wave?
Answer:
The propagation speed of a sound wave depends on the acoustic medium: different media have a
different characteristic sound propagation speed. In gases (like air) the sound speed is lower than
in liquids or solids, as the particles are less close to one another (the medium is less dense).
Moreover, factors like temperature and pressure influence the density of the acoustic medium
and thus cause slight changes within the range of propagation speeds peculiar to that medium; in
air, for instance, the speed of sound rises as the temperature rises. For gases like air, sound speed
is also determined by their chemical composition: a higher degree of humidity in the air
corresponds to a bigger concentration of water, which has a higher sound speed and thus raises
the sound propagation speed.

Questions and exercises A.2:

A.2.1)
What is an alias frequency? How does it arise in a digitized signal?
Answer:
An alias frequency is a frequency that arises in the digitized signal even though it is not present
in the original signal. It is due to a wrong (too low) sampling frequency. The sampling frequency
always has to be at least twice the highest frequency component of the original signal.

A.2.2)
What is the Nyquist frequency?
Answer:
The Nyquist frequency is half the sampling rate. In order to prevent frequency components that
are not present in the original signal from arising in the digitized signal, besides choosing an
appropriate sampling rate (at least twice the highest frequency component in the original signal),
the original signal is filtered beforehand in order to get rid of the frequencies that are higher
than the Nyquist frequency (half the sampling rate).

A.2.3)
A 2500 Hz frequency signal has to be digitized. What is the minimum sampling frequency
required to avoid alias frequencies? And the amplitude resolution?
Answer:
The sampling frequency must be at least twice the frequency of the signal, thus at least 5000 Hz.
The amplitude resolution depends neither on the frequency of the signal nor on its amplitude,
but rather on the purposes of the digitization. We have seen that for speech an amplitude
resolution of 16 bits is used.

A.2.4)
What is quantization noise? How can it be avoided?
Answer:
Quantization noise is the noise that arises from quantizing the amplitude of a signal during
digitization. It is a consequence of the jump from one discrete amplitude level to the next. The
relative loudness of the quantization noise can be reduced (that is, the signal to noise ratio can be
made higher) by using more bits for the quantization.

A.2.5)
Why is it not advisable to carry out detailed phonetic analyses on speech saved as an mp3 file?
Answer:
Mp3 files are lossy files: the signal is compressed in order to reduce the amount of data. This
data compression does not affect the perceptual impression, but it makes the digitized signal
unsuited for analysis.

Questions and exercises A.3:

A.3.1)
What feature of the sound wave do we consider when talking about frequency?
Calculate the frequency of a sound signal a with a 0.005 s period and that of a sound signal b
with a period of 0.0025 s.
What is the relation between frequency and period?
Answer:
The frequency of a sound signal is the number of repetitive patterns of pressure oscillations
(periods) in a sound wave per second. The standard measure unit of frequency is Hz (cycles per
second).
Frequency affects our perception of tonal height of a sound.
f of signal a: 1 / 0.005 = 200 Hz
f of signal b: 1 / 0.0025 = 400 Hz
Frequency and period are inversely proportional: the shorter the period, the higher the frequency.

A.3.2)
Calculate the frequency and the period of, respectively, a sound signal a with a 0.835 m wave
length and a sound signal b with a 1.67 m wave length (medium is air at 21ºC).
What is the relation between frequency and wave length? And between period and wave length?
What is the frequency of the same signal travelling through a different medium (let's consider
water, propagation speed 972 m/s)?
Answer:
The speed of sound propagation through air at 21ºC is 334 m/s,
f=V/λ
T=λ/V
f of signal a = 334 / 0.835 = 400 Hz
T of signal a = 0.835 / 334 = 0.0025 s
f of signal b = 334 / 1.67 = 200 Hz
T of signal b = 1.67 / 334 = 0.005 s
Frequency and wave length are inversely proportional, while period and wave length are directly
proportional. Pay attention not to mix up period and wave length: period is a time measure (s),
while wave length is a spatial measure (m)!
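
The relations above can be checked with a few lines of plain Python, using the 334 m/s value assumed in this answer:

V = 334.0                  # propagation speed in air (m/s), as assumed above

for name, wavelength in (("a", 0.835), ("b", 1.67)):
    f = V / wavelength     # frequency (Hz):  f = V / lambda
    T = wavelength / V     # period (s):      T = lambda / V
    print(f"signal {name}: f = {f:.0f} Hz, T = {T:.4f} s, f * T = {f * T:.1f}")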

A.3.3)
If a person in a stadium participates in a stadium wave by standing up and raising his hands
every 13 seconds, what will be the frequency of the wave (given that the pattern is very regular
and everyone sits for the same amount of time)? If the wave travels 4 metres per second, how far
apart are the people who are standing up at the same moment?
Answer:
T = 13 s
V = 4 m/s
f = 1 / T = 1 / 13 = 0,077 cps
λ = V * T = 4 * 13 = 52 m
λ = V / f = 4 / 0,0769 = 52 m

A.3.4)
What is the frequency of the sine signal in figure 1.6?
Answer:
The period (T) of the signal is 0.00227739 s (bottom, left).
f = 1 / T = 1 / 0.00227739 = 439.1 Hz

A.3.5)
Draw the oscillogram of a sine signal with a 100 Hz frequency. Add the time indications on the x
axis, at regular intervals.
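
If you want to check your drawing against a computer-generated one, a possible sketch (assuming NumPy and matplotlib are installed) is:

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 0.03, 1000)           # 30 ms of time axis (three periods)
y = np.sin(2 * np.pi * 100 * t)          # 100 Hz sine, relative amplitude 1

plt.plot(t, y)
plt.xticks(np.arange(0, 0.031, 0.005))   # time marks every 5 ms on the x axis
plt.xlabel("time (s)")
plt.ylabel("relative amplitude")
plt.title("Sine, f = 100 Hz (T = 0.01 s)")
plt.show()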

Questions and exercises A.4:


A.4.1)
What feature of the sound wave does amplitude assess? Which graphical representation is suited
for representing the amplitude of a sound wave?
Answer:
Amplitude assesses the degree of displacement of the medium particles in the sound wave. Their
displacement, which is due to changes in pressure, is measured from the zero reference value,
that is, the rest position of the particles.
Being in fact a pressure value, amplitude is measured in Pascal (Pa). The best way to visualize
amplitude is an oscillogram, which displays the amplitude variations (on the y axis) over time
(on the x axis).

A.4.2)
What is the root mean square (RMS) amplitude? Why is this measure more suited in order to get
the average amplitude of a sound signal, rather than simply calculating the average of the
pressure displacements (from the reference, namely the rest position of medium particles)?
Answer:
We get the RMS amplitude by calculating the square root of the average (sum of all the values /
number of values summed) of the squared pressure displacements.
Squaring the displacement values before summing them prevents the sum from equalling zero
(as opposite displacements would cancel each other out). Moreover, let us consider two signals a
and b having the same (maximum) amplitude, a reaching this amplitude just once in the signal,
while b reaches it many times (there are more maximum amplitude displacements in b than in a).
The two signals clearly differ from each other, but this difference would not be expressed by the
mere average of the displacements. Conversely, the RMS amplitude takes this aspect into
account (a sound signal characterised by a higher number of large displacements has a higher
RMS value than a sound signal with the same maximum amplitude but a lower number of big
displacements).
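
The point about the two signals can be illustrated with a small sketch (NumPy assumed; the two short sequences are made-up examples with the same maximum amplitude):

import numpy as np

def rms(x):
    # root mean square: square root of the mean of the squared values
    return np.sqrt(np.mean(np.asarray(x, dtype=float) ** 2))

a = np.array([0.0, 0.1, -0.1, 1.0, 0.1, -0.1, 0.0, -0.1])   # reaches 1.0 only once
b = np.array([0.0, 1.0, -1.0, 1.0, -1.0, 1.0, 0.0, -1.0])   # reaches +/-1.0 often

print("plain average:", a.mean(), b.mean())   # misleading: b even averages to zero
print("RMS amplitude:", rms(a), rms(b))       # b is clearly the stronger signal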

A.4.3)
If a given sound signal has, at a distance r from the sound source, a given intensity x, how many
times will this intensity be lower at a 2r distance from the sound source? Why?
Answer:
The intensity at a distance 2r from the sound source is 1/4 of the intensity at distance r, because
the same amount of energy is spread over a surface which is 4 times bigger than the surface at
distance r.

A.4.4)
What is the relation between amplitude and intensity? What is the difference, respectively,
between amplitude and amplitude level, and intensity and intensity level?
Answer:
We have already seen that amplitude is a value which refers to pressure; it is measured in Pascal.
It is the maximum displacement of the medium particles from their rest position (in the sound
wave). The overall amplitude of a sound is calculated as the square root of the average of the
squared displacements (RMS).
When talking about amplitude level we refer to a relative value, measured in dB, which results
from a logarithmic function assessing the ratio between the amplitude (RMS) of a given signal
and that of a reference signal. This is more suitable, as we are dealing with human perception,
which is not sensitive to absolute values. The standard unit of measurement of the amplitude
level is the dB SPL (sound pressure level).
Intensity refers to the degree of energy which is present in a given sound wave. It is directly
proportional to amplitude or, more precisely, to the squared amplitude. Intensity can also be
expressed in dB; the intensity level scale is called dB SIL (sound intensity level).
Doubling the amplitude of a signal leads to an increase of 6 dB in the amplitude level, while
doubling the intensity of a signal leads to an increase of 3 dB in the intensity level (doubling the
amplitude actually quadruples the intensity, since intensity is directly proportional to the squared
amplitude).

A.4.5)
What are the advantages of the dB scale? There are different dB scales, depending on the
reference quantities considered. What dB scale do we use to assess loudness? Could we use this
measure also for a different kind of physical quantity (different from pressure)?
Answer:
The dB scale measures ratios, namely relative values (with respect to a reference value) rather
than absolute ones, which is closer to our perception. Moreover, by using the dB scale, a very
broad range of values is converted into a relatively small range: the difference between a 20 Pa
amplitude (human threshold of pain) and a 0,00002 Pa amplitude (human threshold of hearing)
is expressed by 120 dB SPL (sound pressure level); since intensity is proportional to the squared
amplitude, the same range corresponds to 120 dB SIL (sound intensity level) as well. Finally, it
is important to recall that the dB scale is not only used for amplitude or intensity. There are
different kinds of dB scales, as dB simply expresses the ratio between two values. So, it is also
possible to measure electric voltage or electric current in dB.

A.4.6)
During a conversation, the intensity level of the signal is about 60 dB SIL, while during a
concert it can be up to 100 dB SIL. How many times is the concert louder than the
conversation?
Answer:
100 dB - 60 dB = 40 dB = 4 Bel
Remember the formula to calculate the intensity level:
Li = 10 * log ( I / Io )
where the logarithm is multiplied by 10 in order to get from Bel to dB.
A 4 Bel difference is equivalent to an intensity increase of 10^4.
The sound of the concert is thus 10^4 (10 * 10 * 10 * 10 = 10000) times more intense than that
of a normal conversation.

A.4.7)
Did you pay attention to the dB scale indicated in exercise A.4.6? The indications were given in
terms of intensity level (SIL). Let us now turn to amplitude level (SPL).
If an electric signal with an amplitude of 60 mV is increased by 20 dB SPL, what will its new
amplitude be?
Answer:
The formula to calculate the amplitude level is
Lp = 20 * log (p / po)
therefore we have to divide the level difference by 20:
20 dB / 20 (as in the formula the logarithm is multiplied by 20) = 1
10^1 = 10 times
60 * 10 = 600 mV
Increasing the amplitude level of the original signal by 20 dB SPL corresponds to increasing its
amplitude by a factor of 10, namely from 60 mV (original amplitude) to 600 mV.
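
The conversions used in A.4.6 and A.4.7 can be wrapped in two small helper functions (plain Python; the function names are just for illustration):

def intensity_ratio(delta_db):
    # intensity ratio corresponding to a level difference in dB SIL
    return 10 ** (delta_db / 10)

def amplitude_ratio(delta_db):
    # amplitude (pressure, voltage) ratio corresponding to a difference in dB SPL
    return 10 ** (delta_db / 20)

print(intensity_ratio(40))        # 10000.0 -> the concert is 10^4 times more intense
print(amplitude_ratio(20) * 60)   # 600.0   -> the 60 mV signal becomes 600 mV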

A.4.8)
A signal a has an amplitude level of -6 dB (SPL). By how many times is it necessary to increase
the amplitude of signal a if we want to transform a into a signal b having a sound pressure level
of 0 dB? You should already have a clue of what a 6 dB SPL change means in terms of
amplitude. However, try to calculate it by applying the formula we have already used in the
previous exercise.
Answer:
As a first thing, 0 dB means that the amplitude level of the considered signal equals that of the
reference signal.
Doubling the amplitude of a signal determines a 6 dB increase in the sound pressure level (SPL).
Therefore we have to double the amplitude of a.
The formula to calculate the amplitude level is
Lp = 20 * log (p / po)
6 dB / 20 (as in the formula the logarithm is multiplied by 20) = 0.3
10^0.3 = 1.99 times, as predicted.
Notice that this is consistent with what we already know about intensity, namely that doubling
the intensity of a signal leads to an increase of 3 dB SIL:
Li = 10 * log ( I / Io )
3 dB / 10 = 0.3
10^0.3 = 1.99 times

Questions and exercises A.5:

A.5.1)
Consider two sine waves a and b, whose amplitude oscillation starts with the same pattern (from
the zero reference line to the + 1 relative amplitude point). Their periods are, respectively, 0.01
and 0.03 s. At what position in the period (position in time, s) will each of them be in phase
position φ 90º?
Answer:
Phase position = (Position in period (s) * 360º) / T
therefore
Position in period = (Phase position * T) / 360º
For signal a, the position in period corresponding to 90º is 0.0025 s ((90 * 0.01) / 360).
For signal b, the position in period corresponding to 90º is 0.0075 s ((90 * 0.03) / 360).
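
A tiny check in plain Python (the function name is just for illustration):

def position_in_period(phase_deg, T):
    # time (s) after the start of the period at which a given phase is reached
    return (phase_deg * T) / 360.0

print(position_in_period(90, 0.01))   # signal a: 0.0025 s
print(position_in_period(90, 0.03))   # signal b: 0.0075 s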

Questions and exercises A.7:

A.7.3)
a)
- Make the oscillogram of the sound file sine_a.wav, zoom into it (about in the middle) and
select half a period (from maximum to minimum)
- cut this fragment and note the position of the cursor at the time where the half period was cut
- save this signal with a new name
b)
- Make the oscillogram of the sound file sine_a.wav, zoom into it (about in the middle) and
select half a period (from a positive zero crossing to a negative zero crossing)
- cut this fragment and note the position of the cursor at the time where the half period was cut
- save this signal with a new name
c)
- Make the oscillogram of the sound file sine_a.wav, zoom into it (about in the middle) and
select a whole period (from a positive zero crossing to the next positive zero crossing)
- cut this fragment and note the position of the cursor at the time where the period was cut
- save this signal with a new name
d)
- zoom into the oscillogram of these signals so that you can see the point where the segment was
cut
- take a look at both the oscillogram and the spectrogram (make sure the spectrogram is
displayed in the editor: View > Show analyses) and listen to the signals. What do you notice?
Hint:
By listening to the new signals (a, b, c) and by looking at their spectrograms you should come to
the conclusion that signal c is the only outcome of a proper modification of the signal. While
listening to a and b you can hear a click, which is a consequence of the fact that either no
complete period has been cut (b) or the signal was not cut at a zero crossing (a). The cut has to
be made at a zero crossing, and a whole period has to be cut.
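
Outside Praat, the same idea can be sketched in a few lines of Python (NumPy assumed; the 440 Hz sine below is a stand-in for sine_a.wav, whose actual frequency may differ):

import numpy as np

fs = 44100
t = np.arange(0, 0.02, 1 / fs)
x = np.sin(2 * np.pi * 440 * t)                  # stand-in for sine_a.wav

# indices where the signal goes from negative (or zero) to positive
crossings = np.where((x[:-1] <= 0) & (x[1:] > 0))[0] + 1

start, end = crossings[2], crossings[3]          # one whole period, mid-signal
one_period = x[start:end]
print("period of about", (end - start) / fs, "s;",
      "edge samples:", one_period[0], one_period[-1])   # both close to zero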

A.7.5)
Make a sine signal (0.025 s) having a frequency of 200 Hz. Draw its oscillogram in the picture
window, and possibly print it. What would happen if this signal was digitized with a 250 Hz
sampling frequency? What would the alias frequency be? What should the (minimal) sampling
frequency be, in order to prevent alias frequencies from arising in the digitized signal?
Hint:
By digitizing the 200 Hz signal with a 250 Hz sampling rate, one sample every 0.004 s (1 s /
250) would be stored from the original signal (whose period is 1 / 200 = 0.005 s). The stored
samples would then be indistinguishable from those of a 50 Hz sine (250 - 200 = 50 Hz), which
is the alias frequency. In order to properly digitize the signal (avoiding alias frequencies), a
sampling rate of at least 400 Hz (twice its frequency) should be used (1 / 400 = 0.0025 s).
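
A small sketch (NumPy assumed) of what the hint describes:

import numpy as np

f, fs_bad = 200.0, 250.0
n = np.arange(8)                        # a few sample indices
t = n / fs_bad                          # sample times (every 0.004 s)

original = np.sin(2 * np.pi * f * t)
alias = np.sin(2 * np.pi * (f - fs_bad) * t)   # -50 Hz: a 50 Hz sine, phase inverted

print(np.allclose(original, alias))     # True: the stored samples coincide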

Recording sound on your computer


Oscillogram: a way of visualizing sound waves
The oscillogram is a graphical representation of the sound wave which has been recorded by the
microphone. The x-axis represents the time dimension, while the y-axis represents the sound
wave pressure. The air pressure fluctuations (compression and rarefaction cycles) are visualized
by a sinusoidal curve. Conventionally, the maximum pressure (compression) is identified with
the top of the sine (relative amplitude +1) and the minimum pressure (rarefaction) with its
bottom (relative amplitude -1); it could also have been the other way round.

We would get a similar representation by fastening a pencil on the very center of the microphone
membrane (with the longitudinal axis of the pencil parallel to the membrane) and having it trace
a line (on a stripe of paper which moves horizontally over time) while the membrane goes up and
down because of the compression and rarefaction pattern of the air molecules.

Figure 1.6 shows the oscillogram of a (periodic) sine tone.



Figure 1.6
Oscillogram of a periodic sound signal whose pressure fluctuations resemble a sine tone. The x
axis represents time, the y axis stands for amplitude. The maximum deviation from the zero
reference line is the amplitude of the signal

A sine tone can be synthesized electronically, but we will not encounter it in nature (however, a
vibrating tuning fork produces a sound which approximates a sine tone). The oscillograms of
sounds which we come across in our daily life look very different. Figure 1.7 shows the
oscillogram of a speech segment.

Figure 1.7
Oscillogram of a speech segment (vowel /a/). You can hear the file (and download it to your
computer as aa.wav)
at http://www.let.uu.nl/~Hugo.Quene/personal/onderwijs/sprekenenverstaan/practicum20070725.zip
Frequency
The frequency (f) of a sound wave, which is perceived as its tonal height, or pitch, is the
number of regular pressure fluctuations which occur within a given time interval. It is measured
in Hertz (Hz) as cycles per second (cps), every cycle corresponding to the oscillation pattern
between two consecutive maxima (or between two minima, in general we would speak of two
consecutive points having the same phase values).

The duration of a cycle of pressure fluctuations (thus considering the time dimension, in
seconds) is called period (T) of the sound wave. It is inversely proportional to the frequency
value.

Frequency can be calculated by counting how many periods occur per time unit (1 second) or by
dividing the standard time unit (1 s) by the duration of the period (T, which is measured in s):

f = 1/T (with 1 referring to 1 second)

The term fundamental frequency (F0) indicates the lowest frequency of a signal. In the case of a
sine tone (not a complex signal, such as the ones we are dealing with now) just one frequency
component is present, so it is not necessary to indicate it as F0. In complex signals there are
several frequency components. The lowest of these frequency components is called the
fundamental frequency (F0), and it is the greatest common divisor of the frequency components.
For instance, the F0 of a complex signal composed of sine tones of 50, 100 and 150 Hz,
respectively, is 50 Hz.
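
For a harmonic complex like the one in the example, this greatest-common-divisor idea can be written in a couple of lines of Python (a sketch of the arithmetic only; actual pitch detection in Praat works on the waveform, as discussed later):

from math import gcd
from functools import reduce

components_hz = [50, 100, 150]      # sine components of a complex tone
f0 = reduce(gcd, components_hz)     # greatest common divisor of the components
print(f0)                           # 50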

If we draw the oscillogram of a sound wave with a period of 5 ms (0.005 s), we would see that
the same pressure pattern (period) repeats 200 times within 1 second. This means that the sound
has a frequency of 200 Hz (f = 1 / 0.005).

Figure 1.8
Two sound waves whose frequencies are 200 Hz and 400 Hz, respectively

It is also possible to measure the length of a cycle in the medium (considering the space
dimension, in meters). In this case we are dealing with the wave length (λ, in m) of the sound
wave. The wave length thus depends on the sound wave propagation speed (V, in m/s) and on
the frequency.
The speed of sound propagation (V) is entirely dependent on the physical characteristics of the
acoustic medium: for gases like air it depends on parameters such as temperature, humidity
(chemical composition in general) and altitude. As air is a mixed gas, it is quite straightforward
that a different mixture of its elements directly affects the sound propagation speed through it.
Water, for instance, is characterised by a higher sound propagation speed than air; as a
consequence, the higher the concentration of water particles (i.e. the humidity), the higher the
sound speed, and vice versa.

λ=V/f
λ=V*T
or, the other way round:
f=V/λ
T=λ/V

If we go back to the metaphor of the stadium wave, T is the time period between two
consecutive actions of standing up by the same person, f is the number of actions that occur
within a given time unit and λ is the spatial distance between two people who are standing and
raising their arms in the same moment.

Amplitude
The amplitude (A) of a sound wave is the degree of change in air pressure (due to compression
and rarefaction of the air molecules), which is measured in Pascal. In an oscillogram the
amplitude of a sound wave corresponds to the maximum sine displacement (peak deviation
from the zero reference) value, which is the maximum displacement of the particle of the
medium from its equilibrium position. In the stadium wave, the amplitude could be thought of as
the extent to which people raise their hands. The higher the pressure changes caused by the
sound wave, the higher the amplitude, and the louder we perceive the sound. However, there is a
more suitable way of assessing the loudness of a sound wave, which is calculating its intensity.

Root mean square amplitude


How can we calculate the mean amplitude of a sine wave? It would be misleading to make a
simple average over the displacement values at the different time points, as the negative values
(below the zero reference line) would cancel out the positive ones, yielding zero. It is thus
customary to calculate the root mean square (RMS) amplitude of the sine displacement
values, namely the root of the average of the squared displacements:

RMS amplitude = √( (sum of the squared amplitude values) / (number of amplitude values) )

In a sine tone, the RMS of the amplitude corresponds to 0.707 times the maximum amplitude of
the signal.

By squaring the displacements we prevent the sum from being zero; furthermore, it is also a way
to give more "weight" (in the calculation) to large displacements (high amplitude). Doing so, a
sound signal characterised by a higher number of large displacements will have a higher RMS
value than a sound signal having its same amplitude, but with a lower number of big
displacements.
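
The 0.707 factor mentioned above is easy to verify numerically (NumPy assumed):

import numpy as np

t = np.linspace(0, 1, 100000, endpoint=False)
sine = np.sin(2 * np.pi * 5 * t)        # five whole periods, peak amplitude 1
print(np.sqrt(np.mean(sine ** 2)))      # about 0.707, i.e. 1/sqrt(2)
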
Energy, Power and Intensity
When discussing the amplitude of a sound wave we have focused on the maximum displacement
of the single particles of the acoustic medium. But let us consider the sound wave in a broader
perspective, namely as travelling through the medium with a spherical propagation pattern. We
will then want to consider the degree of energy that is imparted by the vibrating sound source
and that is transmitted throughout the medium over time by the air particles bumping into one
another, and the degree of energy which will be available at a given time at a given distance from
the source.

Energy is a property which has to do with matter and with waves. It is defined as the capacity of
a body to perform work, namely to exert a force causing a displacement, and it is measured in
Joule [J]:

Energy [J] = Force [N] * Displacement [m]

Energy is independent from the time dimension, so if we want to take time into account we
should introduce the concept of power, which is energy per time unit and is measured in Watt.

Power [W] = Energy [J] / Time [s]


or
Power [W] = Force [N] * Displacement [m] / Time [s]

As far as the propagation of the sound wave is concerned, power could be thought of as the total
amount of energy contained on the sphere's surface. The total energy within the sphere will
always remain the same; moreover, energy is never wasted, it just turns into different forms of
energy. Nevertheless, the more the sphere expands (the further the wave moves away from the
sound source), the more medium particles are involved in the pressure fluctuations, and the more
the energy is spread over the sphere's surface area as each particle forces its neighbouring
molecules.

We get the intensity of a sound wave by combining the power of the original sound source with
the area of the (imaginary) portion of sphere we are considering (which depends on the distance
of measurement from the sound source).

Intensity [W/m2] = Power [W] / Area [m2]


or
Intensity = Energy [J] / (Time [s] * Area [m2])
or
Intensity = Force [N] * Displacement [m] / (Time [s] * Area [m2])

Intensity spreading from the signal source

The amount of energy per unit area on the expanding sphere's surface decreases according to the
inverse square law (the energy drops off as 1/distance^2): acoustic energy at twice the distance
from the source is spread over four times the area and thus has one-fourth the intensity.
We can get to this if we recall the formula for calculating the surface of a sphere (surface area of
a sphere = 4πr2). Here r indicates the radius of the sphere, which corresponds in our case to the
distance from the source. While the sound wave moves away from the source, its intensity is
spread over a broadening sphere surface. The overall intensity (calculated over the whole surface
of the sphere) remains unchanged, but the intensity in a given portion of the sphere's surface
(let's say, a square centimetre) gets lower and lower the further the wave moves from the source.
The intensity on the surface of the sphere undergoes a dropoff which can be calculated as
Is / 4πr2, where Is is the intensity at the source and r the radius (the distance from the source).
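
A short sketch in plain Python of the same dropoff (the 10 mW source power is an arbitrary example value):

import math

def intensity(power_w, r_m):
    # intensity (W/m^2) at distance r from a point source of the given power
    return power_w / (4 * math.pi * r_m ** 2)

p = 0.01                                  # hypothetical source power: 10 mW
i_r, i_2r = intensity(p, 1.0), intensity(p, 2.0)
print(i_r / i_2r)                         # 4.0: twice the distance, 1/4 the intensity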

Amplitude level and intensity level: decibel scales


While power is measured in Watt, in acoustics intensity is mainly expressed with a different unit
of measure: the decibel (dB). The decibel is a relative unit that reflects our perception of the
loudness of a sound wave, rather than measuring absolute values.
The human ear does not perceive absolute differences, but rather relative ones (as a ratio between
two values).

This is true also for other kinds of human perception of physical quantities: for example, the
difference in weight between a 2 kg object and a 1 kg object is perceived as greater than the
difference between two objects of, let's say, 51 and 50 kg, although in both cases we are dealing
with a 1 kg difference. In the first case, the ratio between the two objects is 2/1 = 2, while in the
second it is 51/50 = 1.02, which is far lower.
As far as loudness is concerned, rather than considering absolute values it is more suitable to
measure the ratio between that value and a reference value: equal ratios between intensity values
are perceived as equal differences in loudness.

The amplitude range of human hearing goes from about 20 micropascal (0,00002 Pa, threshold
of hearing) to about 20 Pascal (threshold of pain).
In the left hand column of the following table a series of amplitude values is listed, from the
lowest audible one at the top to the highest bearable one at the bottom, each ten times the
previous one. The middle column gives the number of powers of ten separating each value from
the threshold of hearing, and the right hand column gives the power of ten appearing in the value
itself (each value being 2 * 10^n Pa).

Amplitude value      Order of magnitude      Power of 10

0,00002 Pa           0                       -5
0,0002 Pa            1                       -4
0,002 Pa             2                       -3
0,02 Pa              3                       -2
0,2 Pa               4                       -1
2 Pa                 5                        0
20 Pa                6                       +1

By considering the powers of 10 to get from a reference amplitude value (for instance the lowest
hearable amplitude) to another given amplitude value we deal with ratios, instead of with
absolute values.

Logarithms are the operations used to express the ratio between two quantities: the logarithm of
a number x is the power to which a base number has to be raised in order to obtain x. The base is
usually 2 or 10, or the constant e (e ≈ 2.718).

For example:

log10(x) = 3
with x = 1000
as
10^3 = 1000

Let's take as an example the loudest amplitude bearable by the human ear, namely 20 Pa, and
calculate its logarithm with respect to the lowest amplitude hearable as a reference value:

log (RMS amplitude / reference amplitude) = log (20 Pa / 0,00002 Pa) = log 1000000 = 6 Bel

The result of this logarithm is 6, and its unit is the Bel (named after Alexander Graham Bell,
inventor of the telephone); it corresponds to the number of times the value in the denominator
has to be multiplied by ten in order to obtain the value in the numerator. This small number
expresses the ratio of the sound with the highest bearable amplitude to the one with the lowest
hearable amplitude, which covers a considerably wide range of amplitudes. In practice the Bel is
divided into ten smaller units, the decibels (dB); moreover, for amplitude (pressure) ratios the
logarithm is multiplied by 20 rather than 10, as shown below.

Let's now consider the logarithm for amplitude (or pressure). The amplitude values are not
absolute pressure values, but rather RMS values:

Lp = 10 * log (p / po)^2 = 20 * log (p / po)

being:
Lp = amplitude level in dB SPL, defined as the base ten logarithm of the ratio of a given RMS
amplitude value (p) to the RMS amplitude of a reference signal (po)
p = given RMS amplitude value [Pa]
po = RMS amplitude of the reference value [Pa]

Going back to the relation between intensity and amplitude: the higher the amplitude of the
sound wave, the greater the rate at which energy is transported, and the higher the intensity.
More precisely, intensity is directly proportional to the squared amplitude of a sound wave (for
instance, if the amplitude doubles, the intensity increases fourfold).

Li = 10 * log ( I / Io )

being:
Li = intensity level in dB SIL (sound intensity level), defined as the base ten logarithm of the
ratio of a given intensity value (I) to the intensity of a reference signal (Io)
I = given intensity value [W/m2]
Io = reference value for intensity [W/m2]

A doubling of the amplitude leads to an increase of the amplitude level of 6 dB, as

20 * log (2 * p / po) - 20 * log (p / po) = 20 * log 2 ≈ 6 dB

while a doubling of the intensity leads to an increase of the intensity level of 3 dB:

10 * log (2 * I / Io) - 10 * log (I / Io) = 10 * log 2 ≈ 3 dB
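
The two factors can be verified in one line each of plain Python:

import math

print(20 * math.log10(2))   # about 6.02 dB: doubling the amplitude
print(10 * math.log10(2))   # about 3.01 dB: doubling the intensity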

We will encounter the 3 dB intensity level indication again while studying filters: as a matter of
fact, a filter's cut-off frequency is defined as the half-power point, which corresponds to -3 dB.

By using decibels, the huge range of values between the threshold of hearing and the threshold
of pain is reduced to a small range of numbers, which makes this scale very handy. Moreover,
the use of decibels makes it possible to compare values which are expressed in different units of
measure, such as electric voltage (Volt) or electric current (Ampere).

There are different dB scales, depending on the reference values that are used. What we have
seen so far are two kinds of dB measurement: one for power and intensity, the dB SIL (sound
intensity level) scale, and one for amplitude (and also voltage), the dB SPL (sound pressure
level) scale. Conventionally, to assess loudness, we use dB SPL.

It is also possible to have negative dB values, which means that a given amplitude is smaller than
the reference amplitude. A 0 dB amplitude level, instead, means that the given amplitude we are
considering equals the reference amplitude.

Phase
Another distinguishing feature of the sound wave is phase. Phase refers to the position of a point
in the sound wave cycle, measured in degrees (from 0º to 360º). If we think of the representation
of a sinusoidal sound wave in an oscillogram, 0º, 180º and 360º correspond to the zero crossings
(zero amplitude), while 90º and 270º correspond to the +1 and -1 relative amplitude values.

Let us take a look at the figure below in order to get a better understanding of what phase is.
We see that the two illustrated sound signals have the same amplitude and the same frequency,
but still differ from each other. Are amplitude and frequency values then not enough to uniquely
identify a sound wave? No, they are not: we also need information about phase. In this case, at
time 0.005 s the phases of the two sound waves are 0º and 90º, respectively.

Figure 1.9
Periodic signals having the same frequency and amplitude, but a different phase

We get the value of the phase position φ (º) by multiplying the time value of the position we are
considering in the period (measured in seconds from a zero displacement point of the wave,
corresponding to 0º phase) by 360º, and dividing the result by the duration of the period:

Phase position, φ (º) = (Position in period (s) * 360º) / T

As the reference value is the respective period duration (T), 2 points which are in the same
relative position in their periods have the same phase value (even though the periods might be
different).

By adding two sound waves, each having a given amplitude, frequency and phase (which is
what happens in real acoustic environments), their frequencies, amplitudes and phases are
combined. For example, by combining two sounds having the same frequency and phase, the
result is a signal which has the same frequency (and phase), but an increased amplitude (as the
signals' amplitudes add up). However, if the two signals have the same frequency but opposite
phase (180º out of phase), then the two signals cancel each other out. This phenomenon is
exploited in technological applications such as noise cancelling headphones, which monitor the
environmental sounds and cancel out (by creating an antiphase signal) whatever is identified as
unwanted noise. This procedure makes it possible to dampen the environmental sounds.
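
The two cases can be verified with a small sketch (NumPy assumed; the 200 Hz frequency is an arbitrary example):

import numpy as np

t = np.linspace(0, 0.01, 1000, endpoint=False)
a = np.sin(2 * np.pi * 200 * t)                       # a 200 Hz sine
in_phase = a + np.sin(2 * np.pi * 200 * t)            # same frequency, same phase
antiphase = a + np.sin(2 * np.pi * 200 * t + np.pi)   # 180 degrees out of phase

print(in_phase.max())            # about 2: the amplitudes add up
print(np.abs(antiphase).max())   # about 0: the signals cancel each other out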

The stadium wave is also characterised by phase, as two persons will be in different positions at
the same moment within the wave, one sitting back down, for example, and the other one just
standing up. Conversely, all the people who are in the same position at the same moment also
have the same phase value. If there were two waves travelling through the stadium, one some
rows behind the other, they might be out of phase with respect to each other (and they might also
have a different frequency and amplitude), but there might also be moments when the two waves
have the same phase, for instance when two persons (one for each wave), one a few rows behind
the other, stand up at the very same moment.

Phase does not actually have a perceptual correlate: while we experience amplitude (as it is
proportional to what we perceive as loudness) and frequency (which perceptually corresponds to
the tone height, or pitch), we are not aware of phase. However, our brain exploits phase
information in order to recognise the location of a sound source (and also for other kinds of
information, such as the meteorological conditions, which in open-field acoustics affect the
phase).

Getting started with Praat


Let us get started with Praat. Instructions on how to download it can be found on the homepage
of this tutorial. Information about the program can be found in the Help menu at the top of the
window. From there it is possible to search directly for a topic (Search Praat manual), to look it
up in an index (Go to manual page) or to follow the tutorial (Praat intro).

Oscillogram
Now we want to visualize the oscillogram of the file "zinleven1.wav" which you saved to your
computer:
- open the file from Read
- select the file and click on Edit
- by default, you should see an image like this one (however, the analysis settings are saved from
session to session)

Figure 1.10
Oscillogram (upper panel) and spectrogram (lower panel) of the sentence "Het leven is mooi als
de zon schijnt", male speaker. Screenshot from Praat. The pitch contour (blue line) presents
octave errors (see below), which can be avoided by adjusting the analysis settings.

You will probably recognize the oscillogram in the upper panel (it may remind you of figure 1.7,
which is just the oscillogram of a shorter speech segment). This is the oscillogram of a speech
sample (thus a complex tone), which differs very much from the oscillogram of a sine tone (see
figure 1.6). Let us first focus on speech oscillograms; later we will see how to synthesize a sine
tone from a formula [INSERT LINK] and display its oscillogram.

By default, in the panel with the oscillogram, you should also see a spectrogram of the sound
signal, at the bottom. As a matter of fact, there are different ways of graphically visualizing
sound. The oscillogram is just one possibility: it takes time (on the x axis) and amplitude (sound
pressure, on the y axis) into account.

In order to focus on the oscillogram, let us remove the spectrogram from this window:
- select View > Show analyses from the menu
- remove the spectrogram by deselecting Show spectrogram in Show analyses
- click on Apply. What do you see now?
- go back to Show analyses and deselect Show formants, click on Apply. What do you see?
- deselect also Show pitch, now you see only the oscillogram.

We will look at the spectrogram and the formants later on. We have, however, just seen what
pitch is, namely what we perceive as tonal height; it is affected by the fundamental frequency
(F0) of a signal, as well as by its formants. In Praat, the term pitch is actually used to indicate F0.
Like the formants, pitch is visualized in the spectrogram panel (rather than in the oscillogram,
which does not directly display frequency).

Let us focus on the oscillogram now: on the y axis it displays the amplitude. The minimum and
maximum amplitude values are given on the left in black characters. The amplitude value
corresponding to the moment in time indicated by the cursor is reported in blue characters. These
indications refer to the relative amplitude (from 0, the rest position of the air particles with no
displacement due to pressure, to 1, compression, or to -1, rarefaction).
In the oscillogram panel you just opened, the cursor (red dotted vertical line) is positioned in the
middle of the oscillogram (time dimension).

Under the oscillogram there are three horizontal button bars. The first one from the top is divided
into two parts by the cursor line. The duration of each segment is indicated on each part of the
button bar. In this file, if the cursor is still in the middle of the oscillogram (as by default), the
indicated durations equal the time indication shown in red at the cursor.

- Click on the first part of the button bar (left): you will hear the first half of the recorded
sentence
- now change the position of the cursor by clicking somewhere else in the oscillogram, for
instance after "zon" and listen to the sample
- select a portion of the oscillogram by moving the mouse keeping the left button clicked: The
selected portion will be highlighted in a reddish colour
- click on the sel button (below, left): you will zoom into the selected part of the oscillogram.
This operation can be repeated again and again
- zoom in or out the oscillogram (without previous selection) by clicking on the in and out
buttons
- click on all in order to get back to the original visualization of the signal

The second button bar from the top indicates the duration of the visible part, as well as its
starting and ending times, in blue numbers
- again, select a shorter portion of the signal or zoom into it: the indication of the duration of the
visible part of the signal changes
- starting from the left, click on all the parts of the second button bar: you will listen to the whole
file in pieces (with relative indication of the duration time of each segment)

The third and last button bar from the top indicates the total duration of the file, irrespective of
whether we zoomed into it or not
- click on this button: you will hear the whole utterance

Praat also gives you the possibility to do all of this from the menus; try this out as well:
- from the Query menu you can get several kinds of information, for instance about the time
dimension (Get start of selection, Get cursor, Get end of selection, Get selection length). There
are also shortcut keys for some frequent queries
- from the View menu you can move through the signal as we were doing before with the all, in,
out, sel buttons: Zoom, Show all, Zoom in, Zoom out, Zoom to selection, and also Scroll page
back and Scroll page forward, Play (you have to enter time indications).
- from the Select menu you can move the cursor over the signal. In some of the operations under
the View and Select menus you are asked to enter time indications: this allows higher precision
than using the cursor.

Pitch
If we want to consider pitch (see the paragraph about frequency), we have to look at the
spectrogram panel, as the oscillogram is not a representation suited for this parameter. We will
get more acquainted with spectrograms in one of the following modules; for now let us focus on
pitch.

- open the speech file "girl" in Praat and make an oscillogram of it


- zoom in the oscillogram (you can scroll over it with the scroll bar below), look at its
development over time and listen to some selected portions of it. You will notice that some
phones (vowels) have a more regular period pattern than other ones (while consonants
hardly display a period pattern at all)
- choose a vowel with a rather regular period pattern and select a portion out of it (containing
several periods),
- calculate the frequency of the speech sample using the duration of the period (you get this
information on top, after selecting the period). Actually, it would also be possible to calculate the
frequency by counting the number of periods occurring within a speech sample of 1 second,
but we don't have one entire second of periodic speech available in this audio file.
- display the pitch contour if it is not displayed yet: go to Pitch > Extract visible pitch contour
(you can also go to View > Show analyses > Show pitch). Now the frequency contour is
displayed by a blue line in the window below the oscillogram, where the spectrogram is
supposed to be. On the right side of the window the frequency indications are given in blue. The
range of considered frequencies is, by default, from 75 Hz to 500 Hz (it is of course possible to
change these parameters: Pitch > Pitch settings)
- click on a point in the blue line corresponding to the period you used in order to calculate the
frequency (notice that just voiced segments have a pitch). What frequency is indicated in blue on
the right? (you can also go to Pitch > Get pitch). Does it correspond to the frequency you
calculated yourself?
- in order to get pitch information, select a portion of voiced speech and go to Pitch > Get
pitch: you get the average pitch in that selection.
- select a portion of voiced speech and go to Pitch > Get maximum pitch / Get minimum pitch:
you get the maximum and the minimum frequency in that portion

- select a portion of voiced speech and go to Pitch > Pitch listing: you get a list of frequency
values at discrete moments in time (every 0,01 seconds).

Pitch range settings: on the right side of the analysis window the range of considered
frequencies is indicated; by default it is from 75 to 500 Hz. This means that only the frequencies
within this range are considered. It is possible to change these settings (Pitch > Pitch settings >
Pitch range) according to the kind of speech one wants to analyse: a male voice, for instance,
requires a lower pitch floor (and pitch ceiling) than a female voice. The pitch floor determines
the duration of the analysis window [INSERT LINK]: a low pitch floor requires a longer analysis
window than a high pitch floor, and a longer analysis window cannot capture fast changes in
F0 (see Pitch > Pitch settings > Help).

Different methods for detecting the fundamental frequency (pitch) are used. One of them exploits
the information about the distance (in Hz) between the harmonics in a spectrum (see figure 1.11
below). These algorithms may sometimes make mistakes; in particular, if portions of the pitch
contour strongly deviate from the rest, this is likely to be an error. Often, the (erroneously)
identified pitch is one octave higher (or lower) than the actual pitch. This kind of error is called
an octave error, and it depends on the adopted algorithm. It can be corrected by setting a lower
pitch ceiling. Another typical mistake concerns the detection of voicing: if the harmonics are not
very evident, the speech fragment is considered unvoiced.

Figure 1.11
The F0 is obtained through the distance between the harmonics in the spectrum.
If there are no evident harmonics (on the right), the speech segment is considered voiceless.

By extracting the visible pitch contour of a signal, you get a pitch object in the Objects window:
- edit a pitch object by selecting it and going to Edit: you are now in the pitch editor, where the
digits (from 0 to 9) represent the pitch candidates (0 being the minimum goodness and 9 the
maximum). The pink disks make up a sort of stripe which represents the best path through the
candidates, in other words, the pitch contour. It is determined by a pitch-extraction algorithm but
it can be manually changed from the pitch editor
- listen to the pitch object by clicking on one of the button bars at the bottom of the window
- at the bottom of the window (above the play button bars) a bar with blue (and white) rectangles
is displayed. This is the voicelessness bar: if there is no suitable pitch candidate in a given frame,
the frame is considered voiceless, which is represented by a blue rectangle. Try to change the
path through the candidates manually, by clicking on them and selecting them in pink, and listen
to the object. By manually choosing pitch candidates that are more suitable than those selected
by the algorithm, it is possible to remove octave errors

- remove your changes (Edit > Path finder)


- in order to make a frame voiceless, click on the voicelessness bar. The reverse (making a
voiceless frame voiced) can be done by selecting a pitch candidate where the algorithm had
chosen voicelessness. In this way it is possible to correct mistakes in the detection of voicing
- change the ceiling (Edit > Change ceiling). By lowering it, some formerly voiced frames may
become unvoiced, while by raising it some formerly unvoiced frames may become voiced.

In the last paragraph we saw how to draw an object in the picture window. All kinds of diagrams
provided by Praat may be plotted in this window; let us now see how to draw the pitch contour
in the picture window:
- Edit the file of the sentence "the girl was sitting on the table"
- make the pitch contour: Pitch > Extract visible pitch contour
- from the objects window, choose Draw (plot the contours within the pink frame, one under the
other; see how to plot an object in the picture window)

Intensity
We have just learnt how to deal with pitch in Praat, let us now take a look at intensity:

- edit the oscillogram (along with the spectrogram at the bottom of the window, which you can
also avoid visualising, as we have not introduced it yet) of a speech file (girl, for instance)
- in order to display the intensity contour, go to View > Show analyses > Show intensity (or go
to Intensity in the menu > Show intensity): you will see the intensity contour displayed as a
yellow line. Compared to the pitch contour line, the intensity contour line is continuous.

In the oscillogram panel on top, the maximum and minimum relative amplitude (from 0 to 1, in
black) values are displayed. If you click on a point in the oscillogram, the relative amplitude
corresponding to that point will be displayed in blue.

If you have chosen not to visualise the spectrogram (View > Show analyses), in the lower panel
you will see just the intensity contour (and, maybe, the pitch contour if you have chosen to
visualise it). The dB values for intensity are displayed in green, either on the right side of the
window or on its left side (depending on the visualisation of spectrogram and pitch contour, or
on their absence). The default range of considered decibels is 50 to 100 dB.

- click on the lowest and on the highest points of the intensity contour. The corresponding
intensity should be shown in green. You can also go to Intensity > Get intensity to obtain the
same values
- select a portion of speech and go to Intensity > Intensity listing. What do you see?
- select a portion of speech and go to Intensity > Get intensity, or look at the green dB values
displayed in the window. By default, you will get the mean energy in the selection.

Intensity settings: Intensity > Intensity settings (see also Help): the default range for intensity
is 50 to 100 dB, but it is possible to change it. If you make a selection over time, the
corresponding dB value is an average. In the intensity settings it is possible to change the
averaging method (which by default is set to mean energy).

The Subtract mean pressure field, which is selected by default, computes the intensity after
subtracting the constant pressure offset (DC offset) that recording systems may add to the air
pressure signal.

Drawing the intensity contour in the picture window:


- open the speech file with the sentence "the girl was sitting on the table"
- extract the intensity contour of the sentence: Intensity > Extract visible intensity contour
- from the objects window, choose Draw (plot the contours within the pink frame, one under the
other; see how to draw an object in the picture window)

The intensity of a speech signal is determined by computing the root mean square for every
window.
Segmenting a waveform (cutting, pasting)
It is possible to work on a waveform by cutting out segments from it or pasting new ones in:
- select an object from the Object window
- display its oscillogram (let us not consider the spectrogram now, remove it from the window
through View > Show analyses)
- select a portion of the oscillogram and remove it by going to Edit > Cut (also try Set selection
to zero)
- set the cursor at another point and go to Edit > Paste after selection

Labelling a waveform (Text grid)


You may want to label the waveform in order to clearly see what the different phones look like
in the oscillogram (but also in the spectrogram, which we have not focused on yet):
- select an object from the Object window (for instance zinleven1, in
http://www.let.uu.nl/~Hugo.Quene/personal/onderwijs/sprekenenverstaan/practicum20070725.zip)
- make sure just the oscillogram is displayed (go to Edit > View > Show analyses, untick
everything but Show oscillogram)
- go to the object window, Annotate > To TextGrid
- a window appears, where you are asked to enter categories as Tier names: remove Mary John
bell and enter the categories you want to use, such as words syllables phones (enter these
categories separated by spaces). Tier names defines interval tiers, while Point tiers lists which of
these should instead be point tiers (for labels with no duration). Let us leave the point tiers field
as it is.
- a new object (TextGrid zinleven1) appears in the object window. By keeping the control button
on your keyboard pressed, select both the object zinleven1 and TextGrid zinleven1,
- go to Edit: a window will appear which displays the oscillogram of the sentence and three
horizontal fields: in the words field (which should be highlighted in yellow) you are supposed to
label the waveform with the corresponding words (identifying the word boundaries), and
likewise in the syllables and phones fields
- click on the oscillogram window in order to move the cursor
- by using the bars at the bottom of the window (the first bar from the top lets you hear the file
from/up to the cursor position, the second bar the visible segment, and the third one the whole
file), listen to the sentence (or to segments of it) and find where the words begin and end. Also
look at the oscillogram as a further aid

- once the cursor is at a word beginning (or ending) position, click on the blue dot on the cursor
line, in the field you are filling in (the words field). A red line will appear, marking a word
boundary. The line turns blue if you click somewhere in the oscillogram panel. It is possible to
displace the boundary by dragging it with the mouse
- go on defining all the word boundaries within the sentence
- click somewhere within each word interval (between the two blue boundaries): the interval
turns yellow
- now type in the word which is pronounced (in the text field that appears at the top, under the
menu)
- do the same with syllables and phones: you can also zoom into the oscillogram (through in or
sel, then out or all in order to zoom out).
Making a sine signal
We are already familiar with sine tones, which are periodic tones made up of just one frequency
component (see fig. 1.6, which displays one period of a sine tone). We will never encounter
these sounds in a natural environment, but they can be synthesized by a computer program. Let
us see how to make a sine signal with Praat:

- Go to New > Sound > Create sound from formula


- a window with the following fields will open: Name, Channels (Mono and Stereo, you can
leave it on mono, also while recording speech we kept the mono mode), Start and End time (s),
Sampling frequency (Hz), and finally, Formula
- the basic formula for a sine tone is
A * sin (2 * π * f * t + φ)
A = relative amplitude, from 0 (minimum) to 1 (maximum)
f = frequency (Hz)
t = time (x in Praat)
φ = phase angle (in radians)
- first of all, fill in the formula field with 1 * sin (2 * pi * 300 * x) (in Praat, π is pi). This is to
make a 300 Hz sine tone with a maximum relative amplitude
- in the paragraph about sampling frequency we have seen that a golden rule is to set this value
to at least twice the highest frequency component in the signal. In a sine signal, just one
frequency component is present, so the sampling rate should be at least twice this component,
and preferably higher (10 times this component would be fine). You can thus lower the
sampling frequency by setting it to 3000 Hz.
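
If you would like to reproduce the same tone outside Praat, the following sketch (assuming NumPy and SciPy are installed; the file name sine300.wav is just an example) generates the 300 Hz sine at a 3000 Hz sampling rate and writes it to a 16 bit mono WAV file:

import numpy as np
from scipy.io import wavfile

fs = 3000                                  # sampling rate (Hz), 10 times 300 Hz
t = np.arange(0, 1.0, 1 / fs)              # one second of samples
tone = 1.0 * np.sin(2 * np.pi * 300 * t)   # A * sin(2*pi*f*t), phase = 0

wavfile.write("sine300.wav", fs, (tone * 32767).astype(np.int16))   # 16 bit PCM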
