
Sound

Rawesak Tanawongsuwan
ccrtw@mahidol.ac.th
9

Sound fundamentals
• Created by vibrations, e.g. from a guitar string or a person speaking

• Vibrations make nearby air molecules move, slightly raising and
lowering the air pressure

• The pressure waves reach our ears, and we hear the vibrations as sound

• A visual waveform is used to represent these pressure waves



Waveforms
• Amplitude: reflects the change in pressure from
the peak of the waveform to the trough
• Cycle: the time a waveform takes to go from one
amplitude, through its peak and trough, until it reaches
the same amplitude again
• Frequency: the number of cycles per second (Hz)
– A 1,000 Hz waveform goes through 1,000 cycles every
second
• Phase: measures how far through a cycle a
waveform is
• Wavelength: the distance between two points
with the same degree of phase
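
The definitions above can be tried out numerically. The following is a minimal sketch, not part of the original slides: the function names are illustrative, and the speed of sound (343 m/s, roughly air at 20 °C) is an assumed value used only to turn a frequency into a wavelength.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def sine_sample(t, frequency_hz, amplitude=1.0, phase_rad=0.0):
    """Value of a sine waveform at time t (in seconds)."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t + phase_rad)


def period(frequency_hz):
    """Duration of one cycle, in seconds."""
    return 1.0 / frequency_hz


def wavelength(frequency_hz, speed=SPEED_OF_SOUND):
    """Distance between two points with the same phase, in metres."""
    return speed / frequency_hz


def phase_degrees(t, frequency_hz):
    """How far through the current cycle the waveform is, in degrees."""
    return (t * frequency_hz % 1.0) * 360.0


print(period(1000))                   # one cycle of a 1,000 Hz tone, in seconds
print(wavelength(1000))               # its wavelength in metres
print(phase_degrees(0.00025, 1000))   # a quarter of the way through a cycle
```

A 1,000 Hz tone completes a cycle every millisecond, so a quarter of a millisecond in, the waveform is about 90 degrees through its cycle and a sine is at its peak.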
9 275–276

The Nature of Sound


• Conversion of energy
into vibrations in the
air (or some other
elastic medium)

• Most sound sources
vibrate in complex
ways, leading to sounds
with components at
several different
frequencies

Analog audio
• A microphone converts the pressure waves of sound into changes
in voltage on a wire

• The changes in voltage match the pressure waves of the
original sound

• A speaker works in reverse, taking the voltage signal
from a microphone to re-create the pressure wave

Continuous measurement of pressure waves



Digital audio
• Computers store audio information as a series
of zeros and ones

• The waveform is broken into individual samples
(digitizing, sampling, analog-to-digital
conversion)

• Digitization means conversion to a stream of
numbers

• The signal must be sampled in both time and amplitude



Sampling and Quantization


• Sampling means measuring the quantity we are interested
in, usually at evenly spaced intervals

• For an audio signal, we first sample in time; this step is called
sampling, and the rate at which it is performed is called the
sampling frequency

• Quantization is sampling in amplitude
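
The two steps can be sketched in a few lines. This is an illustration under stated assumptions rather than the lecture's own method: the signal is assumed to lie in the range -1.0 to 1.0, the quantizer uses evenly spaced levels, and the names `sample` and `quantize` are made up for this example.

```python
import math


def sample(signal, duration_s, sampling_rate_hz):
    """Measure `signal` (a function of time) at evenly spaced instants."""
    count = round(duration_s * sampling_rate_hz)
    return [signal(n / sampling_rate_hz) for n in range(count)]


def quantize(value, bits):
    """Map a value in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)
    index = round((value + 1.0) / step)
    index = max(0, min(levels - 1, index))  # clamp out-of-range input
    return -1.0 + index * step


def tone(t):
    """A 440 Hz sine wave standing in for the analog signal."""
    return math.sin(2 * math.pi * 440 * t)


samples = sample(tone, 0.01, 8000)            # sampling in time: 80 samples
digital = [quantize(v, 3) for v in samples]   # sampling in amplitude: 8 levels
print(len(samples), sorted(set(digital)))
```

With only 3 bits every sample is snapped to one of 8 values, so the quantized signal is a visibly stepped version of the sine.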



Sampling rate
• The higher the sampling rate, the closer the shape of
the digital waveform to the original analog waveform

• Low sampling rates limit the range of frequencies that
can be recorded, resulting in a poor representation of the
original sound

Nyquist theorem
• Signals can be decomposed into a sum of sinusoids

• Nyquist theorem states how frequently we must
sample in time to be able to recover the original
signal

• For correct sampling we must use a sampling rate
equal to at least twice the maximum frequency
content in the signal

• This rate is called the Nyquist rate
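
What goes wrong below the Nyquist rate can be shown with a small calculation. The sketch below assumes a single pure tone and uses illustrative function names. The effect it computes is commonly called aliasing, a term the slides themselves do not use.

```python
def nyquist_rate(max_frequency_hz):
    """Minimum sampling rate for a signal with the given maximum frequency."""
    return 2 * max_frequency_hz


def apparent_frequency(frequency_hz, sampling_rate_hz):
    """Frequency a sampled pure tone appears to have after sampling.

    Tones above half the sampling rate fold back into the range
    0 .. sampling_rate_hz / 2 and become indistinguishable from lower tones.
    """
    folded = frequency_hz % sampling_rate_hz
    return min(folded, sampling_rate_hz - folded)


print(nyquist_rate(8000))               # 16000
print(apparent_frequency(3000, 8000))   # 3000: below half the rate, preserved
print(apparent_frequency(6000, 8000))   # 2000: above half the rate, misrepresented
```

A 6,000 Hz tone sampled at 8,000 samples per second yields exactly the same sample values (up to sign) as a 2,000 Hz tone, so the original cannot be recovered.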



Sampling rate
• If the audio contains frequencies as high as 8,000 Hz,
we need to sample at 16,000 samples per second

• Range of human hearing: roughly 20 Hz to 20 kHz

• The sampling theorem implies a minimum rate of 40 kHz to
reproduce sound up to the limit of hearing

• CDs have a sample rate of 44,100 Hz (44.1 kHz)

• Low-bandwidth internet audio might use 22.05 kHz

• DAT: 48 kHz (mixing sound from CD and DAT will
require some resampling, best avoided)
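
One way to see why mixing DAT and CD material is awkward is to look at the ratio between the two rates. The sketch below only computes that ratio and the resulting sample counts; it does not perform the resampling itself, and the function names are illustrative.

```python
from fractions import Fraction


def resampling_ratio(source_rate_hz, target_rate_hz):
    """Output samples produced per input sample, as an exact fraction."""
    return Fraction(target_rate_hz, source_rate_hz)


def resampled_length(sample_count, source_rate_hz, target_rate_hz):
    """Number of samples after conversion, rounded to the nearest whole sample."""
    return round(sample_count * resampling_ratio(source_rate_hz, target_rate_hz))


ratio = resampling_ratio(48000, 44100)
one_second = resampled_length(48000, 48000, 44100)
print(ratio)        # 147/160
print(one_second)   # 44100
```

Because the ratio is 147/160 rather than a whole number, almost every output sample falls between two input samples and has to be estimated from its neighbours, which is why the conversion is best avoided when possible.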

Audio quality vs. Data rate

• Data rate and bandwidth in sample audio applications



Bit depth

• Sample rate determines the frequency resolution
(the range of frequencies that can be captured)

• Bit depth determines the amplitude resolution

• Amplitude resolution is just as important as
frequency resolution

• Higher bit depth means greater dynamic range and
higher fidelity

• When a waveform is sampled, each sample is
assigned the amplitude value closest to that of the original
analog waveform

Bit depth

• 2-bit depth → 4 possible amplitude values

• 3-bit depth → 8 possible amplitude values

• 8-bit depth → 256 → voice communication

• 16-bit depth → 65,536 → CD quality sound

• 24-bit depth → 16,777,216 → DVD quality sound
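
The number of amplitude values is simply 2 raised to the bit depth. The sketch below also estimates the dynamic range using the standard approximation of 20·log10(2^bits) decibels, roughly 6 dB per bit. That formula is not stated in the slides and is included here as an assumption about what "greater dynamic range" means quantitatively.

```python
import math


def amplitude_levels(bits):
    """Number of distinct amplitude values available at a given bit depth."""
    return 2 ** bits


def dynamic_range_db(bits):
    """Ratio between the largest value and one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)


for bits in (2, 3, 8, 16, 24):
    print(bits, amplitude_levels(bits), round(dynamic_range_db(bits), 1))
```

Under this approximation, 16-bit audio has a theoretical dynamic range of about 96 dB and 24-bit audio about 144 dB.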



Data Size
• Sampling rate r is the number of samples per
second

• Sample size is s bits

• Each second of digitized audio requires rs/8
bytes

• CD quality: r = 44100, s = 16, hence each
second requires just over 86 kbytes (k = 1024) and
each minute roughly 5 Mbytes (mono)
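
The rs/8 arithmetic can be checked directly. A minimal sketch, with an illustrative function name and an added `channels` parameter that is not in the slide's formula (it defaults to 1, i.e. mono):

```python
def bytes_per_second(sampling_rate, sample_size_bits, channels=1):
    """Storage needed for one second of uncompressed audio: r * s / 8 per channel."""
    return sampling_rate * sample_size_bits * channels / 8


per_second = bytes_per_second(44100, 16)    # 88,200 bytes
per_minute = per_second * 60                # 5,292,000 bytes

print(per_second / 1024)            # just over 86 kbytes
print(per_minute / (1024 * 1024))   # roughly 5 Mbytes
```

Passing `channels=2` doubles both figures for stereo.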
9 289

Sound Editing
• Typical editing operations
• Trimming, combining, rearranging clips
• Lends itself to a timeline editing interface
• Timeline divided into tracks
• Sound on each track displayed as a waveform
• 'Scrub' over part of a track e.g. to find pauses
• Cut and paste, drag and drop
• May combine many tracks from different
recordings (mix-down) onto one (mono) or two
(stereo) tracks

Sound editing
• Loops
• Create a section of sound that represents
the sustained tone of an instrument, e.g. a
guitar
• Arbitrarily long notes can be produced by
interpolating copies of the section between
samples
• Loops must be connected cleanly: no
abrupt discontinuities between the end and
start, otherwise audible clicks will occur
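
Whether a loop joins cleanly can be estimated by comparing the jump at the seam with the changes inside the loop. The check below is a made-up heuristic for illustration (the `tolerance` factor of 1.5 is an arbitrary assumption), not an algorithm taken from the slides or from any particular editor.

```python
import math


def seam_jump(loop):
    """Size of the jump from the loop's last sample back to its first."""
    return abs(loop[0] - loop[-1])


def largest_step(loop):
    """Largest sample-to-sample change inside the loop."""
    return max(abs(b - a) for a, b in zip(loop, loop[1:]))


def loops_cleanly(loop, tolerance=1.5):
    """Heuristic: the seam is no rougher than the loop's own largest step."""
    return seam_jump(loop) <= tolerance * largest_step(loop)


def sustain(loop, repeats):
    """Lengthen a note by playing the loop section several times in a row."""
    return loop * repeats


rate = 8000
tone = [math.sin(2 * math.pi * 400 * n / rate) for n in range(rate)]
whole_cycles = tone[:200]    # exactly 10 cycles of 20 samples each
cut_mid_cycle = tone[:205]   # ends a quarter of the way into a cycle

print(loops_cleanly(whole_cycles))    # True
print(loops_cleanly(cut_mid_cycle))   # False: the seam would click
```

Cutting the section on a whole number of cycles makes the end flow into the start, while cutting mid-cycle leaves a jump several times larger than any step inside the loop.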

Let’s detour a bit…



2D texture synthesis

Other domains?
• Video texture synthesis

• Animation from a single image

• Panorama + video texture synthesis

• 3D texture synthesis
9 290–295

Effects and Filters


• Noise gate
• Removal of unwanted noise, background noise
• Eliminates all samples whose value falls below a
specified threshold
• Low pass and high pass filters
• Remove certain bands of frequencies to remove
noise that falls within a specific frequency range
• Notch filter
• Removes a single narrow frequency band
• Common use is to remove hum picked up from
the mains (frequency of exactly 50 or 60 Hz)
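
Two of these filters are simple enough to sketch directly. The noise gate follows the slide's description literally (per-sample thresholding). The low-pass filter is a basic one-pole smoother chosen for brevity, and is only one of many possible low-pass designs. Function and parameter names are illustrative.

```python
def noise_gate(samples, threshold):
    """Silence every sample whose magnitude falls below `threshold`."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def low_pass(samples, smoothing):
    """One-pole low-pass: each output moves a fraction `smoothing` toward the input.

    Smaller `smoothing` values (between 0 and 1) suppress rapid changes,
    i.e. high frequencies, more strongly.
    """
    out, previous = [], 0.0
    for s in samples:
        previous += smoothing * (s - previous)
        out.append(previous)
    return out


noisy = [0.01, -0.5, 0.02, 0.8, -0.03]
print(noise_gate(noisy, 0.05))     # [0.0, -0.5, 0.0, 0.8, 0.0]
print(low_pass([1.0] * 4, 0.5))    # [0.5, 0.75, 0.875, 0.9375]
```

The low-pass output shows the smoothing at work: a sudden step in the input is turned into a gradual rise.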

Effects and Filters


• De-esser
• Remove the sibilance that results from speaking
or singing into a microphone placed too close to
the performer
• Click repairer
• Remove clicks from recordings taken from
damaged or dirty vinyl records
• Reverb
• Adding copies of a signal, delayed in time and
attenuated, to the original
• Model reflections from surrounding space
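
The reverb description translates almost word for word into code. This is a deliberately simple sketch: it uses evenly spaced echoes whose level falls by a constant factor, which is an assumption made here for brevity, whereas real reverbs model many irregular reflections. The function name is illustrative.

```python
def add_echoes(samples, delay_samples, attenuation, copies):
    """Add `copies` delayed, progressively quieter copies of a signal to itself."""
    out = list(samples) + [0.0] * (delay_samples * copies)  # room for the tail
    for k in range(1, copies + 1):
        gain = attenuation ** k          # each copy is quieter than the last
        offset = k * delay_samples       # and arrives later
        for i, s in enumerate(samples):
            out[offset + i] += gain * s
    return out


click = [1.0, 0.0]
print(add_echoes(click, delay_samples=2, attenuation=0.5, copies=2))
# [1.0, 0.0, 0.5, 0.0, 0.25, 0.0]
```

A single click comes back twice, at half and then a quarter of its original level, which is the behaviour of a sound reflecting off surrounding surfaces.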

Effects and Filters


• Graphic equalizer
• Transforms the spectrum of a sound using a
bank of filters
• Each controlled by its own slider and each
affecting a narrow band of frequencies
• Envelope Shaping
• Change the outline of a waveform
• Faders allow a sound’s volume to be gradually
increased or decreased
• Pitch alteration and time stretching
• Sound is synchronized to video or another sound
• Alter pitch of an instrument
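
A fader is the simplest kind of envelope shaping: the samples are multiplied by a gain that changes over time. The sketch below assumes a linear gain ramp; editors typically offer other curve shapes as well. Function names are illustrative.

```python
def fade_in(samples, fade_length):
    """Scale the first `fade_length` samples by a gain rising linearly from 0."""
    out = list(samples)
    for i in range(min(fade_length, len(out))):
        out[i] *= i / fade_length
    return out


def fade_out(samples, fade_length):
    """Mirror image of fade_in: the gain falls linearly to 0 at the end."""
    return fade_in(samples[::-1], fade_length)[::-1]


steady = [1.0] * 6
print(fade_in(steady, 4))    # [0.0, 0.25, 0.5, 0.75, 1.0, 1.0]
print(fade_out(steady, 4))   # [1.0, 1.0, 0.75, 0.5, 0.25, 0.0]
```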
9 295

Audio Compression
• In general, lossy methods are required because of the
complex and unpredictable nature of audio
data

• A CD-quality, stereo, 3-minute song requires
over 25 Mbytes

• Data rate exceeds the bandwidth of a dial-up
Internet connection

• Differences in the way we perceive sound and
images mean that a different approach from image
compression is needed
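
The figures behind these two claims can be worked out as follows. The dial-up speed of 56,000 bits per second is an assumed nominal modem rate, not a number given in the slides, and the function names are illustrative.

```python
DIAL_UP_BPS = 56_000  # assumed nominal dial-up modem speed, bits per second


def cd_bit_rate(sampling_rate=44100, sample_size_bits=16, channels=2):
    """Bits per second of uncompressed CD-quality audio."""
    return sampling_rate * sample_size_bits * channels


def size_mbytes(bit_rate, seconds):
    """Storage for `seconds` of audio at `bit_rate`, in Mbytes (1024 * 1024 bytes)."""
    return bit_rate * seconds / 8 / (1024 * 1024)


rate = cd_bit_rate()
song = size_mbytes(rate, 3 * 60)
shortfall = rate / DIAL_UP_BPS

print(rate)        # 1411200 bits per second
print(song)        # a little over 30 Mbytes for three minutes
print(shortfall)   # how many times the data rate exceeds the assumed link speed
```

Under these assumptions, uncompressed CD audio needs roughly 25 times more bandwidth than the dial-up link provides, which is why it cannot be streamed without compression.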
9 300–301

MP3
• MPEG Audio, Layer 3
• Three layers of audio compression in MPEG-1
(MPEG-2 essentially identical)
• From Layer 1 to Layer 3, the encoding process increases in
complexity, and the data rate for the same quality
decreases
• e.g. Same quality 192kbps at Layer 1,
128kbps at Layer 2, 64kbps at Layer 3
• 10:1 compression ratio at high quality
• Variable bit rate coding (VBR)
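
The quoted compression ratio can be related to bit rates with a short calculation. This sketch assumes constant bit rate encoding, takes 1 kbps as 1,000 bits per second, and compares against uncompressed CD-quality stereo. The choice of 128 kbps as the example rate is an assumption for illustration.

```python
CD_STEREO_BPS = 44100 * 16 * 2   # uncompressed CD-quality stereo, bits per second


def compression_ratio(target_kbps):
    """How many times smaller the encoded stream is than uncompressed CD audio."""
    return CD_STEREO_BPS / (target_kbps * 1000)


def compressed_size_bytes(target_kbps, seconds):
    """Size of a constant-bit-rate stream of the given duration."""
    return target_kbps * 1000 * seconds // 8


print(compression_ratio(128))            # about 11
print(compressed_size_bytes(128, 180))   # 2880000 bytes for a 3-minute song
```

At 128 kbps the ratio comes out at about 11:1, in line with the roughly 10:1 figure above.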
9 301

AAC
• Advanced Audio Coding
• At the core of the MPEG-4, 3GPP and 3GPP2
specifications

• Not backward compatible with earlier standards

• Higher compression ratios and lower bit rates
than MP3

• Subjectively better quality than MP3 at the same
bit rate

• Delivers quality rivaling that of uncompressed
CD audio

AAC advantages over MP3

• Improved compression provides higher-quality
results with smaller file sizes

• Support for multichannel audio, providing up to
48 full frequency channels

• Higher resolution audio, yielding sampling
rates up to 96 kHz

• Improved decoding efficiency, requiring less
processing power for decoding
9 302

Audio Formats
• Platform-specific file formats

• AIFF (MacOS), WAV (Windows), AU (Unix)

• Multimedia formats used as 'container formats'
for sound compressed with different codecs

• QuickTime, Windows Media, RealAudio

• MP3 has its own file format, but MP3 data can
be included as audio tracks in QuickTime
movies and SWFs
9 303–304

MIDI
• Musical Instrument Digital Interface
• Instructions about how to produce music,
which can be interpreted by suitable hardware
and/or software cf. vector graphics as drawing
instructions
• Standard protocol for communicating between
electronic instruments (synthesizers, samplers,
drum machines)
• Allows instruments to be controlled by
hardware or software sequencers

MIDI and digital audio


• MIDI and digital audio are fundamentally different
• Digital audio is a digital representation of a
sound wave
• MIDI is a language of instructions for musical
instruments
• Digital audio file seeks to exactly represent an
audio event just like a tape recorder (can be a
musical performance, person talking, any other
sound)

MIDI and digital audio


• MIDI and digital audio are fundamentally different
• MIDI is more like sheet music: it acts as
instructions for the re-creation of a musical
selection
• When a MIDI file is played back, the sound card
takes this information and uses its synthesizer to
re-create the note on the right instrument
• MIDI file may sound different depending on what
sound card plays it back
• A MIDI file cannot record sounds that cannot be
re-synthesized from short instructions, such as the
human voice
9 304

MIDI and Computers


• MIDI interface allows computer to send MIDI
data to instruments
• Store MIDI sequences in files, exchange them
between computers, incorporate into
multimedia
• Computer can synthesize sounds on a sound
card, or play back samples from disk in
response to MIDI instructions
• Computer becomes primitive musical
instrument (quality of sound inferior to
dedicated instruments)
9 305

MIDI Messages
• Instructions that control some aspect of the
performance of an instrument

• Status byte – indicates type of message

• Data bytes (2 for most messages) – values of parameters

• e.g. Note On + note number (0..127) + key
velocity

• Running status – omit the status byte if it is the
same as the preceding one
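
The byte layout can be made concrete with a small encoder. The status value 0x90 for Note On (with the channel number in the low four bits) comes from the MIDI 1.0 specification rather than from the slides. The function names are illustrative, and the running-status helper is a simplified sketch that only considers the status byte of each message.

```python
NOTE_ON = 0x90  # Note On status; the low 4 bits carry the channel (0..15)


def note_on(channel, note, velocity):
    """Build a 3-byte Note On message: status byte, note number, key velocity."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0..15, note and velocity 0..127")
    return bytes([NOTE_ON | channel, note, velocity])


def with_running_status(messages):
    """Concatenate messages, omitting a status byte that repeats the previous one."""
    out = bytearray()
    last_status = None
    for msg in messages:
        if msg[0] == last_status:
            out += msg[1:]        # same status as before: send data bytes only
        else:
            out += msg
            last_status = msg[0]
    return bytes(out)


chord = [note_on(0, 60, 100), note_on(0, 64, 100), note_on(0, 67, 100)]
stream = with_running_status(chord)
print(stream.hex(" "))   # 90 3c 64 40 64 43 64
```

Three notes that would take nine bytes are sent in seven, because the second and third Note On messages reuse the first one's status byte.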
9 306

General MIDI
• Synths and samplers provide a variety of
voices
• MIDI Program Change message selects a new
voice, but mapping from values to voices is not
defined in the MIDI standard
• General MIDI (addendum to standard) specifies
128 standard voices for Program Change
values
• In fact, GM only specifies voice names; there is no
guarantee that identical sounds will be
produced on different instruments
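
A Program Change message is just a status byte plus the voice number. In the sketch below, the status value 0xC0 comes from the MIDI 1.0 specification and the three voice names are taken from the General MIDI sound set, listed here with 0-based program values (GM documentation usually numbers them from 1). The function name and the small lookup table are illustrative only.

```python
PROGRAM_CHANGE = 0xC0  # Program Change status; the low 4 bits carry the channel

# A few General MIDI voices, keyed by 0-based Program Change value.
GM_EXAMPLES = {
    0: "Acoustic Grand Piano",
    40: "Violin",
    73: "Flute",
}


def program_change(channel, program):
    """Build a 2-byte Program Change message selecting voice `program` (0..127)."""
    if not (0 <= channel <= 15 and 0 <= program <= 127):
        raise ValueError("channel must be 0..15, program 0..127")
    return bytes([PROGRAM_CHANGE | channel, program])


message = program_change(0, 40)
print(message.hex(" "), "selects", GM_EXAMPLES[40])
```

The message only names the voice. How "Violin" actually sounds is still up to the receiving instrument.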
