
The Sound Analysis Toolbox (SATB)

Tae Hong Park
Music Technology and Composition
NYU Steinhardt
thp1@nyu.edu

Sumanth Srinivasan
Electrical and Computer Engineering
NYU Tandon
sumanths@nyu.edu

ABSTRACT

Sound analysis software applications have become commonplace for exploring music and audio, and important factors including responsive/fast data visualization, flexible code development capabilities, availability of standard/customizable libraries/modules, and the existence of a large community of developers have likewise become integral. The widely used MATLAB software, in particular, has played an important role as an all-purpose audio exploration and research tool. However, its flexibility and practicality when exploring large audio data, its limitations for synchronized audiovisual exploration, and its deficiencies as an integrated system for audio research are areas that can be improved. In this paper we report on developments on the Sound Analysis Toolbox (SATB), a pure MATLAB-based toolbox that addresses some of MATLAB's basic deficiencies as an audio research platform. We introduce solutions including efficient visualization for data of literally any size, a simple feature extraction "plug-in" API, and the sMAT Listener module for spatiotemporal audio-visual exploration.

1. INTRODUCTION

The Sound Analysis Toolbox (SATB) project's origin can be traced to the EASY Toolbox. The EASY project started as an effort to embrace music information retrieval (MIR) for electro-acoustic music analysis by observing its popularity within the traditional tonal/rhythm research community. EASY included a number of features, including implementations of 27 feature extraction algorithms as well as a basic classification module, to facilitate the idea of utilizing both qualitative and quantitative approaches for interpreting electro-acoustic music [1]. The research emphasis on exploring electro-acoustic music analysis techniques from a quantitative approach was in part due to the observation of the genre itself, where an emphasis on non-traditional musical parameters, commonly outside of the realm of melody, pitch structures, harmony, rhythm, and pulse, is commonplace. As such, a number of visualizations, including the timbregram, were developed as shown in Figure 1. The timbregram and other visualizations essentially offered a low-level acoustic descriptor approach for electro-acoustic music exploration and analysis in addition to traditional waveforms and spectrograms. This was primarily accomplished by mapping and assigning feature clusters to various 3D visualization formats. The goal of EASY was to begin exploring the potential of applying both quantitative and qualitative analysis paradigms, espousing a more comprehensive music analysis model where both objective and subjective approaches to electro-acoustic music analysis played significant roles [2].

Figure 1. Timbregram: audiovisual exploration of bass, clarinet, and French horn samples

While trying to improve EASY, we began to recognize a number of design shortcomings, including: (1) specificity and generality: from a technical point of view, the EASY Toolbox was narrow in scope in that it was being developed specifically for electro-acoustic music. An important design philosophy in EASY was to follow a timbral approach to electro-acoustic music analysis, which we thought too rigid in scope; (2) Analysis Module API: the EASY Toolbox was implemented using a set of feature analysis modules (feature vector types or classification algorithms) without an API, making third-party contribution or additional module development cumbersome; and (3) flexible audio-synchronized visualization: although many of the pre-defined EASY visualizations proved to be insightful, the EASY Toolbox's visualization tools were not flexible enough to allow customization, and we found their utility limiting.

Considering the design limitations of the EASY Toolbox, we discontinued its development and began developing SATB [3]. This included broader design philosophies that would facilitate a more general approach to quantitative music and sound analysis. In particular, as one of our current research areas is Soundscape Information Retrieval (SIR) [3], we have come to embrace a more modular approach to tool development with the creation of "low-level" analysis tools that help users customize their own visualizations or analysis algorithms; in short, to enable users to engage in analysis of all types of music/audio. In our current version of SATB, we have improved existing EASY modules, added new modules, and created designs that allow for a more flexible, customizable, MATLAB-style interaction platform that we hope will seem familiar to MATLAB users and users of other audio research tools. A summary of the SATB system follows a brief survey of sound analysis systems that are currently in use.

1.1 Audio Analysis Tools Examples

There is a substantial amount of on-going research and contributions in the field of audio analysis and music information retrieval (MIR), most of which have approached music analysis from a traditional standpoint, offering analysis outputs such as rhythm analysis, pitch and harmony analysis, and genre classification, to name a few. Sonic Visualiser [5], for example, provides a wealth of visualizations for audio signals as well as an interface for sound annotation. It also includes a feature extraction plug-in system for customization possibilities. Wavesurfer [6] focuses on speech analysis and provides spectrogram visualization, while the Python-based LibROSA library offers a framework with building blocks to construct MIR systems. pyAudioAnalysis [7] is an open source Python library that additionally offers speaker diarization and classification capabilities. While Python is a useful platform for application-centric tools, rapid prototyping and a research-centric approach are still somewhat cumbersome in that a unified research environment is not always available. MIRToolbox [8] is a MATLAB library that offers a set of functions for feature extraction such as spectral centroid, tonality, rhythm, etc. from audio files, focusing heavily on processing of music in terms of its pitch-duration lattice as opposed to more generic audio signals. The Chroma Toolbox provides implementations for extracting variants of chroma-based features [7], and others focus on similarity analysis [8]. While all of the aforementioned software is useful and sophisticated in its own way, the tools are also fragmented, and some lack important yet basic features such as: (1) audiovisual synchronous playback, (2) feature extraction and customizability options, (3) coding environment, and (4) visualization flexibility.

SATB aims to contribute and attempts to consolidate a number of the important fundamental features, i.e., fast visualization, general coding platform, feature extraction and classification APIs, while providing a responsive interface in the MATLAB environment.

2. SATB

SATB is based on a number of fundamental design philosophies including (1) familiarity: the user should find SATB familiar when viewed from the MATLAB user-ecosystem; (2) fast and responsive visualization: users should be able to quickly "plot" (or splot in our case) large data and interact responsively with the data (e.g. zooming, rotation, etc.); (3) audiovisual synchronization: the data that is being explored should not only be subject to efficient and quick visualization but also seamless audiovisual exploration so that audio playback is synchronized with "plots" and "subplots"; and (4) extendible analysis API: users should be able to use baseline analysis tools such as standard feature extraction algorithms and classification algorithms and also use our APIs to straightforwardly add and contribute custom algorithms as needed. This includes addressing issues concerning customization, contribution to the research community, and easy integration into SATB whereby complexities such as I/O, visualization, and data exploration are handled behind the scenes by the system. These main design components are summarized in greater detail in the following sub-sections.

2.1 Making a splot: Responsive plotting

Large vectors and large files (if they can be loaded into the MATLAB workspace at all) are notoriously cumbersome to display and interact with using MATLAB's go-to plot function. Additionally, although the MATLAB sound player can be used to play audio data (again, if small enough for its workspace), there are no built-in features that provide synchronous audiovisual interaction with data. SATB's splot addresses shortcomings of these essential features for sound, audio, and music exploration, and furthermore looks and feels the same as MATLAB's plot function, but with added functionality. SATB's splot is essentially a custom, audio-signal-friendly "upgrade" of MATLAB's plot. splot (or "SATB plot") enables users to quickly display and interact with plots while having access to all of the standard plot options such as subplot, hold, legend, as well as other plot options that MATLAB users would expect to be able to use. splot utilizes a simple but effective algorithm developed as part of an iOS DAW project called microDAW¹ and is based on (1) strategically plotting an approximation of a large vector by considering the limited pixels available on digital canvases, (2) strategic re-computation of new estimations of signal portions to be displayed during zoom requests, and (3) exploiting how humans roughly visually perceive large audio signals when displayed with limited resources on computer monitors, i.e. pixels. In essence, the algorithm down-samples the original vector by analyzing windowed portions of the vector that correspond to the computer's canvas pixel width, computing the min and max values for each window, and preserving temporal order as shown below, where n is the argument and sample index and x[n] is the value at sample index n.

$$\arg\min_{n} x[n] \tag{1}$$

$$\arg\max_{n} x[n] \tag{2}$$

¹ http://www.suitecat.com
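
The windowed min-max down-sampling described in Section 2.1 can be illustrated with a short sketch. This is a plain Python illustration, not SATB's MATLAB code: the function name `minmax_decimate` and the parameter `n_windows` are hypothetical, and the choice of one analysis window per requested canvas column is an assumption made only for the example.

```python
def minmax_decimate(x, n_windows):
    """Return (indices, values) of a min-max envelope of x.

    x         : list of samples
    n_windows : number of analysis windows (hypothetical parameter,
                e.g. the canvas width in pixels)
    """
    n = len(x)
    # Short vectors are returned untouched: decimation would save nothing.
    if n <= 2 * n_windows:
        return list(range(n)), list(x)

    indices, values = [], []
    for w in range(n_windows):
        # Integer window bounds that cover x exactly once, without gaps.
        start = (w * n) // n_windows
        stop = ((w + 1) * n) // n_windows
        window = range(start, stop)
        i_min = min(window, key=lambda i: x[i])  # arg min of x[n] in the window
        i_max = max(window, key=lambda i: x[i])  # arg max of x[n] in the window
        # Preserve temporal order; a flat window contributes a single point.
        for i in sorted({i_min, i_max}):
            indices.append(i)
            values.append(x[i])
    return indices, values
```

Plotting `values` against `indices` with any line-plot routine draws at most two points per window, so the number of plotted points depends on the canvas width rather than on the length of the signal, while each window's extremes (and therefore the visible envelope) are retained.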

Proceedings of the International Computer Music Conference 2016
SATB internally stores the min-max down-sampled vector, which itself is stored in an instance of SATB's sFig (or a "SATB" figure), allowing effective memory management (most data types are referenced via handles to minimize unnecessary resource usage). Zooming into the vector is efficiently implemented by considering when to re-compute the requested zoom region of the vector and when to simply scale the canvas with the existing down-sampled vector that is already plotted (resampling vs. use of xlim). That is, the "envelope" is re-computed only when the user requests less than half of the original vector. We have empirically found that for zoom requests that are larger than 50% of the vector size, the visual difference between a down-sampled vector and the original vector is practically indistinguishable. This allows each canvas to plot a maximum of only twice the width of the computer's display in pixels, which makes rotation, zooming in/out, or adding sub-plots efficient, effective, and extremely responsive. When zooming into the level at or below the canvas size, down-sampling is bypassed, and the requested samples (< canvas width size) are displayed directly as shown in Figure 2.

Figure 2. splot zoom in canvas level

Zooming out (double-click as commonly done in MATLAB) to the original full-vector overview is instantaneous as we store the fully zoomed-out and down-sampled vector in a given MATLAB axes. Instantaneous zoom-out is equivalent to the size of the initial down-sampled vector: this compressed approximation of the signal under consideration is very compact and stored in the sFigure. It is only twice the size of the user's computer monitor pixel width by default. We have found that "oversampling" by a factor of two worked well for efficient zoom performance (this oversampling factor, however, is customizable).

subplot and other options such as hold, line color, and line style are also seamlessly integrated into splot by using try-catch statements, which bypasses the need for any custom error checking code in SATB: we simply use plot's error checking feature to check syntax and errors for standard plot options. MATLAB's subplots feature is also integrated into splot by using a dynamically changing global down-sampling rate when multiple plots are requested. Here all vectors in a figure's subplots are analyzed to compute the global decimation ratio, where the subplot with the largest vector size is selected at each subplot request. This ensures that all subplots are formatted with the same decimation ratio, effectively resulting in "apple vs. apple" visualization.

2.1.1 Plotting multidimensional vectors

For multidimensional vectors such as STFT spectrograms, for example, a similar min-max down-sampling algorithm is employed. Instead of down-sampling a one-dimensional "line," in the case of the spectrogram, a rectangular 2D area is analyzed for min/max arguments in two dimensions: time and frequency indexes in the case of STFT displays. However, any vector with two dimensions can be plotted, and splot simply analyzes 1D or 2D data.

2.1.2 Plotting vectors and files

splot can handle a number of different data formats. Vectors already in the MATLAB workspace can simply be plotted using the exact same syntax used in plot. Additionally, splot can also display files not in MATLAB's workspace, including audio files (all audio file extensions that are recognized by MATLAB's audioread), binary files (the user will have to provide bit depth and vector dimensional information as a separate cell array input argument, e.g. splot({'fs', 8000})), or files mapped via MATLAB's memmapfile (memory map to a file). The threshold for using SATB's down-sampling feature is customizable and is set to 2 million samples by default.

2.1.3 splot benchmarking

Table 1 shows benchmarking results for splot and MATLAB's plot function. Results for a number of different audio files (at sampling rate 44.1 kHz) such as classic compositions including Help! (The Beatles), Paranoid Android (Radiohead), My Favorite Things (John Coltrane), and Stria (John Chowning) are shown. Benchmarking tests were also conducted using sinusoid and Gaussian white noise signals of varied durations to examine the performance of our min-max algorithm for decimation. The following figures show a performance comparison of the plot and splot functions: duration in seconds required for reading the data, displaying the data, and making splot/plot responsive to user interaction. While plot is minimally faster (on the order of milliseconds) for files with short duration, substantial savings in setup time were observed when using splot. Efficiency was observed to be proportional to the number of samples and, consequently, the duration of the signal. Figure 3 shows splot (solid line) and plot performance as signal size is increased in one-minute increments up to 20 minutes: the x axis is signal duration (min) and the y axis is plotting duration in seconds.

The reader will note that in Figure 3, splot is approximately 300% faster than plot for a minute sinusoidal signal. Similar benefits are shown for Gaussian white noise signals. Table 1 shows display performance for different types of musical signals where, again, similar advantages can be seen in splot performance over plot (388% faster for Stria).

Figure 3. splot and plot load times for sine signals

Title       Music    plot (s)   splot (s)
Beatles     2:21     0.8797     0.5327
Queen       3:36     0.7847     0.6554
Radiohead   6:23     1.2380     0.7852
Coltrane    13:39    3.6341     1.2259
Chowning    17:03    5.4389     1.4146

Table 1. Load time of audio data at various zoom levels

Figure 4. splot and plot load times for noise

The responsiveness is not only observed when first plotting a signal, but is even more notable when interacting with the plotted data (when zooming, rotating, etc.). All benchmarking was done with the following hardware and software: MacBook Pro (13-inch, Mid 2012), 2.9 GHz Intel Core i7, 8 GB 1600 MHz DDR3, Intel HD Graphics 4000 1024 MB, in MATLAB Version: 8.4.0.150421 (R2015a).

2.1.4 More than splotting: Audiovisual synchronization

The current iteration of splot also includes a basic audio transport feature where plotted signals can be played back and synchronized with the visuals via a dynamically updating cursor. This is achieved using the MATLAB DSP System Toolbox and the dsp.AudioPlayer, step method, audio queuing, and various customizable latency and audio buffer settings to synchronize audio and visualizations. Current audio transport and audio control features include playback, rewind to start of vector, stop/pause, audio playback sampling rate change, and soloing a subplot for playback. Additionally, the SATB interface also provides audio scrubbing, which systems such as Avid Pro Tools and other DAWs include. Scrubbing is achieved by simply dragging the cursor of a subplot as shown in Figure 5. When the cursor is released, playback resumes at the timestamp corresponding to where the mouse button release occurred. For vectors that do not have an associated sampling rate, a default value of 44.1 kHz is assigned (the sampling rate can be provided as an input argument through the splot input formatted as a MATLAB cell array {…}).

Figure 5. Audio-scrubbing in SATB

2.2 Feature Extraction Module

SATB's analysis module currently implements 17 time/frequency-domain low-level feature descriptors including RMS, attack time, crest factor, dynamic tightness [9], low energy ratio, pitch, temporal centroid, zero-crossing rate, MFCC, spectral centroid, spectral flux, spectral jitter, spectral roll-off, spectral shimmer, spectral spread, spectral flatness, and spectral smoothness. SATB's analysis has been designed by considering important factors for audio/music analysis environments, including: (1) data size flexibility: analyzing, processing, and storing results, (2) extendibility and API: ease of add-

default.
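The min-max down-sampling and the zoom rule described for splot in this section can be illustrated with a short sketch. It is not SATB's actual code: the variable names, the 1920-pixel display width, and the padding of the last bin are assumptions made for illustration. Only the factor of two relative to the display width and the 50% zoom rule come from the text; the sketch also uses the display budget as its down-sampling trigger, whereas SATB's default threshold is stated as 2 million samples.

```matlab
% Minimal sketch of min-max decimation plus the 50% zoom rule.
% All names and the 1920-pixel width are illustrative assumptions,
% not SATB's actual code.
fs = 44100;                              % default sampling rate from the text
x  = sin(2*pi*440*(0:3*fs-1)'/fs);       % 3-second test tone (column vector)
displayWidth = 1920;                     % assumed display width in pixels
maxPoints    = 2*displayWidth;           % "twice the display width" budget

if numel(x) <= maxPoints
    env = x;                             % short vector: plot samples directly
else
    nBins   = maxPoints/2;               % one (min, max) pair per bin
    binSize = ceil(numel(x)/nBins);      % samples per bin
    nPad    = nBins*binSize - numel(x);  % pad so the vector reshapes evenly
    padded  = [x; repmat(x(end), nPad, 1)];
    bins    = reshape(padded, binSize, nBins);
    % Interleave per-bin minima and maxima: min1, max1, min2, max2, ...
    env     = reshape([min(bins, [], 1); max(bins, [], 1)], [], 1);
end

% Zoom rule: re-compute the envelope only for requests under half the vector;
% otherwise keep the stored envelope and just rescale the axes (xlim).
zoomRange = [20000 50000];               % requested sample range
zoomFrac  = (diff(zoomRange) + 1)/numel(x);
recompute = zoomFrac < 0.5;
```

Plotting env in place of x draws the same visual envelope with at most maxPoints points, which is the reason the plotting cost stays roughly constant as the signal grows.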
Figure 2. splot zoom in canvas level

Zooming out (double-click, as commonly done in MATLAB) to the original full-vector overview is instantaneous, as we store the fully zoomed-out and down-sampled vector in a given MATLAB axes. Instantaneous zoom-out is equivalent to the size of the initial down-sampled vector: this compressed approximation of the signal under consideration is very compact and stored in the sFigure. It is only twice the size of the user’s computer monitor pixel width by default. We have found that “oversampling” by a factor of two works well for efficient zoom performance (this oversampling factor, however, is customizable).

Using subplot and other options such as hold, line color, and line style are also seamlessly integrated into splot by using try-catch statements, which bypasses the need for any custom error checking code in SATB – we simply use plot’s error checking feature to

2.1.3 splot benchmarking

Table 1 shows benchmarking results for splot and MATLAB’s plot function. Results for a number of different audio files (at a sampling rate of 44.1 kHz), such as classic compositions including Help! (The Beatles), Paranoid Android (Radiohead), My Favorite Things (John Coltrane), and Stria (John Chowning), are shown. Benchmarking tests were also conducted using sinusoid and Gaussian white noise signals of varied durations to examine the performance of our min-max algorithm for decimation. The following figures show a performance comparison of the plot and splot functions: duration in se-

Figure 3. splot and plot load times for sine signals

Artist      Duration (min:s)   plot (s)   splot (s)
Beatles     2:21               0.8797     0.5327
Queen       3:36               0.7847     0.6554
Radiohead   6:23               1.2380     0.7852
Coltrane    13:39              3.6341     1.2259
Chowning    17:03              5.4389     1.4146

Table 1. Load time of audio data at various zoom levels

Figure 4. splot and plot load times for noise

Figure 5. Audio-scrubbing in SATB

2.2 Feature Extraction Module

SATB’s analysis module currently implements 17 time/frequency-domain low-level feature descriptors including RMS, attack time, crest factor, dynamic tightness [9], low energy ratio, pitch, temporal centroid, zero-crossing rate, MFCC, spectral centroid, spectral flux, spectral jitter, spectral roll-off, spectral shimmer, spectral spread, spectral flatness, and spectral smoothness. SATB’s analysis has been designed by considering important factors for audio/music analysis environments, including: (1) data size flexibility: analyzing, processing, and storing results, (2) extendibility and API: ease of add-
236 Proceedings of the International Computer Music Conference 2016 Proceedings of the International Computer Music Conference 2016 237
ing additional, custom feature extraction modules, (3) visualization: options for adding custom/specialized visualization for any feature extraction module, and (4) data management: using handles/references whenever possible to minimize system resources.

Analysis results can be saved either to the MATLAB workspace or to external storage, facilitating large data analysis as well as batch processing. Each feature extraction implementation inherits from an analysis superclass which handles I/O “behind the scenes” as further summarized in Section 2.2.1. Each feature extraction module can optionally include a custom visualization method that can be used to display data in specific formats and configurations. Additionally, data management uses MATLAB handles/references to help in minimizing duplication of data, easy session organization, and cleanup of SATB sessions. This is useful where data management is somewhat lax, especially during data visualization, where full-resolution data oftentimes exist both in the MATLAB workspace and a figure.

Feature extraction simply begins by creating an SATB instance, creating a new session, and computing features from audio files in external storage devices or vectors in the MATLAB workspace. When no options are provided to the SATB constructor, default parameters such as window size, hop size, analysis window type, and sampling rate are used for analysis (these default parameters are also user-customizable in the SATB configuration file – the last session’s parameters are used). A new session will allow optional creation of a session directory, prompt the user for audio file information, and save all analysis results organized by combining audio file information, analysis types, and feature type. Each session produces an associated SATB sessionName.mat file that contains session settings and configurations including dataset information, analysis, pre-processing, and visualization parameters. The SATB feature extraction algorithms can be used on a single vector/audio file or a set of vectors/audio files that can be selected as part of the session’s analysis file directory. Other features include bypassing already computed feature vector outputs, selecting feature subsets for analysis, and batch processing of large sets of files.

    function analysis(this)
        % Sample boundaries of the first analysis window
        startIdx = 1;
        endIdx   = this.winSize;

        for i = 1:this.numOfWin
            % RMS of the current window
            this.data.rms(i) = ...
                (mean(this.pcm(startIdx:endIdx).^2))^0.5;

            % Advance the window by one hop
            startIdx = startIdx + this.hopSize;
            endIdx   = endIdx + this.hopSize;
        end
    end

Figure 6. Simple RMS “plug-in”

2.2.1 Custom feature extraction algorithms and API

Although SATB currently includes a modest 17-feature analysis module, our API allows for easy customization and development of additional feature extraction algorithms. SATB’s “plug-in” development architecture is straightforward in that it inherits all necessary methods from its analysis superclass and handles appropriate input/output vector passing to and from each feature extraction module to SATB and the MATLAB workspace. Custom third-party contributed feature extraction implementations simply require (1) naming the file as either td_featureName.m or fd_featureName.m, (2) saving the .m file in the SATB ./features directory, (3) adding feature dependencies (e.g. “td_spectralCentroid”), and finally (4) implementing the feature extraction superclass’ analysis() method. Everything else is automatically handled by the SATB system, including passing appropriate input vectors to the feature extraction module and saving results. The analysis() method for the RMS algorithm is shown in Figure 6.

Additional abstract methods include initialization, pre-processing, and visualization methods to allow customization of the user’s feature extraction module. However, for most cases, only the analysis abstract method needs to be customized.

Figure 7. sMAT Listener

2.3 sMAT Listener

The SATB-Matrix Listener (sMAT) module provides a three-dimensional, audio source matrix-based sound exploration environment where audio stems/tracks are positioned within a 3D virtual space as shown in Figure 7. Each sMAT session can be set up with three general parameters: (1) “stage” image file, (2) “microphone” positions, and (3) initial listening coordinates (or listening spot).

The image file is used to visually represent a space such as a concert hall “stage” (e.g. Lincoln Center concert hall stage) with a matrix of “microphone” locations (longitude, latitude, and elevation). The microphones are essentially audio files that can be loaded into sMAT, where the microphone locations are randomly spread throughout the space when initialized for the first time. The user can then position the “mic nodes” (i.e. soundfiles) within the sMAT 3D space. In the example shown in Figure 7, 22 audio files corresponding to 22 microphone locations are mapped in 3D space. The listening spot is a 3D observation coordinate that can be freely moved around the sMAT space, simulating a virtual, “on-stage” listening experience: selecting and moving the listening spot around the 3D space allows the user, for example, to “eavesdrop” on the string section, percussionist, or trumpet player, or to experience what the conductor might be hearing on stage, standing on a podium … or what it might sound like facing the audience from the stage rather than the other way around, as is more common. All sMAT sessions can be saved and later recalled.

Figure 8. sMAT Listener stage exploration

Figure 8 shows the positioning of the user’s listening spot towards the backside of the string section (stage right). Here we note that the listening spot visually highlights a particular section of the stage/orchestra. Once a session is set up, exploring the space by moving the listening spot, changing perspectives with the 3D rotation tool, or zooming in/out of desired locations in the space are some of the ways sMAT can be used for engaging in spatio-temporal sound exploration. The current implementation renders a two-channel audio stream that changes according to the location of the listening spot. The net audio is computed as a function of the three-dimensional coordinates of all microphones and panning information.

sMAT may be used in numerous situations including exploration of mixing multi-track recordings, diffusion in multichannel audio playback environments, or exploring soundscapes, as is currently being developed as part of our Citygram project [10]–[14].

3. FUTURE WORK

We plan to release SATB in the fall of 2016 and much (exciting) work still remains (please refer to citygram.smusic.nyu.edu for links/updates to repos), including providing options for “envelope” computation algorithms in addition to our current min-max. In particular, for our analysis module, we aim to finish up an API for acoustic event detection (AED) and acoustic event classification (AEC). Additionally, we have developed an online sound event annotation module and are in the midst of porting it to JavaScript and WebAudio for added flexibility. This effort has been developed as part of our soundscape mapping initiatives embracing a multi-listener labeling/annotation philosophy, rather than exclusively relying on one or two researchers’ judgments for annotative ground truth. Additionally, we will include database exploration/querying modules for two databases: Freesound² and Citygram³. This will allow for easy access to databases, including downloading of audio data, labels, and other metadata, directly from MATLAB. This feature will be integrated with our sound annotation module.

² http://www.freesound.org
³ http://citygram.smusic.nyu.edu

For sMAT we are currently folding in Park’s unpublished software called soundpath from 2009 that focuses on spatio-temporal paths as a metaphor for mixing, modulation, chronicling, and annotating events along “sound paths” in the sMAT space. These spatiotemporal soundpaths are played back with other synchronized information, data, and modalities such as historical information, technical details, and musical moments in a composition, soundscape, or audio signal. In this context, sMAT can be used for exploration not only in real time but also in non-real time, especially in education settings where students or instructors can develop narratives to communicate and convey important musical ideas.

Finally, a more long-term sub-module we plan on adding to SATB is feature modulation synthesis (FMS) research [9]. FMS is a feature-centric sound synthesis-by-analysis approach where proofs of concept have been implemented in the MATLAB environment. The inclusion of FMS in SATB will allow for feature modulation based sound synthesis exploration, e.g. modulating harmonic expansion/compression of stringed instrument sounds, which can be used from both creative and research perspectives.

4. CONCLUSIONS

In this paper, we introduced the Sound Analysis Toolbox (SATB) and our currently implemented modules for visualization, sono-visual interaction, and feature extraction. We summarized some of SATB’s features including efficient plotting with splot taking advantage of computer display limitations, flexible and expandable analysis module and feature extraction APIs, and sMAT as a spatiotemporal sound exploration tool. Our hope is that SATB will contribute in facilitating exploration of music and sound for our community of audio, music, and sound researchers, enthusiasts, musicians, composers, educators, and students alike.

5. REFERENCES

[1] T. Park, D. Hyman, P. Leonard, and W. Wu, “Systematic and quantative electro-acoustic music analysis (sqema),” in International Computer Music Conference Proceedings (ICMC), 2010, pp. 199–206.
[2] T. Park, Z. Li, and W. Wu, “EASY Does it,” … Int. Soc. Music …, 2009.
[3] T. H. Park, J. Lee, J. You, M.-J. Yoo, and J. Turner, “Towards Soundscape Information Retrieval (SIR),” in Proceedings of the
International Computer Music Conference 2014, 2014.
[4] J. Beskow and K. Sjölander, “WaveSurfer-a public domain speech tool,” Proc. ICSLP 2000, 2000.
[5] T. Giannakopoulos, “pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis,” PLoS One, vol. 10, no. 12, p. e0144610, Dec. 2015.
[6] O. Lartillot and P. Toiviainen, “A Matlab toolbox for musical feature extraction from audio,” Int. Conf. Digit. Audio …, 2007.
[7] S. E. Meinard Müller, “Chroma Toolbox: MATLAB implementations for extracting variants of chroma-based audio features.”
[8] E. Pampalk, “A Matlab Toolbox to Compute Music Similarity from Audio,” ISMIR, 2004.
[9] T. Park and Z. Li, “Not just prettier: FMS toolbox marches on,” Proc. ICMC 2009, 2009.
[10] T. H. Park, B. Miller, A. Shrestha, S. Lee, J. Turner, and A. Marse, “Citygram One: Visualizing Urban Acoustic Ecology,” in Proceedings of the Conference on Digital Humanities 2012, 2012.
[11] C. Shamoon and T. Park, “New York City’s New Noise Code and NYU's Citygram-Sound Project,” in … -NOISE and NOISE-CON Congress and …, 2014.
[12] T. H. Park, J. Turner, J. You, J. H. Lee, and M. Musick, “Towards Soundscape Information Retrieval (SIR),” in International Computer Music Conference Proceedings (ICMC), 2014.
[13] T. H. Park, J. Turner, M. Musick, J. H. Lee, C. Jacoby, C. Mydlarz, and J. Salamon, “Sensing Urban Soundscapes,” in Workshop on Mining Urban Data, 2014.
[14] C. Mydlarz, S. Nacach, T. Park, and A. Roginska, “The design of urban sound monitoring devices,” Audio Eng. Soc. …, 2014.

Short overview in parametric loudspeakers array technology and its implications in spatialization in electronic music

Jaime Reis
INET-md (FCSH-UNL), Festival DME, Portugal
jaimereis.pt@gmail.com

ABSTRACT

In late December of 1962, a Physics Professor from Brown University, Peter J. Westervelt, submitted a paper called Parametric Acoustic Array [1] that considered primary waves interacting within a given volume and calculated the scattered pressure field due to the non-linearities within a small portion of this common volume in the medium [2]. Since then, many outputs of this technology were developed and applied in contexts such as military, tomography, sonar technology, artistic installations and others. Such technology allows perfect sound directionality and therefore peculiar expressive techniques in electroacoustic music, allowing a very particular music dimension of space. For such reason, it’s here treated as an idiosyncrasy worth discussing on its own terms.

In 2010-2011 I composed the piece "A Anamnese das Constantes Ocultas", commissioned by Grupo de Música Contemporânea de Lisboa, that used a parametric loudspeakers array developed by engineer Joel Paulo. The same technology was used in the 2015 acousmatic piece “Jeux de l'Espace” for eight loudspeakers and one parametric loudspeaker array.

This paper is organized as follows. A theoretical framework of the parametric loudspeaker array is first introduced, followed by a brief description of the main theoretical aspects of such loudspeakers. Secondly, there is a description of practices that use such technology and their applications. The final section describes how I have used it in my music compositions.

1. Introduction

The fundamental theoretical principles of a parametric loudspeaker array (PLA) were discovered and explained by Westervelt [1]. Interestingly, this was in the same year of the publication of an article by Max Mathews where the author said there were “no theoretical limits to the performance of the computer as a source of musical sounds” [3], a text that then was mentioned by composers who changed the history of computer music, such as John Chowning, as very promising ideas [4], who certainly influenced this and other composers.

A relation between Westervelt’s discoveries and further developments in parametric loudspeakers array technology was described by Croft and Norris [2], including the technological developments by different scientists and in different countries and how it has moved from theory and experimentation to implementation and application.

It’s important to make clear that such terminology isn’t fixed and that it’s possible to find different definitions for similar projects (commercial, scientific or of other nature), uses, products and implementations of such theoretical background, sometimes even by the same authors and in the same articles. Some of them are “parametric loudspeakers” [2], [5], “parametric speakers” [6], [7], “parametric acoustic array” [1], [8], “parametric array” [5], [7], “parametric audio system” [9], “hypersonic sound” [10], “beam of sound” [1], “audible sound beams” [11], “superdirectional sound beams” [12], “super directional loudspeaker” [13], “focused audio” [14], “audio spotlight” [15], [16], “phased array sound system” [17], among others. The term PLA is being used here since it seems to reunite the main concepts that converge in this technology. Nevertheless, it isn’t meant to be presented as an improved terminology over others. This discussion solely has the purpose of showing that one who might not be familiar with such technology, and wishes to research more about it, will find different terms that originated due to particular historical contexts, manufacturers’ patents and arbitrary grounds.

2. Theoretical framework

A parametric loudspeaker is guided by a principle described by Westervelt as:
two plane waves of differing frequencies