
Ann. N.Y. Acad. Sci.

ISSN 0077-8923

ANNALS OF THE NEW YORK ACADEMY OF SCIENCES
Issue: The Year in Cognitive Neuroscience

Hierarchical processing in music, language, and action: Lashley revisited
W. Tecumseh Fitch1 and Mauricio D. Martins1,2
1 Department of Cognitive Biology, University of Vienna, Vienna, Austria
2 Language Research Laboratory, Lisbon Faculty of Medicine, Lisbon, Portugal

Address for correspondence: W. Tecumseh Fitch, Department of Cognitive Biology, University of Vienna, 14 Althanstrasse
A-1090, Vienna, Austria. tecumseh.fitch@univie.ac.at

Sixty years ago, Karl Lashley suggested that complex action sequences, from simple motor acts to language and music,
are a fundamental but neglected aspect of neural function. Lashley demonstrated the inadequacy of then-standard
models of associative chaining, positing a more flexible and generalized “syntax of action” necessary to encompass key
aspects of language and music. He suggested that hierarchy in language and music builds upon a more basic sequential
action system, and provided several concrete hypotheses about the nature of this system. Here, we review a diverse
set of modern data concerning musical, linguistic, and other action processing, finding them largely consistent with
an updated neuroanatomical version of Lashley’s hypotheses. In particular, the lateral premotor cortex, including
Broca’s area, plays important roles in hierarchical processing in language, music, and at least some action sequences.
Although the precise computational function of the lateral prefrontal regions in action syntax remains debated,
Lashley’s notion—that this cortical region implements a working-memory buffer or stack scannable by posterior
and subcortical brain regions—is consistent with considerable experimental data.

Keywords: hierarchy; language; music; computation; syntax

Introduction

In 1951, Karl Lashley1 suggested that the human capacity for serial ordering of action, whether of language or music or simply making coffee, presents central problems for our understanding of brain function. Lashley observed that then-prevailing models of action based on simple stimulus–response chaining had a fundamental flaw: although such serial models can capture the “one thing after another” characteristic of routine action, they fail to recognize the central importance of sustained goals and subgoals in more complex action. In complex actions, overall goals (e.g., making coffee) must persist over time while subgoals are initiated and completed (e.g., grinding beans, heating water, and adding cream) or interruptions occur (e.g., a phone call). Several aspects of such action sequences are inconsistent with a chaining model. Most obvious are errors of omission, where some step is accidentally left out (e.g., starting the coffee machine without first adding coffee). In chaining models, each completed action serves as the stimulus for the next one, and omission of any step would cause the sequence to grind to a halt. This and other phenomena, argued Lashley, demonstrate the need for hierarchical models of action planning, a requirement widely acknowledged today.2

Lashley further observed that core aspects of spoken language and music share this need for hierarchical structure, at multiple levels. For example, when pronouncing the words “tire” and “right,” the same basic vocal gestures appear but in different orders. How could a chaining system cope with such flexible rearrangement of basic actions into multiple larger composites? Phrasal syntax makes the point even more clearly. When reading the sentence, “The boy who patted the dog chased the girl,” an English speaker knows that the boy, and not the dog, chased the girl, despite the fact that this sentence contains the sequence “the dog chased the girl.” Linguistic syntax illustrates that serial order alone provides an inadequate basis for language processing, and that processes involving longer temporal

doi: 10.1111/nyas.12406
Ann. N.Y. Acad. Sci. 1316 (2014) 87–104 © 2014 The Authors. Annals of the New York Academy of Sciences
published by Wiley Periodicals Inc. on behalf of The New York Academy of Sciences.
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and
distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
Music, language, and action hierarchical processing Fitch & Martins

scales must persist during the processing or production of shorter actions or simple chains. Such observations were foundational in early generative linguistics, which recognized the central role of phrase structure, rather than word order, in syntax.3,4

Lashley also suggested that these similarities in language, music, and other complex actions are not coincidental, instead hypothesizing that the hierarchical nature of music and language was inherited from more basic features of motor planning and action.

In this review, we reassess Lashley’s observations and hypotheses from a modern perspective, concluding that his ideas were remarkably insightful and consistent with considerable new behavioral and neuroscientific data. We explore Lashley’s idea that hierarchical structuring of temporal sequences is a key capability underlying human music and language, but one that has deeper phylogenetic roots in the domain of action. Unlike Lashley, who eschewed localizationist arguments, we make full use of brain imaging and lesion data as supporting evidence, with a particular focus on the role of Broca’s area in processing hierarchical structures in the time domain.

Specifically, we will consider the following issues: (1) whether processing of sequential hierarchy is a well-defined specific capability, independent of other hierarchical processing (e.g., visuospatial) but applicable to both music and language; (2) to what extent the mechanisms subserving linguistic and musical syntax inherit properties from those typifying a syntax of action more generally; and (3) whether humans are unusual in the degree to which hierarchical abilities are developed, and if so, which neural mechanisms support this proclivity.

We evaluate these questions using behavioral, neuroanatomical, and brain-imaging data that go well beyond the data available in Lashley’s day. We conclude that these data converge upon the notion of a distinct neural substrate for hierarchical sequence processing, unusually well developed in humans relative to other primates, and reflected in specific expansions of prefrontal brain regions including Broca’s area, as well as a considerable remodeling of connections between prefrontal and posterior regions. Although the degree to which music and language inherit key properties from action syntax remains controversial, available data are consistent with overlap in these systems from both neural and computational viewpoints. More precise computational models and experimental designs will be required to further evaluate this hypothesis, which is quite different from currently popular hypotheses based upon mirror neurons and embodiment. Thus, 60 years on, Lashley’s insights remain valid and intriguing and can inform current neuroscientific debates, and thus deserve renewed attention.

Defining temporal hierarchy: hierarchical sequences and hierarchical sets

Terms like “syntax” and “hierarchy” are notoriously ambiguous, so our first task in evaluating Lashley’s hypotheses is to distinguish possible interpretations of the term hierarchy, which has many meanings in neuroscience. Especially in older literature, hierarchy connotes little more than the fact that cortical regions are above or below one another in processing terms, such that bottom–up processing proceeds from regions lower to regions higher in the visual hierarchy. A central characteristic of the visual system is that neurons in cortical regions closer to retinal input, such as V1, have smaller receptive fields than those in higher regions like V2. Our focus goes beyond this basic and widespread form of hierarchical layout of cortical regions themselves.

Here, we adopt a more specific and precise notion of hierarchy to denote a tree-like organization, where higher levels incorporate multiple lower levels in structural representations and/or processing. This notion is wholly independent of how the cortical regions that process such stimuli are arranged, entailing no specific commitments about neuroanatomical localization.

To be more exact, we utilize the following mathematically-based definitions5,6 throughout this paper: a set is an unordered collection of distinct, unique objects; while a sequence is a collection of objects, perhaps including duplicates, ordered by some rule.

Although the set {a, b, c} is identical to the set {b, c, a}, the sequences [a b c] and [b c a] are distinct and different. Furthermore, because sequences but not sets can contain duplicate items, sequences are not, strictly speaking, a type of set. These distinctions are crucial in distinguishing Lashley’s temporally ordered sequences, where order matters, from static hierarchy, in which order is typically irrelevant.
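
The set/sequence distinction defined above can be made concrete with a minimal Python sketch. It is an illustration only and not part of the original analysis: the helper names `same_as_set` and `same_as_sequence` are hypothetical, and Python's built-in `set` and `list` types are assumed to stand in for the mathematical notions of set and sequence.

```python
def same_as_set(xs, ys):
    """True when xs and ys contain the same distinct elements, ignoring order."""
    return set(xs) == set(ys)


def same_as_sequence(xs, ys):
    """True only when xs and ys hold the same elements in the same order."""
    return list(xs) == list(ys)


# The example from the text: {a, b, c} versus {b, c, a}.
abc = ["a", "b", "c"]
bca = ["b", "c", "a"]

print(same_as_set(abc, bca))       # identical when treated as sets
print(same_as_sequence(abc, bca))  # distinct when treated as sequences

# A sequence may repeat an item; converting it to a set discards the repeat.
aba = ["a", "b", "a"]
print(len(aba), len(set(aba)))
```

Treated as sets, the two collections compare equal; treated as sequences, they do not. The last line shows why a sequence with duplicates cannot simply be regarded as a kind of set: the repeated item is lost in the conversion.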


Hierarchy denotes a set or sequence of elements connected in the form of a rooted tree (a connected acyclic graph, in which one element is singled out as the root element). Hierarchies thus possess the following key properties: (1) all elements are combined into one structure (connectedness); (2) one element (the root) is superior to all others; and (3) no element is superior to itself (that is, there are no cycles, direct or indirect).

These definitions allow us to clarify a core distinction between temporally ordered and unordered hierarchies. Examples of hierarchical sets include visuospatial part–whole hierarchies, such as the schema of a face, in which certain components contain others without any necessary sequential ordering of components at each level. A face incorporates the set {eyes, nose, mouth}, but there is no need to consider one component to come first. Such hierarchies involve the minimal structure required to consider a representation hierarchical, namely that the superior/inferior relation between levels is specified. But at any given level, the elements form an unordered set.

In hierarchical sequences, in contrast, sequential order matters. The simple two-syllable utterances “callow” and “low-cal” contain the same set of syllables, but arranged in different orders, and they represent different concepts. Similarly, in music each arrangement of a set of notes constitutes a different melody. Finally, in action sequences, order is often important: if we grind the coffee beans at the end of the action sequence, after brewing is completed, poor coffee results. Thus temporal hierarchies incorporate an additional ordering component, where at least some elements at any given level represent a sequence rather than a set.

Lashley considered hierarchical structure to be an obvious component of language, music, and action. In each domain, some superordinate structure (his “determining tendency”) is required to lend coherence to subordinate components (speech sounds, notes, or simple actions). Although these higher-level elements may differ considerably between domains (e.g., meaning for language versus an overall goal for action), Lashley’s focus was what all three domains have in common: the need to impose the correct temporal ordering on the subelements. He cites, as a key example, language, where the elements of meaning are often cotemporal sets, but different languages have different ordering rules by which the same meaning must be expressed in a sequence: different “schemas of order.” Typing or speech errors (e.g., spoonerisms) or children’s games like pig Latin provide evidence that the temporal ordering component is both independent of semantics, and a general characteristic of many “generalized schemata of action” that apply across the three domains. Thus, it is not the general problem of hierarchy but the specific issue of hierarchical sequencing that Lashley considered the essential problem of serial order, and that will be our focus in this review.

Lashley observed that we can often convert between spatial and temporal hierarchies. For instance, to reverse a piano melody, he stated, “I can only do it by visualizing the music spatially and then reading it backward” (p. 129). Similarly, when translating between languages with different word orders, such as English and German, a fluent bilingual can extract the meaning from one language, and convert it to a properly sequenced sentence in the other, freely reordering the individual words as needed. It is therefore possible that the distinction between sequential and nonordered hierarchy is of little importance at a fundamental neural or computational level. This is the first issue that we address using modern imaging data.

Sequential hierarchical processing is not colocalized with visual hierarchy systems

As noted above, there is often an increase in spatial receptive field size in cortical regions more distant from sensory input; this involves a widespread neural form of spatial hierarchy, where higher regions integrate information from multiple lower regions.7 In this functional sense, hierarchical processing is a widespread property of brain systems.8–16

In some cases, such patterns of neuroanatomical layout may reflect a functional form of cognitive hierarchy. For instance, in the motor domain, posterior frontal lobe regions encode basic movements (extending the arm), whereas anterior regions fire more selectively to particular contexts in which these movements are performed (e.g., making coffee and shaking hands).10 In the auditory cortex, core areas respond to simple acoustic characteristics, whereas anterior regions along the supratemporal plane display selective firing to more complex sound categories.12 In the visual ventral stream, anterior regions in the temporal cortex support
categorical decisions of greater abstraction (e.g., animal versus nonanimal) than posterior regions (e.g., blue versus green).14 Given these pervasive patterns of brain organization, is there evidence for more abstract and modality-independent hierarchical structuring in the brain?

Starting with hierarchical sets, many recent studies explore the brain regions involved in representing and processing hierarchical structures in both spatial and social domains.17,18 Representation of hierarchical structures in the social domain, which allows the evaluation of dominance relationships, appears to be encoded in the hippocampus.18 However, this structure is also active in encoding hierarchical ranks in nonsocial domains.18 In the visuospatial domain, for example, correct integration of landmarks within context frames recruits the parahippocampus,8 and also requires the integrity of the medial temporal lobe (MTL) and the retrosplenial cortex.17

Interestingly, this same MTL system may also encode high-order hierarchical associations in the motor and linguistic domains.19–22 It has been proposed that these medial structures are important for binding items within contexts,23,24 even for offline imagined scenes,25 as well as for episodic memory, which involves binding different perceptual and memory features into unified representations.26,27

In addition to these brain regions, a quite different set of activations is typically observed in temporal hierarchical processing, regardless of whether these involve musical, linguistic, or other tasks. A consistent finding within the neurolinguistic literature is that sequentially structured hierarchical processing activates the posterior prefrontal cortex, centering on Broca’s area (a term we will use hereafter to denote both Brodmann’s areas [BA] 44 and 45, typically in the left hemisphere, but not neighboring areas28). Increased Broca’s activation typically correlates with increased demands on working memory, for example, when processing long-distance dependencies, where early information must remain active for proper interpretation of later elements. Ongoing controversy concerns the degree to which this active maintenance of past information is specific to hierarchical sequences per se22,29,30 or is found for nonhierarchical sequences as well,31,32 but it is clear that increasingly complex hierarchical structures lead to greater Broca’s activation across multiple languages and laboratories. Furthermore, in musical syntax, hierarchically dependent processing can stretch over long time spans (long-distance dependency).33 Neuroimaging studies on musical syntax (as probed with sequential harmonic or melodic structures) also consistently reveal activation in BA 44/45, typically in both the right and left hemispheres, but sometimes biased to the right side.34–39 Evidence that these prefrontal activations are not restricted to meaningful language or music comes from artificial grammar research using nonsense syllables (spoken) or words (written) arranged according to particular rules. Again, irrespective of the input domain, both auditory and visual patterns, and even nonlinguistic visual symbols,40 elicit Broca’s activation when they are sequentially presented. On the basis of these data, further explored below, we conclude that, despite MTL overlaps, the neural mechanisms for processing hierarchical sequences are not fully colocalized with those that process hierarchical sets. Despite formal similarities in these abstract hierarchical structures, these data argue against their identity. We now explore the functions of the prefrontal regions preferentially involved in hierarchical sequence processing.

Prefrontal cortex and Broca’s area play central roles in processing hierarchical syntax

We first attempt to characterize the capacity for processing sequential hierarchies more precisely, to better understand its neural basis. Although Lashley was a confirmed agnostic regarding neural localization, there is a long tradition of associating the frontal lobes, specifically the prefrontal cortex (the portion of the frontal lobe anterior to the primary motor strip and the premotor cortex), with the organization of complex plans in general, and hierarchical sequences in particular, in multiple cognitive domains. The notion that the prefrontal cortex is specifically involved in action planning and executive control has been advanced notably by Luria,41 Shallice,42 and Baddeley,43 among many others. Clinical data leave little doubt that pathology in the frontal lobe can lead to disorganized action.44 We do not review this extensive literature here, but simply take it for granted that the frontal lobes play important roles in planning, working memory, and executive control. Our goal will be to specify this contribution more precisely, paying particular
attention to the role of the inferior frontal gyrus (IFG) and Broca’s area.

We note first that virtually any motor behavior can be conceptualized as hierarchical. A monkey eating peanuts involves a set of actions (grasp, ingest, chew) each of which can be further broken down (grasp: extend arm, open fingers, close fingers, retract arm). Thus, any action sequence that involves repeated subparts displays a simple form of hierarchy, pervasive to all motor action in all species.45 Such hierarchies have a fixed and limited depth, and were not Lashley’s focus (because they can be accounted for by chaining). To clarify what he was getting at, we start with the clear case of language, and use this to guide our further inquiry through music, back to action.

Broca’s area (BA 44 and 45 or, roughly, the pars opercularis and the pars triangularis of the IFG) is one of the most intensively studied brain regions, and its importance for speech production requires little emphasis.46–48 Since the 1970s, this region has also been recognized in clinical studies to play a special role in syntax, both production and perception.49 Thus, a broad consensus exists that this region of the IFG plays an important role in language processing.30,50–52

An elegant study by Pallier et al.53 explored the specificity of IFG involvement in hierarchical syntax by crossing two manipulations. In the first, they parametrically varied the chunk size of 12-word sentences, spanning from scrambled words (where there are no chunks, giving a constituent size of one) to full sentences (constituent size = 12). Intermediate strings had constituent sizes of two, three, four, and six words (an example with four-word chunks would be “mayor of the city/he hates this color/they read their names”). The prediction was that activation in a region specifically encoding constituent structure would increase with greater constituent size. Such a correlation was indeed found, including both IFG structures and temporal lobe areas, especially along the superior temporal sulcus.

To get at syntax specificity, a second manipulation involved substituting all content words with nonsense words to create “jabberwocky” constituents of varying sizes.53 For example, the four-word jabberwocky could read “tuyor of the roty/he futes this dator/they gead their wames.” This parametric manipulation removes the effect of semantic integration, but again elicited specific correlated activation in the IFG (centered on BA 45). The clever design controls for both semantics and word transition probability (because the nonsense words, by definition, have zero probability), and confirmed a specific role for Broca’s area in processing hierarchical constituent structure in language.

Broca’s region is involved in processing musical harmonic syntax

Regarding specificity of function, however, there has been considerable debate about the degree to which the function of Broca’s area is specific to syntax, or whether it performs one or more computational functions that are shared by multiple aspects of language, including phonology or semantics,50 or even more broad cognition including cognitive control, action planning, and music.35,54–56

The best characterized, and to us most convincing, nonlinguistic function of the IFG is in music processing and memory.57–60 There is a long tradition, based on dissociations in lesion patients, of considering music and language to be instantiated in distinct brain regions.61 Initial evidence for a shared role for the IFG in musical processing, particularly processing harmonic syntax, came from event-related potential (ERP) and magnetoencephalography (MEG) studies.36,37,62 Furthermore, there are multiple studies indicating overlap of both cognitive and neural resources used in music-syntactic and language-syntactic processing based on interactions between the two domains.63–65 These studies showed consistent similarities in early, anteriorly located reactions to anomalous items in linguistic and musical stimuli, and the MEG study by Maess et al.37 concluded simply that “musical syntax is processed in Broca’s area.”

These results, from normal subjects, stand in contrast to the long clinical tradition identifying patients with damage to music and not language, or vice versa.61 Recent commentators57,58 have reconciled this apparent discrepancy by noting that although certain abstract operations may be shared by music and language, the units over which these functions operate are clearly quite different (phonemes and words for language, notes and chords for music). Furthermore, there are fundamental differences among different components of music. For example, a key component of musical rhythm is isochronicity, the tendency of key elements to be equally spaced in time.66 This is not an aspect of
rhythm shared by normal speech or action, although other aspects like metrical structure may represent hierarchical sequencing.67 Given these basic differences, we do not expect complete overlap of musical and linguistic processing, either neurally or behaviorally. Rather, specific parallels in the manipulations applied to musical and linguistic stimuli (e.g., violations of hierarchical structural expectations33) are needed to elicit parallel neural effects.

Despite the broad consensus concerning a role for the IFG in these aspects of music processing, both ERP and MEG studies have difficulties with localization, and these early findings have not gone unchallenged by more recent functional magnetic resonance imaging (fMRI) studies.68,69 A particular issue concerns group analysis of different subjects. It is possible that each individual in a study could activate distinct but neighboring regions for linguistic versus musical syntax. Because of individual variability in precise localization, averaging all these individually distinct activations would give a false impression of thorough overlap in the two domains.

A recent study probed this issue by first identifying language regions in individual subjects, and then examining activations elicited in those regions by various other tasks involving working memory, cognitive control, and music.70 This study used a sentence-reading task, contrasted with reading nonsense words, to delineate the language regions of interest (ROIs). Although the study found considerable specificity to language in IFG activations, a musical task also preferentially activated their language-based ROIs, in particular overlapping with activations found in the left posterior IFG for both the sentence-reading task and a verbal working-memory task. Although the authors concluded from their results that specific language areas exist, not activated by most other tasks, this is an overly general conclusion applicable only to their specific musical and linguistic tasks. It is also important to recognize that the Patel or Koelsch shared-resource models57,58 do not predict exact overlap for these two different tasks (reading visually presented sentences versus listening to melodies) given the different input modalities. Thus, the significant language-area activations that were seen for the musical task support Lashley’s contention that these two domains share important processing subsystems.

A second source of apparently contradictory evidence involves rhythm perception. A series of recent imaging studies concerning rhythm perception paint a consistent picture:71–73 rhythmic stimuli tend to activate motor areas (such as the supplementary motor area, other dorsal prefrontal regions, or even the basal ganglia) involved in sequencing,74,75 and do not activate the IFG. But this work focuses on isochronous rhythms, and likely reflects the tendency of such rhythms to induce a desire to move to the regular beat, the so-called “groove” of the music.76,77 Precisely because this aspect of music is not shared with language, we do not consider such data to contradict Lashley’s hypothesis or the well-established notion that the IFG plays a role in processing other components of musical syntax.

In summary, a considerable body of neuroimaging data, from many laboratories, is consistent with the hypothesis that the prefrontal cortex, and the IFG in particular, play important parallel roles in processing specific shared aspects of language and music. These data are consistent with both Lashley’s early ideas and Patel and Koelsch’s more specific hypotheses about the nature of the shared processing resources.

The role of frontocortical mirror neurons in understanding action

The third component of Lashley’s triad is goal-directed action. There has been an explosion of interest recently in the role of the frontal lobes, and the IFG, in action understanding. This work typically makes no reference to Lashley’s ideas, but was instead spurred by the discovery of mirror neurons in the macaque. Mirror neurons are neurons in motor and premotor cortex that fire, not only when a monkey executes an action itself, but also when it sees another agent perform the same action. The dual nature of the metaphoric mirror (in both production and perception) has led to revived interest in the mostly discredited motor theory of speech perception, which posits that speech perception is mediated by virtue of a resonance with the motor programs that generate the sounds (vocal gestures).78 By this hypothesis, the objects of speech perception are articulatory motor events, rather than acoustic or auditory events (for a critique, see Ref. 79).

Recent speculation concerning mirror neurons has extended the reach of the motor theory beyond its original domain of speech perception to language more generally. The region of macaque cortex where mirror neurons are found (F5) neighbors or
overlaps the regions constituting the human Broca’s area (BA 44/45). Many studies show activation of motor and premotor regions during language production and comprehension tasks,80,81 thought to support the notion that mirror neurons in this region play a central role in the parity of production and perception that typifies all aspects of language.82–84 This idea has been used to support gestural hypotheses for language evolution, whereby initial stages of linguistic syntax and semantics evolved in the context of a gestural protolanguage, rather than a vocal system.83,85–87

A detailed critique of mirror neurons or their possible importance in language evolution is not our purpose here (but see Refs. 88–92). However, it is crucial to distinguish Lashley’s hypothesis about action syntax from those linked to mirror neurons or embodiment. Lashley’s model concerns a general and abstract notion of sequential hierarchy, and is only tied to motor actions insofar as these demand planning and control of a particular type. Lashley’s model has nothing to do with grasping or recognizing graspable objects, and indeed involves structuring entities for which no such low-level action schemata are available. When applied to the role of the IFG, this leads to different empirical predictions from those of mirror-system hypotheses.

Action syntax: beyond mirror neurons

Because localization is central to mirror-based arguments, precision regarding specific regions of the frontal cortex is key to the following discussion. The cytoarchitecture and connectivity of neighboring frontal regions change drastically over small distances. These properties range from agranular cortex in motor and premotor regions (BA 4 and 6), to dysgranular BA 44, to granular BA 45 and anterior frontal cortex. The motor cortex generates strong descending motor pathways (e.g., the corticospinal tract), but BA 44/45 are both truly prefrontal, and clearly functionally distinct from motor regions. With broad or imprecise spatial localization, or multisubject averaging of activations, it is easy to confuse (or conflate) activations in one of these regions with those in the others.

Starting with motor and premotor activations, brain-imaging studies reveal clear and consistent activations for certain classes of words or images connoting either direct actions (“throw” or “kick”) or tools providing affordances for action (“hammers,” “pliers”). Such motor-charged stimuli elicit (pre)motor activation; other more visually charged stimuli (e.g., animals) do not.80,91 Recent studies35 showing motor activation for particular components of language provide no special evidence for motor (or mirror neuron) involvement with language more broadly, because many linguistically structured concepts (color, clouds, beauty) have no motor component. Second, those few mirror neurons that have been directly documented in the human brain (during single-unit recording in 21 preoperative epilepsy patients) were found in the medial frontal (supplementary motor area) and the temporal (parahippocampal) cortex;93 it is therefore possible that mirror neurons are much more widespread in human brains than in macaques. Simple observation of IFG activation in the human brain therefore does not specifically implicate mirror neurons, and more direct and precise tests are needed to distinguish between general frontal involvement and mirror neuron involvement.

A key testable prediction of mirror neuron-based hypotheses follows from the fact that, by definition, mirror neurons only exist for motor actions the subject can actually perform itself. To the extent that mirror neurons are required to understand or process some observed action, this predicts that unfamiliar or unproduceable actions should not excite the mirror system, and should be uninterpretable (or at least difficult to process). This is rather difficult to test in the domain of language, because all normal humans are both producers and perceivers of language. Nonetheless, in brain-damaged patients with lesions in proposed sites of mirror neurons (the IFG and parietal regions), speech perception is not disturbed as predicted.94 This study found impaired speech perception only when lesions involved auditory regions in the temporal lobe, a result inconsistent with motor theories of speech perception and with the claimed need for mirror neurons to support accurate speech perception.

Additional data do not support the prediction that motor familiarity is required to elicit appropriate processing of speech stimuli. An infant optical imaging study95 found frontal activation to speech-based patterns in newborns, despite their lack of experience in producing speech motor patterns themselves. Similarly, an artificial grammar-learning study by Bahlmann et al. employed abstract visual patterns, sequentially presented, which
had no motor (or verbal) component whatsoever. Items were arranged into either context-free patterns favoring hierarchy or finite-state patterns favoring pure sequence. The authors found activation of BA 44 only for the first case, supporting a role for this region in sequential hierarchy in the absence of motor-action understanding.40

Perhaps the most direct test of the mirroring prediction is possible with music, where many humans are devoted music listeners but lack the motor capability to play this music themselves. Mirror neuron enthusiasts often cite studies indicating motor involvement and Broca’s activation during music perception by expert musicians on their own instruments.35 But the crucial test is actually with nonmusicians, where the data are clear: Broca’s activation is found during music perception, and musical syntax tasks in particular, even in nonmusicians, whose understanding of instrumental music presumably has little or no motor component.37,96 Furthermore, although many findings are consistent with the general notion that music perception relies partially on motor and action-related circuitry,60,72,75 most of the motor areas implicated are neither part of the putative mirror neuron system, nor specifically related to music production (e.g., singing or playing).

The dangers of rough localization and characterization are illustrated in an intriguing paper by Fazio et al.,54 which examined the abilities of patients with frontal aphasia to correctly understand event sequences. Previous work81 showed that prefrontally damaged patients have a particular deficit in interpreting action. The Fazio study expanded this by first showing participants short video sequences of human actions or physical events and then presenting still images extracted from the videos on a touch screen. The patients’ task was to properly sequence these images by touching them (requiring no overt language). The six patients in this study showed a general deficit relative to healthy controls, but were more impaired for sequencing human actions (e.g., a man grasping a bottle) than physical events (e.g., a bicycle falling down). Superimposing the MRI-determined lesion locations of patients upon one another revealed that the only region consistently damaged in all patients was centered on BA 44. However, all but one of these patients also had much more extensive damage, often stretching back into the parietal lobe. Thus, although intriguing, these results do not support the strong inferences some draw about a specific role of Broca’s area in this task,35 which involves action sequencing but not hierarchical sequencing per se.

Broca’s area and hierarchical action planning

Much more convincing evidence for a specific role for Broca’s area in hierarchical action, beyond music and language, comes from an elegant fMRI study by Koechlin and Jubault,97 which introduced a task designed to discriminate simple sequencing from hierarchical sequencing. This work built upon the computationally explicit model of hierarchical action planning of Dehaene and Changeux,98 which recognizes that at least three levels of nesting are needed to clearly discriminate simple gestures, motor sequences, and hierarchical plans. This model is based on the “Tower of London” task introduced by Shallice,42 which involves moving colored beads on three posts from some starting state to an end state designated with a target image. This task can be smoothly varied in complexity, from an easy variant where the beads are simply moved directly to the goal, to challenging end states that require five or more moves, some of which require moving beads away from their final goal state.99 Shallice found that patients with left prefrontal damage have difficulties specifically with these more complex problems,42 and brain-imaging results are broadly consistent with prefrontal involvement in this task.100 Dehaene and Changeux’s model specifies three levels of action: simple motor gestures (e.g., grasping a bead), operations (e.g., moving a bead from one post to another), and plans (nested sequences of operations to reach some specified goal state). Only the planning level provides clear evidence of hierarchical (nonchained) sequencing, particularly when it involves intermediate operations leading away from the goal. When the computational model was “lesioned” by removing planning units, these more difficult problems were specifically impaired, just as in Shallice’s prefrontal patients.

Koechlin and Jubault97 designed an analogous task. Repeated button presses (motor condition) or simple, overlearned left–right sequences were produced in the scanner, under different stimulus conditions. In the simple chunk condition, overlearned sequences alternated with the motor control sequencing task. But in the superordinate condition,
chunk sequences were arranged into higher level orders. All sequences were highly overlearned and automatized, so no learning occurred during scanning. Although simple motor acts elicited motor and premotor activity, higher order actions elicited more anterior activations within Broca’s area, with transitions between chunks favoring BA 44, and termination of superordinate chunks activating BA 45. These data provide the clearest evidence to date of a specific role for Broca’s area in planning hierarchically structured action.

We conclude that neuroimaging data support Lashley’s hypothesis that music, language, and some types of action planning share a common substrate, and that Broca’s area plays a key role in this distributed system. Although none of these data exclude a role for mirror neurons in action understanding in music or language, they argue against a necessity for motor knowledge to either understand sensory input or to elicit specific activity in the IFG. In contrast, these results remain consistent with Lashley’s more abstract proposals, on the basis of computational similarities underlying planning of complex action in all three domains. The burning question thus remains concerning what, precisely, this supramodal system is doing at a computational level.

Broca’s area as Lashley’s “scannable buffer”

Lashley emphasized that the core requirement of sequential hierarchy is that higher level processes and goals can be placed on hold during the execution of lower level processes. To accomplish such persistence, he suggested, certain neural circuits must act as a buffer scannable by other circuits. In computational terms, sequential hierarchies require the equivalent of a register (holding one element), a queue (multiple freely accessible elements) or a stack (holding multiple elements, but with restricted last-in-first-out access), and require that these forms of persistent memory be available to other ongoing computational processes. Psychologically, these are all different forms of working memory that provide temporary storage of intermediate results (analogous to a scratchpad or blackboard). Although a reflex chain of actions, where each action triggers the next, does not require such persistent intermediate storage, both processing and production of hierarchical sequences do. Furthermore, the more complex and multileveled the hierarchy, the greater the capacity of the intermediate storage mechanism must be.

From a computational viewpoint, there is no doubt that effective, succinct computation requires such additional persistent storage. The most common means of achieving it in computer science is via a stack. For example, during the execution of complex computer programs, each call to a subroutine places a pointer to the current routine (and often intermediate results) onto the stack before branching to execute the subroutine. Once the subroutine completes its computation, typically returning some required result, execution of the original calling function can resume where it left off, by popping its pointer back off the stack. This can be embedded multiple times (with each level of nesting requiring additional stack space), and clearly the storage capacity of the stack limits the possible extent of embedding.

Both this storage capacity and restrictions on access influence the types of computations that are possible. Theoretical computer scientists have mapped out, in considerable detail, the computational limitations of registers versus stacks versus queues in abstract models of computation called automata. For instance, a finite-state automaton (FSA) is a system where discrete computational states are linked with no persistent memory. When such a system is augmented with additional working memory structures, new automata classes result. A pushdown automaton (PDA), for example, is an FSA with an additional stack provided, whereas a Turing machine is an FSA with an endless queue (unrestricted order of access). Automata are models of processing, but important theorems link each of these types to specific classes of string sets (“languages”) that can be generated or recognized by the corresponding type of automaton. Unsurprisingly, the less restricted the additional memory store of the automaton, the broader and less restricted the type of string sets it can handle. These automaton/language pairs can thus be arranged into a classification system, termed the formal language hierarchy or extended Chomsky hierarchy,101 a fundamental theoretical concept in computer science.102–104

One appealing version of such stack-based computing would involve domain-specific templates and filler information, retrieved and maintained by a multidomain processor. For example, the sentence “John likes apples” would involve retrieving
the template “X likes Y,” along with the specific elements “John” and “apples” (which might just as well be “Mary” and “chocolate”). The role of the scannable buffer here is simply to maintain the current template-element bindings during the processing of further perceptual information or motor planning. Such a mechanism could be used to deal with many types of templates, independent of format (e.g., verbal, melodic, and motor), and thus would be available for multiple domains.

Although the distinctions between registers, stacks, and queues are a major focus in computer science, from a psychological or neural point of view the relevance of such abstractions is less obvious. Lashley’s critical point was that hierarchical sequence processing requires some form of scannable intermediate storage, and limitations on this capacity obviously restrict the capacity of a system to handle hierarchical structures. Formally, the crucial distinction is between FSA-only systems (which can generate the so-called regular languages and have only a very limited and inflexible capacity to process hierarchy) and the augmented automata above these (e.g., PDAs or Turing machines), which correspond to the supraregular languages.

A foundational result in computational linguistics is that natural languages require processing resources over and above those of an FSA, and are thus supraregular.105,106 Considerable debate has surrounded the question of how far beyond regular the natural languages go; current consensus is that languages go a bit beyond the capacities of a PDA and belong to the class termed mildly context-sensitive grammars,107,108 requiring multiple stacks, or even a stack of stacks, to be computed. With music, the situation is less clear, but some theorists suggest that music requires supraregular computational resources including at least one stack,109 corresponding to the so-called context-free grammars.

Marrying these computational considerations with our prior conclusions on the basis of neural localization leads to the following hypothesis. Cortical resources in the IFG, including at least BA 44 and BA 45, implement a storage buffer scannable by other cortical and subcortical circuits subserving sequential behavior. This buffer is required to implement supraregular hierarchical sequence processing, and its processing load increases with the depth and complexity of the hierarchy being processed.

This hypothesis, which has been entertained in related forms by multiple researchers,52,97,110 extends Lashley’s speculations by positing a discrete neural locus implementing his scannable buffer and by specifying its supraregular nature. Direct evidence that processing load increases with depth is provided by parametric studies,53,111,112 and the localization to the IFG is consistent with a wealth of neuroimaging data. This hypothesis does not require that the IFG is the only site of a scannable neural buffer (similar buffers may be implemented, for example, in hippocampus or basal ganglia circuitry). Nor does this hypothesis suggest that the stack or buffer represents all of the relevant data or computations: the idea is that it can activate and link representations generated in other brain regions (akin in computer terms to storing a pointer, rather than all of the data). Nonetheless, the involvement of Broca’s region in hierarchical sequencing in music and language, and in some complex action planning, seems abundantly clear. Koelsch has termed this idea the syntactic equivalence hypothesis.113

This hypothesis does not follow directly from the fact that cortical systems for both perception and action are laid out hierarchically. For if each level of a processing hierarchy mapped directly onto a neural hierarchy, implemented on the cortical surface, there would be strict limitations on the possible depth of hierarchical structures. For example, the data on hierarchical action planning above suggested a direct mapping of first-level motor gestures to posterior (motor and premotor) cortex and of third-level superordinate planning to anterior (BA 45) regions. But prefrontal cortex cannot be extended forever, and a literal mapping would suggest a strict limit on the level of embedding possible in language, music, or action. Such a limit is not evident in action (even making coffee can be construed as involving four levels, and if interrupted by a phone call, five), nor in music or language. The well-known restriction of embedding to three levels in language, for example, applies only to a particular syntactic structure (center embedding), not to syntax or language as a whole.114,115

Having thus extended Lashley’s hypothesis to a specific neural circuit, we turn to Lashley’s final hypothesis: that the capacity to process musical and linguistic hierarchical sequences is unusually developed in humans, but builds upon and extends preexisting action sequencing mechanisms present in
animals. Addressing this issue demands an examination of comparative data.

The phylogenetic roots of hierarchical sequencing: comparative data

Many comparative studies have addressed animals’ ability to produce and perceive hierarchical structure,116 mostly in the auditory domain but also exploring visual capabilities.45,117 Much of the recent animal work has adopted the computational principles outlined above, contrasting finite-state grammars with context-free grammars to see which (if any) animal species can go beyond the regular grammar level.

This comparative work built upon human experimental psychology research begun by George Miller in the late 1950s, in one of the first computerized testing laboratories at Harvard.118 Rule-learning problems were generated and tested at different computational levels. The results led Miller to put forth the following hypothesis, which we call the supraregular hypothesis: adult humans have a proclivity to induce rule systems at context-free (or higher) levels, even when the data do not require such systems. That is, even stimulus sets that could be captured by regular grammars tend to be captured using supraregular grammars.

Miller’s hypothesis suggests not only that humans are able to go beyond regular grammars, but also that human observers are biased to view sequences as hierarchically structured, even when the generating algorithm is a serial, finite-state system. Because all stimulus sets generated by finite-state grammars can also be parsed by a context-free grammar, this may be a cognitive example of the aphorism “to a man with a hammer everything looks like a nail.” We might think of this bias as a form of dendrophilia (an inordinate fondness for tree structures) in our species.119

Both hierarchy and supraregularity are clear and well defined mathematically. Unfortunately, it is not trivial to adjudicate between finite-state and context-free requirements on the basis of string processing alone.120 Structure is an invisible (mental) construct, and can only be empirically probed indirectly (e.g., by observing acceptance and rejection of well-chosen string sets,121 by adding prosodic cues at or between putative phrase boundaries,122 or by observing reactions to clicks played at or between putative boundaries123). Such indirectness renders empirical testing of these notions challenging, but by no means impossible.

Returning to animal research, existing studies paint a relatively consistent picture. Many species, from rats to songbirds to baboons, have well-developed sequencing capabilities that can be characterized as finite-state systems. Such systems, for example, appear adequate to capture the syntactic structure seen in birdsong116 or the recognition of simple long-distance dependency in squirrel monkeys.124 In contrast, the capacity of animals to induce supraregular grammars remains contentious. Despite numerous examples of successful induction of regular grammars, only a few species have been claimed to go beyond this,116 and both the data and the methods in these studies have been sharply questioned.125–127 Perhaps the most convincing attempt to show supraregular abilities in a nonhuman species came from Gentner et al. in their operant conditioning work128 with starlings (Sturnus vulgaris), songbirds with complex song and extended vocal learning in both females and males.129 Gentner et al. used a supraregular grammar denoted AⁿBⁿ, which means that the number of A units must be precisely matched by the number of B units (in this case the A and B elements were two types of starling song motifs, termed warbles and rattles, presented acoustically). This grammar is provably supraregular, and allows multiple possible structural interpretations ranging from count-and-compare, to center-embedded, to cross-serial structure.121 After very extensive training (between 9400 and 56,200 trials per bird, with a mean of 30,000 trials), three of four starlings tested with various probe strings showed evidence of having acquired the supraregular language AⁿBⁿ. This suggested that, although quite difficult, it was at least possible for some birds to acquire a supraregular rule. However, a follow-up study in zebra finches125 showed that such levels of performance could be explained by simpler rules combined with averaging of performance across different birds.

On the basis of these and other results (reviewed in Ref. 116), we can tentatively formulate a comparative extension of Miller’s human-oriented supraregular hypothesis: the supraregular exceptionality hypothesis, which posits that, in contrast to humans, nonhuman animals display a particular difficulty inducing hierarchical patterns from strings of data, especially any structures requiring
computational resources above the finite-state level.

Evidence that an organism fails on supraregular grammars (e.g., AⁿBⁿ, mirror grammars, or copy grammars101) while succeeding on regular grammars (e.g., (AB)ⁿ, AB*A, or sequential transition-probability grammars130,131) constitutes support for this hypothesis. Unlike Miller’s human-specific supraregular hypothesis, which has abundant support from human studies,40,118,132–134 our supraregular exceptionality hypothesis, which concerns nonhuman animals, remains tentative for several reasons. First, nearly every comparative study has used a different species, and only a handful of nonhuman species have been tested to date. This renders broad generalizations about animals impossible. Even statements about one species remain tentative: multiple empirical issues need to be considered in pattern-learning experiments that make it difficult for any single study to control for all interpretations. Finally, very few grammars have been tested (indeed, all animal-based claims for supraregularity are based on a single grammar, AⁿBⁿ), which renders generalizations to broad formal classes impossible. Although humans succeed on multiple supraregular grammars (including copy and mirror grammars), to our knowledge no one has tested such grammars in nonhumans. Firm conclusions will require more research, on different species and different grammars, and replications across laboratories.

Nonetheless, these comparative data provide initial support for Lashley’s notion that a hierarchical syntax of action is particularly well developed in our species. It seems too early to draw conclusions concerning the degree to which this proclivity builds upon preexisting supraregular abilities in animals.135–137 Certainly, we have no evidence of action syntax in animals even approaching the complexity of making a pot of coffee: the closest would be certain instances of tool use in some chimpanzee populations.138 Before returning to the possible neural basis for expanded hierarchical abilities in humans, we need to briefly forestall misunderstanding by noting an area of pervasive confusion in the recent comparative literature.

Recursion and supraregularity: an unfortunate confusion

For unclear reasons much of the comparative literature just discussed conflates several distinct issues, mixing up multiple computational abilities as if they were one. Our concern in this review is the processing of hierarchical sequences, defined as sequences with a tree-formed structure in which element order matters. Such structures are best processed by supraregular systems. Unfortunately, much of the last decade’s research based on formal language theory has conflated hierarchy and supraregularity, in these precise senses, with various notions of recursion. Recursion refers to that broad class of computational processes in which a function calls itself (in computer science), or to those structures where the same hierarchical structure is repeated at multiple levels of the hierarchy (“self-embedding”). An example of a recursive structure is a fractal, where the same structure repeats itself ad infinitum at all hierarchical levels.139

Crucially, although all recursive trees are hierarchical, not all hierarchies are recursive. For many types of hierarchy discussed in this review, such as motor actions embedded within plans, it is unclear what self-embedding would even mean. Although a context-free grammar can be implemented recursively, for finite sets it need not be, and regular grammars can also be implemented recursively. The well-defined supraregular divide, between finite-state automata and those supraregular automata that have some additional scannable memory, is thus orthogonal to the issue of recursion. This issue has now been repeatedly discussed in the literature,121,140–144 so we will not belabor it here. We can only hope that the unfortunate current tendency to substitute the ambiguous high-profile buzzword recursive for distinct (and more empirically tractable) concepts like supraregular or hierarchical will eventually run its course.

Evolutionary expansion of Broca’s region in humans

We end with a brief review of recent comparative anatomical data pointing to a general expansion of prefrontal connectivity in our species, along with a more specific and pronounced increase in the size of Broca’s area and changes in its pattern of connections.

The human brain is greatly expanded in size relative to other primates, including our nearest relatives the great apes (chimpanzees, gorillas, and orangutans).145 However, ours are not the largest brains in the animal kingdom: the brains of
elephants and toothed whales like dolphins or orcas are larger.146 Nor do human brains occupy the greatest relative proportion of total body weight: that distinction goes to small mammals like shrews.147,148 It is only when brain size is considered relative to what would be predicted, given body size (the so-called encephalization quotient), that human brains reign supreme.149 In these terms, although great apes have relatively large brains for mammals, humans are still quite exceptional, with a roughly threefold increase in total brain size relative to chimpanzees (or early fossil hominids like Australopithecus150).

Recent comparative neuroanatomical work allows us to go beyond these long-known facts to inquire whether human brain expansion affected specific brain regions and circuits. Although human frontal regions, like the cortex overall, are much larger than in other primates, whether this increase is disproportionate to overall size increase remains contentious. One of the first studies to use MRI in living primates to address this issue151 compared the volume of the frontal lobe in four great ape species with that in humans, along with lesser ape (gibbon) and monkey (Macaca and Cebus) brains. They concluded that great apes in general have an expanded frontal cortex relative to gibbons or monkeys, but that frontal lobe volume relative to total brain volume did not differ disproportionately between great apes and humans. This is consistent with many findings that apes are cognitively superior to monkeys.152 However, this study looked at the entire frontal lobe, as demarcated by the central sulcus (thus including both motor and anterior areas).

To examine the relative size of the prefrontal cortex specifically, Schoenemann et al. used MRIs of 11 primate species to examine the brain volume anterior to the corpus callosum.153 A separate analysis of gray and white matter found a striking dissociation between these two components: human relative gray volumes differed significantly from only a few species, whereas white volumes were greater than all but two species. A subsequent analysis of gray:white size ratios indicates that any human expansion of the prefrontal cortex relative to total brain volume is almost entirely the result of an increase in white matter. This suggests that, in addition to their raw increase in absolute size, human prefrontal regions have become disproportionately connected to the rest of cortex. Whether human frontal lobes (gray or white) are significantly and disproportionately larger than predicted depends on the comparison group (apes or monkeys). But differences from nonape primates are established, whereas differences from apes remain controversial.154–157 Furthermore, absolute frontal volume increase is undisputed, and relative white matter increases seem relatively clear.

To what extent do these increases in size and connectivity apply equally to all prefrontal regions? This question is challenging to answer, because it requires that multiple brains are analyzed using detailed cytoarchitectonic methods. In apes, such analysis has to date been performed only for prefrontal areas BA 10 (frontal pole), 44, and 45 (along with V1 and BA 13, part of the insula). Comparison of the brains of 12 chimpanzees158 with previous work in humans159 revealed a striking finding: areas 44 and 45 are the most greatly expanded cortical areas yet identified in humans. Left areas 44 and 45 are six times larger in humans than in chimpanzees,158 disproportionate to the roughly threefold increase in total brain size, or the 4.5-fold increase in frontal cortex in total. This important study shows that Broca’s area in particular has expanded disproportionately since our divergence from chimpanzees (roughly 6 million years ago).

Regarding connectivity, a recent study exploited the technique of diffusion tensor imaging (DTI), which uses MRI to estimate, in intact brains, axonal connectivity between brain regions.160 Applying DTI to human, macaque, and chimpanzee brains, Rilling et al.161 found significant changes in the pattern of connectivity between Broca’s area and posterior brain regions. Humans, and to some degree chimpanzees, showed connectivity between prefrontal and temporal regions via a dorsal pathway through the parietal cortex, but only in humans did this pathway (the arcuate fasciculus) have a strong uninterrupted connection to the posterior temporal cortex. In contrast, rhesus macaques showed a very weak dorsal pathway, whereas ventral connections between the temporal cortex and frontal regions via the insula dominated. These results again converge on the conclusion that it is not simply the size of prefrontal regions that has increased in humans; the pattern of connectivity has changed as well.162

Conclusion: Broca meets Lashley

Combining the behaviorally anomalous status of human music and language relative to animal

Ann. N.Y. Acad. Sci. 1316 (2014) 87–104 C 2014 The Authors. Annals of the New York Academy of Sciences 99
published by Wiley Periodicals Inc. on behalf of The New York Academy of Sciences.
Music, language, and action hierarchical processing Fitch & Martins

communication systems with the comparative anatomical data and imaging data reviewed above, we begin to see the outlines of a neurally and biologically grounded hypothesis concerning the undeniable differences between humans and other animals that does justice to the very deep cognitive and neural foundations that we share with other vertebrates (including other primates). For each of Lashley’s three questions, laid out in the introduction, current data support a tentative positive answer: processing hierarchical sequences appears to be a well-defined and neurally localizable function, shared by musical and linguistic syntax (Q1); humans are unusually well developed in this ability (Q3); and it is at least plausible to suggest that this ability may inherit key components from some type of action syntax that predated the evolution of human music and language (Q2).

From a modern perspective, Lashley’s action syntax can be localized to a set of widespread brain circuits that have, as a key hub, prefrontal regions centered on Broca’s region. The prefrontal regions play a primitive role in the hierarchical planning and sequencing of action, a function presumably shared with other primates, and particularly with chimpanzees, whose consistent and intelligent use of tools has no peer among other nonhuman primates.163–165 However, the apparent remit of this type of hierarchical planning increased greatly during human evolution, to include both perception and production of all types of hierarchical sequences. The most prominent additions to this Broca-centered action sequencing capacity were, by hypothesis, those two great human achievements: music and language.

From an evolutionary viewpoint, this idea provides for a certain continuity between humans and other primates (particularly chimpanzees) while acknowledging the drastic increases in certain cognitive capacities that have occurred since our divergence from chimpanzees. It also makes sense of the fact that although prefrontal cortex and Broca’s area are not novel brain regions, they have expanded disproportionately and have changed their patterns of connectivity to other brain regions during recent human evolution. In proposing this, we are fully aware that Broca’s region is by no means the sole seat of language or music. Both capacities rely on a far-flung network of both cortical and subcortical brain regions (all of them, again, shared with other primates). But Lashley’s ideas bring into focus the precise cognitive and computational changes that might underlie our expanded capacities for planning, action, language, music, and thought itself, in the context of a wide array of shared capacities. This is, by hypothesis, our broadly developed ability and indeed proclivity to structure both action and perception hierarchically. The clear distinctions between currently popular mirror neuron-based hypotheses and the more abstract computational hypothesis of Lashley have the potential to drive more refined experimental procedures and to provide further empirical evidence relevant to understanding Broca’s region and its function. We conclude that, although Lashley has rarely been cited in the large number of studies implicating Broca’s region in action, music, and language, his ideas have withstood the passage of time well, and deserve renewed attention.

Acknowledgments

We thank Daniel Bowling, Bruno Gingras, Stefan Koelsch, Marisa Hoeschele, and two anonymous reviewers for comments on a previous version of this manuscript. We also acknowledge support from ERC Advanced Grant #230604 “SOMACCA” and University of Vienna Research Cluster Grant “Shared Neural Resources” (to W.T.F.), and FCT Grant SFRH/BD/64206/2009 (M.M.).

Conflicts of interest

The authors declare no conflicts of interest.

References

1. Lashley, K. 1951. The problem of serial order in behavior. In Cerebral Mechanisms in Behavior; the Hixon Symposium. L.A. Jeffress, Ed.: 112–146. New York: Wiley.
2. Rosenbaum, D.A., et al. 2007. The problem of serial order in behavior: Lashley’s legacy. Hum. Mov. Sci. 26: 525–554.
3. Chomsky, N. 1959. A note on phrase structure grammars. Inform. Control. 2: 393–395.
4. Chomsky, N. 1968. Language and Mind. New York: Harcourt, Brace & World.
5. Upshall, M. 1993. Hutchinson Dictionary of Mathematics. London: Brockhampton Press.
6. Illingworth, V. 1983. Dictionary of Computing. Oxford: Oxford University Press.
7. Altmann, C.F., H.H. Bülthoff & Z. Kourtzi. 2003. Perceptual organization of local elements into global shapes in the human visual cortex. Curr. Biol. 13: 342–349.
8. Aminoff, E., N. Gronau & M. Bar. 2006. The parahippocampal cortex mediates spatial and nonspatial associations. Cereb. Cortex 17: 1493–1503.


9. Badre, D. 2008. Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes. Trends Cogn. Sci. 12: 193–200.
10. Badre, D. & M. D’Esposito. 2009. Is the rostro-caudal axis of the frontal lobe hierarchical? Nat. Rev. Neurosci. 10: 659–669.
11. Badre, D., et al. 2009. Hierarchical cognitive control deficits following damage to the human frontal lobe. Nat. Neurosci. 12: 515–522.
12. Kikuchi, Y., B. Horwitz & M. Mishkin. 2010. Hierarchical auditory processing directed rostrally along the monkey’s supratemporal plane. J. Neurosci. 30: 13021–13030.
13. Kourtzi, Z., et al. 2003. Integration of local features into global shapes: monkey and human fMRI studies. Neuron 37: 333–346.
14. Kravitz, D.J., et al. 2013. The ventral visual pathway: an expanded neural framework for the processing of object quality. Trends Cogn. Sci. 17: 26–49.
15. Krumbholz, K., et al. 2005. Hierarchical processing of sound location and motion in the human brainstem and planum temporale. Eur. J. Neurosci. 21: 230–238.
16. Mormann, F., et al. 2008. Latency and selectivity of single neurons indicate hierarchical processing in the human medial temporal lobe. J. Neurosci. 28: 8865–8872.
17. Kravitz, D.J., et al. 2011. A new neural framework for visuospatial processing. Nat. Rev. Neurosci. 12: 217–230.
18. Kumaran, D., H.L. Melo & E. Duzel. 2012. The emergence and representation of knowledge about social and nonsocial hierarchies. Neuron 76: 653–666.
19. Schendan, H.E., et al. 2003. An fMRI study of the role of the medial temporal lobe in implicit and explicit sequence learning. Neuron 37: 1013–1025.
20. Meyer, P., et al. 2005. Language processing within the human medial temporal lobe. Hippocampus 15: 451–459.
21. Opitz, B. & A.D. Friederici. 2003. Interactions of the hippocampal system and the prefrontal cortex in learning language-like rules. Neuroimage 19: 1730–1737.
22. Opitz, B. & A.D. Friederici. 2007. Neural basis of processing sequential and hierarchical syntactic structures. Hum. Brain Mapping. 28: 585–592.
23. Eichenbaum, H., et al. 2012. Towards a functional organization of episodic memory in the medial temporal lobe. Neurosci. Biobehav. Rev. 36: 1597–1608.
24. Ranganath, C. 2010. A unified framework for the functional organization of the medial temporal lobes and the phenomenology of episodic memory. Hippocampus 20: 1263–1290.
25. Hassabis, D., et al. 2007. Patients with hippocampal amnesia cannot imagine new experiences. Proc. Natl. Acad. Sci. U S A. 104: 1726–1731.
26. Baddeley, A. 2000. The episodic buffer: a new component of working memory? Trends Cogn. Sci. 4: 417–423.
27. Henke, K. 2010. A model for memory systems based on processing modes rather than consciousness. Nat. Rev. Neurosci. 11: 523–532.
28. Amunts, K., et al. 2010. Broca’s region: novel organizational principles and multiple receptor mapping. PLoS Biol. 8: 1000489.
29. Friederici, A.D. 2002. Towards a neural basis of auditory sentence processing. Trends Cogn. Sci. 6: 78–84.
30. Friederici, A.D. 2011. The brain basis of language processing: from structure to function. Physiol. Rev. 91: 1357–1392.
31. Petersson, K.M., C. Forkstam & M. Ingvar. 2004. Artificial syntactic violations activate Broca’s region. Cogn. Sci. 28: 383–407.
32. Forkstam, C., et al. 2006. Neural correlates of artificial syntactic structure classification. Neuroimage 32: 956–967.
33. Koelsch, S., et al. 2013. Processing of hierarchical syntactic structure in music. Proc. Natl. Acad. Sci. U S A. 110: 15443–15448.
34. Brown, S., M.J. Martinez & L.M. Parsons. 2006. Music and language side by side in the brain: a PET study of the generation of melodies and sentences. Eur. J. Neurosci. 23: 2791–2803.
35. Fadiga, L., L. Craighero & A. D’Ausilio. 2009. Broca’s area in language, action, and music. Ann. New York Acad. Sci. 1169: 448–458.
36. Koelsch, S., B. Maess & A.D. Friederici. 2000. Musical syntax is processed in the area of Broca: an MEG study. Neuroimage 11: 56.
37. Maess, B., et al. 2001. Musical syntax is processed in Broca’s area: an MEG study. Nat. Neurosci. 4: 540–545.
38. Patel, A.D., et al. 2008. Musical syntactic processing in agrammatic Broca’s aphasia. Aphasiology 22: 776–789.
39. Sammler, D., S. Koelsch & A.D. Friederici. 2011. Are left fronto-temporal brain areas a prerequisite for normal music-syntactic processing? Cortex 47: 659–673.
40. Bahlmann, J., et al. 2009. Neural circuits of hierarchical visuo-spatial sequence processing. Brain Res. 1298: 161–170.
41. Luria, A.R. 1966. Higher Cortical Functions in Man. New York: Basic Books.
42. Shallice, T. 1982. Specific impairments of planning. Philos. Trans. R. Soc. B. 298: 199–209.
43. Baddeley, A.D. 1986. Working Memory. Oxford: Clarendon Press.
44. Passingham, R.E. 1993. The Frontal Lobes and Voluntary Action. Oxford, UK: Oxford University Press.
45. Conway, C.M. & M.H. Christiansen. 2001. Sequential learning in non-human primates. Trends Cogn. Sci. 5: 539–546.
46. Passingham, R.E. 1981. Broca’s area and the origins of human vocal skill. Philos. Trans. R. Soc. B. 292: 167–175.
47. Bookheimer, S. 2002. Functional MRI of language: new approaches to understanding the cortical organization of semantic processing. Annu. Rev. Neurosci. 25: 151–188.
48. Friederici, A.D. 2009. Pathways to language: fiber tracts in the human brain. Trends Cogn. Sci. 13: 175–181.
49. Zurif, E.R., A. Caramazza & R. Myerson. 1972. Grammatical judgments of agrammatic aphasics. Neuropsychologia 10(4): 405–417.
50. Hagoort, P. 2005. On Broca, brain, and binding: a new framework. Trends Cogn. Sci. 9: 416–423.
51. Hickok, G. & D. Poeppel. 2007. The cortical organization of speech processing. Nat. Rev. Neurosci. 8: 393–402.


52. Pulvermüller, F. 2010. Brain embodiment of syntax and grammar: discrete combinatorial mechanisms spelt out in neuronal circuits. Brain Lang. 112: 167–179.
53. Pallier, C., A.-D. Devauchelle & S. Dehaene. 2011. Cortical representation of the constituent structure of sentences. Proc. Natl. Acad. Sci. U S A. 108: 2522–2527.
54. Fazio, P., et al. 2009. Encoding of human action in Broca’s area. Brain 132: 1980–1988.
55. Thompson-Schill, S.L. 2005. Dissecting the language organ: a new look at the role of Broca’s area in language processing. In Twenty-First Century Psycholinguistics: Four Cornerstones. A. Cutler, Ed.: 173–190. London: Lawrence Erlbaum.
56. Thompson-Schill, S.L., et al. 1997. Role of left inferior prefrontal cortex in retrieval of semantic knowledge: a reevaluation. Proc. Natl. Acad. Sci. U S A. 94: 14792–14797.
57. Patel, A.D. 2013. Sharing and nonsharing of brain resources for language and music. In Language, Music, and the Brain: A Mysterious Relationship. M.A. Arbib, Ed.: 329–355. Cambridge, Massachusetts: MIT Press.
58. Koelsch, S. 2013. Neural correlates of music perception. In Language, Music, and the Brain: A Mysterious Relationship, Vol. 10. M.A. Arbib, Ed.: 141–172. Cambridge, Massachusetts: MIT Press.
59. Janata, P. & L.M. Parsons. 2013. Neural mechanisms of music, singing, and dancing. In Language, Music, and the Brain: A Mysterious Relationship, Vol. 10. M.A. Arbib, Ed.: 307–328. Cambridge, Massachusetts: MIT Press.
60. Herholz, S.C., A.R. Halpern & R.J. Zatorre. 2012. Neuronal correlates of perception, imagery, and memory for familiar tunes. J. Cogn. Neurosci. 24: 1382–1397.
61. Peretz, I. & M. Coltheart. 2003. Modularity of music processing. Nat. Neurosci. 6: 688–691.
62. Patel, A.D. 1998. Syntactic processing in language and music: different cognitive operations, similar neural resources? Music Percept. 16: 27–42.
63. Koelsch, S., et al. 2005. Interaction between syntax processing in language and in music: an ERP study. J. Cogn. Neurosci. 17: 1565–1577.
64. Slevc, L.R., J.C. Rosenberg & A.D. Patel. 2009. Making psycholinguistics musical: self-paced reading time evidence for shared processing of linguistic and musical syntax. Psychon. Bull. Rev. 16: 374–381.
65. Steinbeis, N. & S. Koelsch. 2008. Shared neural resources between music and language indicate semantic processing of musical tension-resolution patterns. Cereb. Cortex 18: 1169–1178.
66. Fitch, W.T. 2006. The biology and evolution of music: a comparative perspective. Cognition 100: 173–215.
67. Fitch, W.T. 2013. Rhythmic cognition in humans and animals: distinguishing meter and pulse perception. Front. Syst. Neurosci. 7: 1–16.
68. Abrams, D.A., et al. 2011. Decoding temporal structure in music and speech relies on shared brain resources but elicits different fine-scale spatial patterns. Cereb. Cortex 21: 1507–1518.
69. Rogalsky, C., et al. 2011. Functional anatomy of language and music perception: temporal and structural factors investigated using functional magnetic resonance imaging. J. Neurosci. 31: 3843–3852.
70. Fedorenko, E., M.K. Behr & N. Kanwisher. 2011. Functional specificity for high-level linguistic processing in the human brain. Proc. Natl. Acad. Sci. U S A. 108: 16428–16433.
71. Chen, J.L., R.J. Zatorre & V.B. Penhune. 2008. Moving on time: brain network for auditory-motor synchronization is modulated by rhythm complexity and musical training. J. Cogn. Neurosci. 20: 226–239.
72. Grahn, J.A. 2012. Neural mechanisms of rhythm perception: current findings and future perspectives. Top. Cogn. Sci. 4: 585–606.
73. Grahn, J.A. & M. Brett. 2007. Rhythm and beat perception in motor areas of the brain. J. Cogn. Neurosci. 19: 893–906.
74. Wymbs, N.F. & S.T. Grafton. 2013. Contributions from the left PMd and the SMA during sequence retrieval as determined by depth of training. Exp. Brain Res. 224: 49–58.
75. Janata, P. & S.T. Grafton. 2003. Swinging in the brain: shared neural substrates for behaviors related to sequencing and music. Nat. Neurosci. 6: 682–687.
76. Madison, G. 2006. Experiencing groove induced by music: consistency and phenomenology. Music Percept. 24: 201–208.
77. Janata, P., S.T. Tomic & J.M. Haberman. 2012. Sensorimotor coupling in music and the psychology of the groove. J. Exp. Psychol.: General. 141: 54–75.
78. Liberman, A.M. & I.G. Mattingly. 1985. The motor theory of speech perception revised. Cognition 21: 1–36.
79. Diehl, R.L., A.J. Lotto & L.L. Holt. 2004. Speech perception. Ann. Rev. Psychol. 55: 149–179.
80. Martin, A., et al. 1996. Neural correlates of category-specific knowledge. Nature 379: 649–652.
81. Tranel, D., et al. 2003. Neural correlate of conceptual knowledge for actions. Cogn. Neuropsychol. 20: 409–432.
82. Arbib, M.A. 2002. The mirror system, imitation, and the evolution of language. In Imitation in Animals and Artifacts. C. Nehaniv & K. Dautenhahn, Eds.: 229–280. Cambridge, MA: MIT Press.
83. Arbib, M.A. 2005. From monkey-like action recognition to human language: an evolutionary framework for neurolinguistics. Behav. Brain Sci. 28: 105–167.
84. Rizzolatti, G. & M.A. Arbib. 1998. Language within our grasp. Trends Neurosci. 21: 188–194.
85. Condillac, É.B.D. 1971 (1747). Essai sur l’origine des Connaissances Humaines. Gainesville, FL: Scholar’s Facsimiles and Reprints.
86. Corballis, M.C. 2002. From Hand to Mouth: the Origins of Language. Princeton: Princeton University Press.
87. Hewes, G.W. 1973. Primate communication and the gestural origin of language. Curr. Anthropol. 14: 5–24.
88. Emmorey, K. 2005. Sign languages are problematic for a gestural origins theory of language evolution. Behav. Brain Sci. 28: 130–131.
89. Kendon, A. 1991. Some considerations for a theory of language origins. Man 26: 199–221.
90. MacNeilage, P.F. & B.L. Davis. 2005. The frame/content theory of evolution of speech: a comparison with a gestural-origins alternative. Interact. Stud. 6: 173–199.


91. Toni, I., et al. 2008. Language beyond action. J. Physiol. 102: 71–79.
92. Fitch, W.T. 2010. The Evolution of Language. Cambridge: Cambridge University Press.
93. Mukamel, R., et al. 2010. Single-neuron responses in humans during execution and observation of actions. Curr. Biol. 20: 750–756.
94. Rogalsky, C., et al. 2011. Are mirror neurons the basis of speech perception? Evidence from five cases with damage to the purported human mirror system. Neurocase 17: 178–187.
95. Gervain, J., et al. 2008. The neonate brain detects speech structure. Proc. Natl. Acad. Sci. U S A. 105: 14222–14227.
96. Koelsch, S., et al. 2002. Bach speaks: a cortical “language-network” serves the processing of music. NeuroImage 17: 956–966.
97. Koechlin, E. & T. Jubault. 2006. Broca’s area and the hierarchical organization of human behavior. Neuron 50: 963–974.
98. Dehaene, S. & J.-P. Changeux. 1997. A hierarchical neuronal network for planning behavior. Proc. Natl. Acad. Sci. U S A. 94: 13293–13298.
99. Berg, W.K., et al. 2010. Deconstructing the tower: parameters and predictors of problem difficulty on the Tower of London task. Brain Cogn. 72: 472–482.
100. Newman, S.D., J.A. Greco & D. Lee. 2009. An fMRI study of the Tower of London: a look at problem structure differences. Brain Res. 1286: 123–132.
101. Jäger, G. & J. Rogers. 2012. Formal language theory: refining the Chomsky Hierarchy. Philos. Trans. R. Soc. B. 367: 1956–1970.
102. Gersting, J.L. 1999. Mathematical Structures for Computer Science. New York: W H Freeman.
103. Hopcroft, J.E., R. Motwani & J.D. Ullman. 2000. Introduction to Automata Theory, Languages and Computation. Reading, Massachusetts: Addison-Wesley.
104. Linz, P. 2001. An Introduction to Formal Languages and Automata. Sudbury, Massachusetts: Jones & Bartlett.
105. Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.
106. Levelt, W.J.M. 2008. Formal Grammars in Linguistics and Psycholinguistics. Amsterdam: John Benjamins.
107. Stabler, E.P. 2004. Varieties of crossing dependencies: structure dependence and mild context sensitivity. Cogn. Sci. 28: 699–720.
108. Stabler, E.P. 2013. The epicenter of linguistic behavior. In Language Down the Garden Path: The Cognitive and Biological Basis of Linguistic Structures. M. Sanz, I. Laka & M.K. Tanenhaus, Eds.: 316–323. New York: Oxford University Press.
109. Rohrmeier, M. 2011. Towards a generative syntax of tonal harmony. J. Math. Music. 5: 35–53.
110. Uddén, J. & J. Bahlmann. 2012. A rostro-caudal gradient of structured sequence processing in the left inferior frontal gyrus. Philos. Trans. R. Soc. Lond. B Biol. Sci. 367: 2023–2032.
111. Braver, T.S., et al. 1997. A parametric study of prefrontal cortex involvement in human working memory. NeuroImage 5: 49–52.
112. Petersson, K.M., V. Folia & P. Hagoort. 2010. What artificial grammar learning reveals about the neurobiology of syntax. Brain Lang. 120: 83–95.
113. Koelsch, S. 2012. Brain and Music. London, UK: John Wiley & Sons.
114. Miller, G.A. & N. Chomsky. 1963. Finitary models of language users. In Handbook of Mathematical Psychology. Vol. 35. R.D. Luce, R.R. Bush & E. Galanter, Eds.: 419–492. New York: John Wiley & Sons.
115. Bach, E., C. Brown & W. Marslen-Wilson. 1986. Crossed and nested dependencies in German and Dutch: a psycholinguistic study. Lang. Cogn. Process. 1: 249–262.
116. ten Cate, C. & K. Okanoya. 2012. Revisiting the syntactic abilities of non-human animals: natural vocalizations and artificial grammar learning. Philos. Trans. R. Soc. B. 367: 1984–1994.
117. Stobbe, N., et al. 2012. Visual artificial grammar learning: comparative research on humans, kea (Nestor notabilis) and pigeons (Columba livia). Philos. Trans. R. Soc. B. 367: 1995–2006.
118. Miller, G.A. 1967. Project Grammarama. In Psychology of Communication. G.A. Miller, Ed. New York: Basic Books.
119. Fitch, W.T., A.D. Friederici & P. Hagoort. 2012. Pattern perception and computational complexity. Philos. Trans. R. Soc. B. 367: 1925–1932.
120. Fitch, W.T. Toward a computational framework for cognitive biology: unifying approaches from cognitive neuroscience and comparative cognition. Phys. Life Rev. In press.
121. Fitch, W.T. & A.D. Friederici. 2012. Artificial grammar learning meets formal language theory: an overview. Philos. Trans. R. Soc. B. 367: 1933–1955.
122. Morgan, J.L., R.P. Meier & E.L. Newport. 1987. Structural packaging in the input to language learning: contributions of prosodic and morphological marking of phrases to the acquisition of language. Cogn. Psychol. 19: 498–550.
123. Fodor, J.A., T.G. Bever & M.F. Garrett. 1974. The Psychology of Language: An Introduction to Psycholinguistics and Generative Grammar. New York: McGraw-Hill.
124. Ravignani, A., et al. 2013. Action at a distance: dependency sensitivity in a New World primate. Biol. Lett. 9: 20130852.
125. van Heijningen, C.A.A., et al. 2009. Simple rules can explain discrimination of putative recursive syntactic structures by a songbird species. Proc. Natl. Acad. Sci. U S A. 106: 20538–20543.
126. Beckers, G.J.L., et al. 2012. Birdsong neurolinguistics: songbird context-free grammar claim is premature. NeuroReport 23: 139–145.
127. Petkov, C.I. & E.D. Jarvis. 2012. Birds, primates, and spoken language origins: behavioral phenotypes and neurobiological substrates. Front. Evol. Neurosci. 4: e12.
128. Gentner, T.Q., et al. 2006. Recursive syntactic pattern learning by songbirds. Nature 440: 1204–1207.
129. Hausberger, M., et al. 1995. Song sharing reflects the social organization in a captive group of European starlings (Sturnus vulgaris). J. Comp. Psychol. 109: 222–241.
130. Saffran, J., et al. 2008. Grammatical pattern learning by human infants and cotton-top tamarin monkeys. Cognition 107: 479–500.


131. Saffran, J.R., R.N. Aslin & E.L. Newport. 1996. Statistical learning by 8-month-old infants. Science 274: 1926–1928.
132. de Vries, M.H., et al. 2008. Syntactic structure and artificial grammar learning: the learnability of embedded hierarchical structures. Cognition 107: 763–774.
133. Fitch, W.T. & M.D. Hauser. 2004. Computational constraints on syntactic processing in a nonhuman primate. Science 303: 377–380.
134. Uddén, J., et al. 2012. Implicit acquisition of grammars with crossed and nested non-adjacent dependencies: investigating the push-down stack model. Cogn. Sci. 2012: 1–24.
135. Byrne, R.W. 2007. Clues to the origin of the human mind from primate observational field data. Jpn. J. Anim. Psychol. 57: 1–14.
136. Hauser, M., N. Chomsky & W.T. Fitch. 2002. The language faculty: what is it, who has it, and how did it evolve? Science 298: 1569–1579.
137. Johnson-Pynn, J., et al. 1999. Strategies used to combine seriated cups by chimpanzees (Pan troglodytes), bonobos (Pan paniscus), and capuchins (Cebus apella). J. Comp. Psychol. 113: 137–148.
138. Sanz, C., J. Call & D.B. Morgan. 2009. Design complexity in termite-fishing tools of chimpanzees. Biol. Lett. 5: 293–296.
139. Mandelbrot, B.B. 1977. The Fractal Geometry of Nature. New York: Freeman.
140. O’Donnell, T.J., M.D. Hauser & W.T. Fitch. 2005. Using mathematical models of language experimentally. Trends Cogn. Sci. 9: 284–289.
141. Fitch, W.T. 2010. Three meanings of “recursion”: key distinctions for biolinguistics. In The Evolution of Human Language: Biolinguistic Perspectives. R. Larson, V. Déprez & H. Yamakido, Eds.: 73–90. Cambridge, UK: Cambridge University Press.
142. Fitch, W.T., M.D. Hauser & N. Chomsky. 2005. The evolution of the language faculty: clarifications and implications. Cognition 97: 179–210.
143. Martins, M.D. 2012. Specific signatures of recursion. Philos. Trans. R. Soc. B. 367: 2055–2064.
144. Van der Hulst, H. 2010. Recursion and Human Language. Berlin: De Gruyter/Mouton.
145. Huxley, T.H. 1863. Man’s Place in Nature. Mineola, NY: Dover.
146. Jerison, H.J. 1961. Quantitative analysis of evolution of the brain in mammals. Science 133: 1012–1014.
147. Stephan, H., H. Frahm & G. Baron. 1981. New and revised data on volumes of brain structures in insectivores and primates. Folia Primatol. 35: 1–29.
148. Deacon, T.W. 1990. Rethinking mammalian brain evolution. Am. Zool. 30: 629–705.
149. Jerison, H.J. 1973. Evolution of the Brain and Intelligence. New York: Academic Press.
150. Holloway, R.L. 1996. Evolution of the human brain. In Handbook of Human Symbolic Evolution. A. Lock & C.R. Peters, Eds.: 74–108. Oxford: Clarendon Press.
151. Semendeferi, K., et al. 2002. Humans and great apes share a large frontal cortex. Nat. Neurosci. 5: 272–276.
152. Deaner, R.O., C.P. van Schaik & V. Johnson. 2006. Do some taxa have better domain-general cognition than others? A meta-analysis of nonhuman primate studies. Evol. Psychol. 4: 149–196.
153. Schoenemann, P.T., M.J. Sheehan & L.D. Glotzer. 2005. Prefrontal white matter volume is disproportionately larger in humans than in other primates. Nat. Neurosci. 8: 242–253.
154. Barton, R.A. & C. Venditti. 2013. Human frontal lobes are not relatively large. Proc. Natl. Acad. Sci. U S A. 110: 9000–9006.
155. Sakai, T., et al. 2011. Differential prefrontal white matter development in chimpanzees and humans. Curr. Biol. 21: 1397–1402.
156. Sherwood, C.C., R.L. Holloway, K. Semendeferi & P.R. Hof. 2005. Is prefrontal white matter enlargement a human evolutionary specialization? Nat. Neurosci. 8: 537–538.
157. Smaers, J.B., et al. 2010. Frontal white matter volume is associated with brain enlargement and higher structural connectivity in anthropoid primates. PLoS One 5: e9123.
158. Schenker, N.M., et al. 2010. Broca’s area homologue in chimpanzees (Pan troglodytes): probabilistic mapping, asymmetry and comparison to humans. Cereb. Cortex 20: 730–742.
159. Amunts, K., et al. 1999. Broca’s region revisited: cytoarchitecture and intersubject variability. J. Comp. Neurol. 412: 319–341.
160. Koch, M.A., D.G. Norris & M. Hund-Georgiadis. 2002. An investigation of functional and anatomical connectivity using magnetic resonance imaging. Neuroimage 16: 241–250.
161. Rilling, J.K., et al. 2008. The evolution of the arcuate fasciculus revealed with comparative DTI. Nat. Neurosci. 11: 426–428.
162. Catani, M. & M. Mesulam. 2008. The arcuate fasciculus and the disconnection theme in language and aphasia: history and current state. Cortex 44: 953–961.
163. Goodall, J. 1986. The Chimpanzees of Gombe: Patterns of Behavior. Cambridge, Massachusetts: Harvard University Press.
164. McGrew, W.C. & L.F. Marchant. 2001. Ethological studies of manual laterality in the chimpanzees of the Mahale Mountains, Tanzania. Behaviour 138: 329–358.
165. Sugiyama, Y. & J. Koman. 1979. Tool-using and tool-making behavior in wild chimpanzees at Bossou, Guinea. Primates 20: 513–524.

