
On the Proto-Indo-European Language of the Indus Valley Civilization
(and Its Implications for Western Prehistory)*

by Robin Bradley Kar

(Draft dated August 3, 2012)

In this Article, I will be arguing that, despite certain well-known and
long-standing controversies over the issue, we are already in a good enough
position to conclude—with a very high degree of confidence—that the Indus
Valley Civilization (a.k.a. the “Harappan” or “Sindhu-Sarasvati”
Civilization)1 spoke dialects of Proto-Indo-European. My arguments for this
conclusion will be new, and will draw upon a body of evidence that has so far
been overlooked in these discussions. The evidence itself has, however, been
available for some time now. It has simply passed beneath our notice because
we have not been looking at it squarely enough in the eye, and because it
needs to be reassembled in new ways to make its evidentiary import more
apparent.
In arguing for this claim, I will be taking a position that is currently a
minority one in most scholarly discussions in the West. I do not, however,
take this position lightly or without careful thought. As a professor of law and
moral philosophy, my entryway into these topics is also somewhat
unorthodox. Before presenting a roadmap of the main discussion, I would
therefore like to explain my growing interest in these topics.
One of my primary lines of research has been on the evolutionary origins
of moral and legal systems and the processes by which they tend to diverge

                                                                                                               
*
Sections 1 and 2 of this piece are revised versions of work that has been published in the
Illinois Law Review. I would like to thank the Illinois Law Review for their permission to
reprint these parts of my work in this format.
1
I will use the term “Indus Valley” Civilization in this piece because this Civilization
arose in the region that is currently known in the scholarly literature as the Indus Valley. I
would also like to use a terminology that does not prejudge the important question whether the
Vedic “Sarasvati” was an integral part of this Civilization. For reasons discussed below, I
nevertheless believe that the Vedic Sarasvati was ultimately an integral part of the Indus Valley
Civilization, and so I am sympathetic to the use of “Sindhu-Sarasvati Civilization” that has
been adopted in other parts of this book.

once human beings begin to move from relatively small and simple (typically
hunter-gatherer) forms of subsistence into much more complex, populous and
highly differentiated social systems2—much like the one we find beginning in
about 3500 BC in the Indus Valley.3 If one considers the larger record of our
natural history as a species, then developments like these turn out to be
extraordinarily rare and to arise only very late in the story (only within the last
10,000 or so years, during the Holocene and after the rise of agriculture); and
yet developments like these ultimately produced both the Indus Valley
Civilization and what we typically think of as Western Civilization. The early
developments in the Indus Valley are typically thought to be irrelevant to the
origins of social complexity in the West, however, because Western legal and
political traditions are typically said to have a distinctive history, which began
only much later in ancient Greece, Rome and Israel.4 When trying to decipher
the keys to Western success, scholars have therefore tended to look almost
exclusively for certain features that differentiate Western traditions from their
counterparts in the East.5
Like many people, I also have a special interest in understanding
Western institutions—not only because I want to know how to help them
prosper in their own terms but also because we live in a world that has been
witnessing the increased exportation of these institutions to many other parts
                                                                                                               
2
See, e.g., Robin Bradley Kar, The Deep Structure of Law and Morality, 84 Texas Law
Review 877 (2006); Robin Bradley Kar, How Evolutionary Theory Can Both Vindicate and
Debunk Morality (With a Special Nod to the Growing Importance of Law), EVOLUTION AND
MORALITY: NOMOS LII (forthcoming NYU Press 2012), manuscript available at
http://ssrn.com/abstract=1834965; Robin Bradley Kar, On the Early Eastern Origins of
Western Law and Western Civilization: New Arguments for a Changed Understanding of Our
Legal and Cultural Prehistory, University of Illinois Law Review vol. 2012 (forthcoming
2012), available at http://ssrn.com/abstract=2039500 (Part 1), http://ssrn.com/abstract=2039502
(Part 2), http://ssrn.com/abstract=2039504 (Part 3); Robin Bradley Kar, Outcasting,
Globalization, and the Emergence of International Law, 121 Yale Law Journal On-Line 411
(2012).
3
I use this 3500 BC date because it marks the beginning of the incipient urban phase of
this Civilization’s development. See BRIDGET ALLCHIN & RAYMOND ALLCHIN, ORIGINS OF A
CIVILIZATION: THE PREHISTORY AND EARLY ARCHAEOLOGY OF SOUTH ASIA 113–83 (1997).
4
See, e.g., ELIAS BICKERMAN & MORTON SMITH, THE ANCIENT HISTORY OF
WESTERN CIVILIZATION 12 (1976) (“Thus the history of Greece, Israel, and Rome is our
own past. . . . To put it in a nutshell, the Greco-Roman civilization shaped the agricultural
civilization of the following centuries, of which our industrial civilization is the direct
continuation.”); see also id. (claiming that the modern world “derived most of its present
culture from the Arabic and European heirs of Greece, Rome, and Israel”); JACK
GOLDSTONE, WHY EUROPE?: THE RISE OF THE WEST IN WORLD HISTORY, 1500-
1850, at vii (2009) (“For most of the nineteenth and twentieth centuries, students learned about
. . . the story of the ‘rise of the West.’ This story started with the emergence of democracy and
philosophy in ancient Greece and Rome; continued with the rule of Europe’s kings and knights
in the Middle Ages; moved on to the arts and explorations of the Renaissance; and concluded
with the military, economic, and political domination of the world by the nations of Western
Europe and North America. The peoples of . . . Asia were mentioned only when they
encountered European explorers or colonizers—their ‘history’ thus beginning with European
contact and conquest.”); see also HAROLD J. BERMAN, LAW AND REVOLUTION: THE
FORMATION OF THE WESTERN LEGAL TRADITION 3 (1983).
5
I have developed all of these ideas at much greater length in Robin Bradley Kar, On the
Early Eastern Origins of Western Law and Western Civilization: New Arguments for a
Changed Understanding of Our Earliest Legal and Cultural Origins, University of Illinois Law
Review vol. 2012 (forthcoming 2012), available at http://ssrn.com/abstract=2039500 (Part 1),
http://ssrn.com/abstract=2039502 (Part 2), http://ssrn.com/abstract=2039504 (Part 3).

of the world. Often, these exportations have been justified, at least in part, by
the purportedly distinctive capacities of Western institutions to promote
political and economic development along with certain values like liberty,
social stability, human rights, and the rule of law. Yet many of our attempts
to spread this particular path to success have been failing, sometimes quite
spectacularly, and it has become increasingly apparent that we may not fully
understand what makes Western institutions work in the first place. In the
course of my wider explorations of the early origins of moral and legal
systems, I have begun to detect some anomalies in our traditional
understandings of human prehistory, moreover, and have begun to suspect
that some of our failures to produce sustainable forms of political and
economic development may be related to a more fundamental
misunderstanding that is embedded in our traditional understandings. I have
begun to suspect—in particular—that we have been seeking explanations of
Western success in an overly narrow range of phenomena because we have
failed to recognize that Western Civilization is just one branch of a much
broader family of shared, Indo-European traditions, which are highly relevant
to the emergence and stability of social complexity,6 but which go much
further back in time than has commonly been recognized. For reasons that I
will explain, I believe that the Indus Valley Civilization (along with some of
its direct precursors in the region) has, moreover, very likely played a
critical but almost completely underappreciated role in helping to shape some
of these early Indo-European traditions.
If we take a broader look at world history, we will see that the larger
family of Indo-European traditions has also produced a quite lengthy and
spectacular record of large-scale civilizations, many of which began to
emerge in non-Western parts of Eurasia several millennia before the rise of
the West, and many of which have exhibited unprecedented political and
economic success for their time.7 Hence, the West may not be an altogether
unique member of this family, and if we want to obtain a better understanding
of Western success, we may need to refocus some of our research efforts and
begin looking for certain causally efficacious features that Western traditions
share with this broader class of traditions. To do so would, however, require
a fairly radical reorientation of a number of modern research projects in
economics, history, law and political science—especially insofar as they seek
to explain Indo-European patterns of social, economic, legal and political
development. One of my primary reasons for taking an interest in these topics
is that I want to know whether such a reorientation is warranted. So how
might one go about answering this question?

                                                                                                               
6
When I use the term “social complexity,” I will be using the typical hunter-gatherer band
as a standard against which this complexity can be measured. Relative to the typical hunter-
gatherer band, a society will be said to have more “social complexity” to the degree that it
manages increased populations and population densities and exhibits increased levels of social
hierarchy, division of labor, and professional and political specialization. It is—of course—
well known that Western Civilization is an Indo-European civilization, but it has not typically
been thought that Proto-Indo-European traditions have a deep history relevant to the production
and stability of any forms of social complexity beyond those inherent in tribal pastoralist forms
of life.
7
See Peter Turchin, A Theory for Formation of Large Empires, 4 J. GLOBAL HIST. 191,
202–03 tbl.2 (2009).

As it turns out, there is an important relationship between this last
question and the main topic of this article, which is the language (or
languages) of the Indus Valley Civilization. We know that the Indus Valley
Civilization entered into its first period of incipient urbanism around 3500
BC, and then reached its peak from about 2600 BC until about 1900 BC.8
During this period (which long predates the rise of ancient Greece, Rome or
Israel), the Indus Valley Civilization represented one of the very first and only
successful transformations that we humans had ever made from tribal social
structures into large-scale urban civilizations (which incidentally tend to
include legal—or at least incipient legal—traditions). It would have therefore
been producing a very special body of cultural and political traditions relevant
to sustaining social complexity. The Indus Valley Civilization was also one
of the three most powerful political and economic centers in the world by far
(the other two being in ancient Egypt and Mesopotamia). Ultimately, the
Indus Valley Civilization came and went before any of the conventionally
recognized Indo-European civilizations show up in the historical record. The
traditional story assumes that there can be no plausible connection between
these two sets of developments, however, for a simple reason: it assumes that
the Indus Valley Civilization was not itself an Indo-European society and that
it instead spoke a non-Indo-European language (or languages). If this were
true, then we would indeed be right to study the six main traditions of Indo-
European social complexity—viz., the Persian, Greek, Indian, Roman,
Germanic, and Russian traditions—independently and as arising from six
distinctive cultural and historical processes. But if this traditional linguistic
assumption is wrong, then we have been on a misguided path for some time
now and have been failing to understand an important part of the shared
prehistory of these later developments. Hence, we are led directly back to our
main linguistic question—but now with a better understanding of how certain
traditional assumptions about the Indo-European languages and their
prehistory have been shaping a number of modern research programs aimed at
understanding the main causes and conditions of social, economic, legal and
political development in the West.
Over the course of this article, I will be developing several new
arguments, and compiling several familiar bodies of evidence in new ways,
that will challenge these traditional assumptions and suggest that the Indus
Valley Civilization very likely spoke dialects of Proto-Indo-European.9 Here
                                                                                                               
8
ALLCHIN & ALLCHIN, supra note 3, at 113–83 (describing three periods: period of
agricultural expansion into Indus Valley beginning in 4500 BC; period of Early Harappan
Incipient Urbanism (3500 BC–2600 BC); and mature Harappan period (2600 BC–1900 BC)).
See also id. at 183–205 (describing the people and culture of the Harappan period).
9
I should emphasize two important points about this proposal, so as to prevent an
important source of possible misunderstanding. First, this linguistic proposal—at least as I will
be developing it—should be understood as fully consistent with the well-known archaeological
record from this time and region, which shows a steep decline in social complexity (from
around 1900 BC until around 1500 BC) along with heightened forms of regionalization in
material culture. See generally ALLCHIN & ALLCHIN, supra note 3, at 206–22 (“Changing
Scenes: Indus to Ganges”). This same record reveals a subsequent set of movements of so-
called “Vedic” civilization (which was undisputedly Indo-European) eastwards, along the
Gangetic plain, where it ultimately gave birth to a second round of urban civilization beginning
in about the 6th century BC. See id. at 215 (“The major change [in the post-urban period in the
Sarasvati valley] is the great increase in the number of settlements spreading out across the

is how I will proceed. Section 1 will begin by noting that major language
families (like the Indo-European language family) are actually a very late
development within our natural history as a species. To reconstruct Indo-
European prehistory, it will therefore help to understand how major language
families might have emerged in the first place. I will address this question by
introducing a contemporary model of prehistoric linguistic expansion, which I
call the “riverine-agricultural” model and which I have been developing in my
other work. I will then describe an extensive body of empirical evidence in
support of this model. This evidence shows that certain major riverine
topographies have played a critical role in the early expansion of all of the
major language families that currently dominate the world—including the
Indo-European language family.
Section 2 will then trace out some of the implications of this new model
of prehistoric linguistic expansion for our understanding of the language (or
languages) that were most likely spoken by the Indus Valley Civilization.
The model suggests that, during the height of the Indus Valley Civilization,
the languages spoken in this region would have almost certainly represented
one of the most important and monumental linguistic phenomena ever to have
arisen within our natural history as a species. If we assume (plausibly) that
significant pockets of this language family should therefore remain in the
northwestern portions of the Indian subcontinent, then—as I will show in
Section 2—the Indus Valley Civilization must have spoken dialects of Proto-
Indo-European.
Section 3 will then consider the objection that tries to reject this last
conclusion by rejecting its guiding assumption (i.e., that significant pockets of
the Indus Valley Civilization’s language family should still remain in the
northwestern parts of the Indian subcontinent). According to this objection,
small groups of Indo-Aryan invaders or migrants from the steppes could have
simply eradicated the pre-existing language (or languages) of the Indus Valley
Civilization by converting the prior populations to Indo-Aryan languages. As
against this objection, I will present a wealth of empirical evidence from
around the world and over the course of world history, which establishes an
important fact: once a major linguistic phenomenon has reached equilibrium
around a major riverine topography in accordance with the riverine-
agricultural model of linguistic expansion, there is not one recorded case
anywhere in this extensive world historical record where the language family
                                                                                                                                                                                                                                                                       
plains in the eastern part of the area, in a belt between 100–200 km in width, running from
north-west to south-east, following the edge of the Himalayan foothills.”). The only point is
that we will need to understand this entire complex archaeological record as reflecting a history
of cyclical expansions and contractions in the social complexity of a single branch of the Indo-
European language family. Second, there is nothing about these linguistic claims, at least as I
will be elaborating them, that should commit us to the view that the Indus Valley
Civilization gave birth to all of the other branches of the Indo-European family (i.e., in the
sense of being their direct linguistic, cultural and/or genetic ancestor). For reasons discussed
below, I believe it is, in fact, ultimately much more plausible to think that Proto-Indo-European
dialects were spoken throughout a larger region (which I call the “Eastern-Iran-Bactria-Indus-
Valley” region) from an even earlier time. This larger region would have included the one that
gave birth to the Indus Valley Civilization, but many of the western branches of the Indo-
European family would have more plausibly separated from regions nearer ancient Bactria or
modern day Iran. On the view I will be proposing, these other branches are therefore best
understood as related to the Indus Valley Civilization by common descent, not parentage.

in question has been completely replaced in one of these regions by a different
language family through a process of linguistic conversion. (I will define
what I mean by many of these terms, such as “reaching equilibrium in
accordance with the riverine-agricultural model of linguistic expansion,” and
will discuss seeming counterexamples in regions like the New World in
Section 3 as well.) To the contrary, the world historical record presents an
extraordinarily extensive and near constant stream of counterexamples in
which invading and/or migrating groups have been incapable of fully
converting the local inhabitants in these particular riverine regions once a
major language family has emerged and reached equilibrium. We therefore
have strong empirical reasons to reject this objection.
Section 4 will then discuss another common source of resistance to the
claim that the Indus Valley Civilization spoke dialects of Proto-Indo-
European. It is common to perceive this claim as carrying with it certain
further implications about Indo-European prehistory that many have deemed
difficult to square with the broader body of evidence relevant to this larger
topic. In order to address this concern, Section 4 will therefore embed the
current linguistic thesis within a broader narrative concerning Indo-European
prehistory that is—I will argue—actually better able to explain (or at least
render coherent) this broader body of evidence than its main competitors.
Once embedded in the right narrative of Indo-European prehistory, this
article’s linguistic claim will therefore be shown to have a much broader and
more extensive form of evidentiary support.
Section 5, finally, will end with a response to some of Michael Witzel’s
important and influential work, which purports not only to establish an Indo-
Aryan invasion or migration theory but also to trace with some precision the
exact timing and path of the Indo-Iranian groups who (in his view) first
brought Indo-European languages and cultures from the steppes to the Indian
subcontinent. Although some have tried to dismiss Witzel’s evidence, I
believe he has collected an incredibly important body of evidence for the
present topic. I will nevertheless argue that this evidence ultimately
underdetermines the choice between his traditional theory and the new one
developed here. In order to choose between these two theories, we will
therefore need to compare their overall explanatory power in relation to the
entire body of evidence relevant to this topic, and—especially once the new
considerations developed in this article have been taken into account—this
comparison strongly favors the current theory. In fact, I will suggest that
Witzel’s evidence can itself be understood as adding a further layer of support
to the current theory, once his evidence has been disambiguated so as to
harmonize with this larger body of evidence.

1. Explaining the Early Origins of the Major Language Families

I begin with an observation. On the eve of the Holocene, and prior to the
development of agriculture, all of our current evidence suggests that humans
tended to live in relatively small hunter-gatherer bands, which were usually
nomadic and typically consisted of between 30 and 50 people (and

probably rarely exceeded about 300).10 Around the world, human populations
were much smaller than they are today, and we know that conditions like
these tend to produce fairly extreme levels of linguistic diversity.11
If conditions like these had persisted, then there would have been no
question at all about the origins of a major language family like Indo-
European because there would have been no linguistic phenomena anywhere
in the world with this kind of breadth or expansive reach. The Indo-European
language family is currently the largest language family in the world: it has
approximately 2.7 billion native speakers12 (more than one half of whom,
incidentally, live within the northern parts of the Indian subcontinent and
speak languages that fall within the single Indo-Aryan branch of this family).
The contemporary world is, moreover, different from the world of our
ancestors in a more general sense: most of the linguistic diversity that once
characterized our smaller, more mobile and less populous forms of social life
has been replaced by a handful of major language families with very
extensive populations and geographic reach.13 If we define a “major language
family” as any set of languages for which historical linguists can
uncontroversially reconstruct a common ancestor using standard comparative
methods (and so within at most approximately 8,000 years),14 and which are
                                                                                                               
10
See JARED DIAMOND, GUNS, GERMS AND STEEL: THE FATES OF HUMAN SOCIETIES 267
(1997) (“Bands are typically the smallest of societies, consisting typically of 5 to 80 people,
most or all of them close relatives by birth or by marriage”); Richard B. Lee & Richard Daly,
Foragers and Others, in THE CAMBRIDGE ENCYCLOPEDIA OF HUNTERS AND GATHERERS 1
(Richard B. Lee & Richard Daly eds., 1999) (noting that hunter-gatherer bands sometimes
aggregate to numbers as large as 200 or 300); id. at 4 (“Hunter-gatherers are generally peoples
who have lived until recently without the overarching discipline imposed by the state. They
have lived in relatively small groups, without centralized authority, standing armies, or
bureaucratic systems.”) (“Until 12,000 years ago virtually all humanity lived as hunters and
gatherers.”); ROBERT BOYD & JOAN B. SILK, HOW HUMANS EVOLVED 369 (“We know that
people lived in small-scale foraging societies for the vast majority of human history; stratified
societies with agriculture and high population density have existed for only a few thousand
years.”).
11
JOHANNA NICHOLS, LINGUISTIC DIVERSITY IN SPACE AND TIME 275 (1992).
12
ETHNOLOGUE: LANGUAGES OF THE WORLD 19 tbl.5, at 27 (M. Paul Lewis ed., 16th ed.
2009) [hereinafter ETHNOLOGUE] (listing 2,721,969,619 native Indo-European speakers,
whereas the next largest language family (Sino-Tibetan) has only 1,259,227,250 native
speakers).
13
See THE ATLAS OF LANGUAGES 20–21 (Bernard Comrie et al. eds., rev. ed. 2010) (2003)
(showing a map of world languages, with most of the settled areas being dominated by
languages that fall into one of the following major language families: Indo-European, Uralic,
Altaic, Afro-Asiatic, Nilo-Saharan, Niger-Congo, Dravidian, Sino-Tibetan, Austronesian, and
Austro-Asiatic).
14
Johanna Nichols uses the term “stock” to refer to what I am calling a “major language
family.” She explains that the term “stock” refers to:

a grouping of about the diversity and time depth of Indo-European, exhibiting
correspondences which are regular (though often not transparent to non-specialists),
substantial cognate vocabulary, and significant cognate paradigmaticity in grammar. The
time depth aimed at is over 5000 years (Indo-European is some 6000 years old). The
stock is the highest level reconstructible by the standard comparative method. The upper
limit of what I call a stock is represented by Afroasiatic, some 8000 years old.

NICHOLS, supra note 11, at 24–25 (1992); see also Bellwood, infra note 17, at 17, 22 (“[N]o
coherent language families can be convincingly demonstrated to have existed for much more
than 8000 to 10,000 years.”).

spoken by at least twenty million persons across geographic regions spanning
at least 150,000 square miles, 15 then there would have been no major
language families at all before the development of agriculture. Instead, there
would have been an extremely large number of much smaller language
families, no one of which would have made up a sizeable percentage of the
world’s population. The modern world, by contrast, contains eleven major
language families, whose native speakers together make up more than 95% of the world’s
population.16 When trying to reconstruct the prehistory of a major language
family like the Indo-European one, it will therefore help to understand how
the world’s major language families could have emerged from our earlier
linguistic situation in the first place.
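The definition of a "major language family" given above can be restated as a simple three-part test. The following is only an illustrative sketch: the names `LanguageFamily` and `is_major_language_family` are hypothetical, and the reconstructibility criterion is reduced to a yes/no flag that a historical linguist would have to supply. The Japonic figures come from note 15, with its reconstructibility assumed here because that note excludes it on area alone; the second family is an invented placeholder, not real data.

```python
from dataclasses import dataclass

# Thresholds taken from the definition in the text.
MIN_SPEAKERS = 20_000_000      # at least twenty million speakers
MIN_AREA_SQ_MILES = 150_000    # at least 150,000 square miles


@dataclass
class LanguageFamily:
    name: str
    native_speakers: int
    area_sq_miles: float
    # True if a common ancestor can be uncontroversially reconstructed by the
    # standard comparative method (roughly, within about 8,000 years).
    reconstructible: bool


def is_major_language_family(family: LanguageFamily) -> bool:
    """Apply the three criteria of the definition; all must hold."""
    return (
        family.reconstructible
        and family.native_speakers >= MIN_SPEAKERS
        and family.area_sq_miles >= MIN_AREA_SQ_MILES
    )


# Figures from note 15; reconstructible=True is an assumption of this sketch.
japonic = LanguageFamily("Japonic", 123_000_000, 145_882, True)

# Purely hypothetical family with illustrative numbers (not real data).
hypothetical = LanguageFamily("Hypothetical stock", 50_000_000, 400_000, True)

print(japonic.name, is_major_language_family(japonic))
print(hypothetical.name, is_major_language_family(hypothetical))
```

On these inputs Japonic fails the test only because its area falls below the 150,000 square mile threshold, which matches the reason given for excluding it in note 15.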
In my recent work, I have been developing a model of prehistoric
linguistic expansion that helps to answer this last question. The model builds
on some of the most important insights from Colin Renfrew’s well-known
“agricultural-expansionist” model (which asserts the importance of the spread
of agriculture, and resulting increases in population density, for the expansion
of major language families)17 and Johanna Nichols’s well-known “prestige-
based” model (which observes that major language families often spread from
certain centers of political and economic power through certain geographic
regions that she calls “linguistic spread zones”).18 Neither of these previous
models has proven capable of explaining all of the relevant linguistic data on
its own, however, and I have suggested that what is missing from both of
these models is a clearer recognition of the role that certain major river
systems appear to have played in allowing a handful of early agricultural
societies to become much more powerful, interconnected, populous and
expansive centers of social and linguistic coordination. These developments
have also tended to include periods of urbanization and the creation of robust
systems of urban networks.19 I call the model that develops these insights,
and combines them with the best features of the earlier two models, the
“riverine-agricultural model” of linguistic expansion—though it should
always be remembered that urbanization is an important part of the model as
well.
The riverine-agricultural model is still new, and so I want to begin by
introducing some of its basic features and providing it with some initial

                                                                                                               
15
I am therefore excluding Japonic, which has 123 million speakers, ETHNOLOGUE,
supra note 12, at 30 tbl.5, but is located primarily on the islands of Japan, which cover only
145,882 square miles. THE WORLD ALMANAC AND BOOK OF FACTS 2000, at 812 (1999).
16
Ethnologue suggests that the largest eleven language families—which include Indo-
European, Afro-Asiatic, Austronesian, Niger-Congo, Sino-Tibetan, Altaic, Dravidian, Tai-
Kadai, Nilo-Saharan, Uralic, and Austro-Asiatic—made up 95.34% of the native speakers in
the world in 2009. See ETHNOLOGUE, supra note 12, tbls. 4 & 5, at 27-32.
17
For an excellent collection of essays discussing this model of linguistic expansion, see
generally EXAMINING THE FARMING/LANGUAGE DISPERSAL HYPOTHESIS (Peter Bellwood &
Colin Renfrew eds., 2002); see also L. Luca Cavalli-Sforza, Genetic and Cultural Diversity in
Europe, 53 J. ANTHROPOLOGICAL RES. 383, 386 (1997); E. M. Wijsman & L. L. Cavalli-Sforza,
Migration and Genetic Population Structure with Special Reference to Humans, 15 ANN. REV.
ECOLOGY & SYSTEMATICS 279, 296 (1984).
18
See generally NICHOLS, supra note 11.
19
See PAUL BAIROCH, CITIES AND ECONOMIC DEVELOPMENT: FROM THE DAWN OF HISTORY
TO THE PRESENT 1–107 (1988).

theoretical motivation. I will then discuss the empirical evidence that
supports the model.
Even before the rise of agriculture, major river systems would have
exerted a powerful gravitational pull on many of our earliest hunter-gatherer
ancestors. Rivers are an important source of drinkable water, which is
obviously vital for sustaining human life. Drinkable water can also attract
many of the animals that our ancestors tended to hunt, and many existing
hunter-gatherers still use riverine areas (along with related phenomena like
lakes and watering holes) as prime hunting grounds.20 Rivers can, finally,
help sustain many of the types of vegetation that are useful for foraging. It
should therefore come as little surprise that—according to our best
archaeological evidence—riverine valleys have often attracted many of our
earliest human ancestors.21
With the first developments of agriculture, however, certain major river
systems would have begun to play a much more robust role in coordinating
increasingly complex forms of social life. As an initial matter, agricultural
communities can use rivers as important sources of water for crops—either by
planting crops in the relevant flood plains at the appropriate seasonal times, or
by diverting river water by means of irrigation (and sometimes even storing it
for use in drier seasons). River systems can thus support much larger
increases in agricultural productivity and predictability—thereby amplifying
the ordinary effects of agriculture on population density and social structure.
There is, moreover, a fact about the value of human language that is
especially important for the present analysis and that would certainly have
held true for the greater part of human prehistory: the value of being
competent in a language to any particular person would have been
frequency-dependent. By this, I mean that this value would have depended in large part
on the frequency with which that person was likely to encounter other people
with whom he or she either needed or wanted to speak but could only do so
with the particular language in question. When coupled with the low
population densities and nomadic lifestyles of most early hunter-gatherers, it
is this fact that ultimately explains why our earliest human ancestors did not
tend to produce any major language families prior to the Holocene.
If, however, agriculture on its own tends to support the production of
larger food surpluses and larger population densities, then the amplification of

                                                                                                               
20
See, e.g., ALAN BARNARD, HUNTERS AND HERDERS OF SOUTHERN AFRICA: A
COMPARATIVE ETHNOGRAPHY OF THE KHOISAN PEOPLES 43–44, 54–55, 224 (1992) (noting that
bands tend to orient themselves around various watering holes, when water is otherwise
difficult to obtain); id. at 102–03 (“The difference is that the annual cycle for many individuals
also includes travel to better-watered areas outside the Reserve.”); see also Colin Renfrew,
‘The Emerging Synthesis’: The Archaeogenetics of Farming/Language Dispersals and Other
Spread Zones, in EXAMINING THE FARMING/LANGUAGE DISPERSAL HYPOTHESIS, supra note 102,
at 3, 8–9 (discussing the contact-induced language shift in bands), at 11 (“At the same time it is
recognized that some coastal, riverine or lacustrine locations will in favourable circumstances
permit a much larger population density of fisher-hunter-gatherers than would otherwise be the
case, and that such areas have to be treated with particular attention to this factor.”).
21
See, e.g., BARNARD, supra note 6, at 43–44, 55, 102, 224; Brian M. Fagan, Peopling of
the Globe, in OXFORD COMPANION, at 330–31 (“[E]ach group [in Eurasia after the last glacial
maximum] centered on a river valley where game was most plentiful, and where plant foods
and fish could be found during the short summers.”); Renfrew, supra note 20, at 11.

these processes by the combination of agriculture around a major river system
should have created the preconditions needed for increased transitions toward
much denser and more coordinated exchange-based economies with increased
forms of urbanization. Before the invention of the wheeled wagon (which
begins to show up robustly in the archaeological record of the steppes around
3300 BC),22 and before the domestication of animals that could efficiently
haul heavy packages over long distances (such as the horse and the camel,
which were most likely domesticated around 4200 BC and somewhere
between 3500 BC and 2500 BC, respectively),23 the technologies needed to
engage in the long-distance transportation and exchange of bulk goods would
have been largely missing,24 and river systems would have provided one of
the very few avenues for the efficient transportation of bulk goods: by boat.
Major river systems should have therefore provided some of the very first
potential nerve centers for the development of increasingly complex societies
with much more robust forms of specialization and division of labor, much
larger and more interdependent populations, and much greater capacities for
political and economic growth.25 Given the frequency-dependent value of
language, these regions should have also tended to produce some of the first
and most important major linguistic phenomena to arise within our natural
history as a species. (By a “major linguistic phenomenon,” I mean to refer
here to a language family that is orders of magnitude larger than those that
characterized the greater part of our human prehistory and that is also
showing regular and predictable tendencies to develop into a contemporary
“major language family”—as that term was defined earlier. These linguistic
phenomena need not actually result in a major language family, however, to
qualify for this definition.)
Figure 1 presents a stylized depiction of the processes under discussion.
Notice that it contrasts three basic classes of phenomena. The first class comprises the
predicted effects of robust agricultural production around a major river system
on the production of major language families. (This first class of effects is
represented by the large and expanding socio-linguistic complex that is
centered around the major river in the top right hand quadrant of the diagram.
For reasons that will be discussed momentarily, this sociocultural complex
also includes certain linguistically related nomadic pastoralist groups, which
                                                                                                               
22
DAVID W. ANTHONY, THE HORSE, THE WHEEL AND LANGUAGE: HOW BRONZE-AGE
RIDERS FROM THE EURASIAN STEPPES SHAPED THE MODERN WORLD 200, at 65–75, 300–30
(2007).
23
Id. at 223 (noting that tribal herders probably rode horses before 4000 BCE); Sheila
Hamilton-Dyer, Domestication of the Camel, in OXFORD COMPANION, supra note 21, at 114,
114–15 (“The dromedary appears first to have been domesticated in the southern Arabian
Peninsula. Between 3000 and 2500 B.C., it is suggested that coastal peoples there switched
from hunting camels to herding them for their milk.”).
24
See, e.g., Andrew Sherratt, Use of Animals for Transportation, in OXFORD
COMPANION, supra note 21, at 382 (“In most areas such animal-drawn vehicles were important
only for short-distance transport, either for social purposes or . . . agricultural use. In the
absence of a well-maintained road network, . . . they were of little use for transporting goods
over long distances.”).
25
See BAIROCH, supra note 19, at 14 (“This explains why cities first emerged in fertile
regions and generally downstream in river basins, which permitted a reduction in transportation
costs.”); MASAHISA FUJITA, PAUL KRUGMAN AND ANTHONY J. VENABLES, THE SPATIAL
ECONOMY: CITIES, REGIONS AND INTERNATIONAL TRADE 117-132, 227-236 (MIT Press 2001)

are indicated in a band of grey. There is also an arrow that points from this
larger socio-cultural complex to the creation of a major language family—as
indicated in the bottom left hand corner of the diagram.) The second class
comprises the predicted effects of more ordinary agricultural production
(i.e., absent a major river system) on the production of linguistic expansions.
(These latter effects are represented by three striped plots of land, which are
much smaller than the riverine complex described above and are somewhat
disconnected from one another—thus allowing for the maintenance of
continued linguistic diversity among these groups.) Figure 1 then uses small
human icons to represent the third class of phenomena, which are a number of
remaining hunter-gatherer bands. For reasons already discussed, these bands
should have continued to display even more extreme forms of linguistic
diversity, unless and until they either converted to one of the larger language
families in the region or were displaced.

With the rise of agriculture in a particular region, the current model thus
predicts that major river systems would have begun to play a key role in
amplifying the ordinary effects of these new subsistence patterns on the
production of major language families. These dynamics would have not only
catalyzed some of the more familiar processes described by Renfrew’s

agriculturalist-expansionist model of linguistic expansion but they would have
also begun to produce some of the most important and rapidly expanding
ancient centers of political and economic prestige. As Nichols has suggested,
languages from important centers of political and economic prestige also tend
to spread through certain other regions, which she calls “linguistic
spread zones.”26 (Nichols has observed that the Eurasian steppes have
functioned as paradigmatic linguistic spread zones for all of recorded
history—which is a point that will become important in later discussions.)27
We should therefore expect that the major riverine languages under discussion
would have had some of these same tendencies to spread into any adjacent
linguistic spread zones.
Let me now focus attention on a further aspect of Figure 1, which will
become especially important in later discussions (when we try to understand
the most plausible linguistic relationship between the pastoralist Indo-Aryan
groups who are sometimes said to have written the Vedas and some of their
more settled agriculturalist neighbors). On the current model, one of the
reasons why major river systems would have contributed to early linguistic
expansions in special and unparalleled ways is the following: these
geographic topographies would have tended to produce a very specific
division of labor between certain sedentary groups (who would have tended
either to cultivate the land near the center of these major river systems or
engage in budding industrial and/or trade-related activities from ports along
these same river banks) and certain more nomadic, pastoralist groups (who
would have tended to breed and raise livestock and would have tended to live
further toward the edges of these expanding socio-cultural complexes). This
predicted division—which is a division among linguistically related groups
who exhibit distinct subsistence patterns—is depicted in Figure 1 by a grey
band of semi-nomadic pastoralist groups who are surrounding the more
settled populations at the center (near the major river system). This grey band
also has a number of arrows leading radially outward from the center, which
are meant to indicate the important role that these pastoralist groups play
in the rapid expansion of the major language family of which they are a part.28
The reason for this predicted division of labor is as follows. Many
herding animals—such as sheep, goats, and cattle—require large amounts of
land for grazing, and need to be periodically moved from locale to locale with
changes in the seasons and to prevent overgrazing.29 The fact that river water
would have been needed for crops would have nevertheless placed a premium
on riverfront land for harvesting, and thereby incentivized an emergent
geographical division between pastoralist and harvesting activities—with the
                                                                                                               
26
NICHOLS, supra note 11.
27
Id. at 275.
28
It should be noted that scholars of economic geography and urban development have
predicted similar divisions, based on the differential value and cost of transportation
of different classes of goods in relation to an urban center. See, e.g., FUJITA ET AL., THE
SPATIAL ECONOMY, supra note 25, at 16 & Figure 2.1.
29
See, e.g., Vladimir N. Basilov, Introduction, in NOMADS OF EURASIA 1, 1–5 (Vladimir N.
Basilov ed., Mary Fleming Zirin trans., 1989). “Nomadism had a marked seasonal character
because in many cases the pastures on which the livestock could live during the summer
months were not suitable for winter—and vice versa.” Id. at 1 (noting that the availability and
condition of pastures also affected the course of nomadic migrations).

harvesting segments of these societies located closer to the major river
systems at the center and the pastoralist segments being pushed further toward
the periphery. Because of the special importance of boats for bulk
transportation (especially prior to the domestication of pack animals), major
trading ports would have also typically been located near riverfronts.30 For
similar reasons, many of the industrial centers that produced important
commodities for trade would have needed to remain near these same
regions.31 Together, these facts would have therefore produced even further
pressures toward the type of division of labor under discussion—with
pastoralist groups separating from a range of more settled groups who
engaged in either agricultural or increasingly urban lifestyles.
The riverine-agricultural model of prehistoric linguistic expansion
predicts that these larger (and linguistically coordinated) social complexes
would have then begun to expand even further in terms of both population and
political and economic power. As this happened, some of the more nomadic
pastoralist groups at the edges would have often been pushed even further
toward an expanding periphery—both to accommodate the increasing
population densities and settled agricultural and urban activities at the center
and to make use of expanded pastures for grazing.32 These expanding groups
of pastoralists would have presumably brought their languages with them,
and—given the enormous prestige and economic power of the civilizations
with which they were connected (by a long history of trade and partial
common descent) at the center—these pastoralist groups would have tended
to maintain social, economic and linguistic ties to these important social
centers. As the very first ancient river civilizations grew in economic power
and prestige, these larger socio-cultural complexes would have therefore
tended to produce ever expanding social and linguistic phenomena and, along
with them, some of the very first major language families in the world.
Thus far, I have described the riverine-agricultural model of prehistoric
linguistic expansion in relatively simple and straightforward terms, but it
should go without saying that real life developments are often not so neat.
The archaeological record suggests, for example, that many of the earliest
riverine civilizations have often undergone several cycles of growth (with
concomitant developments toward increased social complexity, urbanization
and the consolidation of centralized political authority and hierarchy)
followed by intervening periods of decline (with concomitant developments
toward regionalization along with reduced forms of social complexity,
urbanism, and centralized political authority). During periods of expansion,
we should therefore expect some tendency of the more settled urban groups at
the center of these larger socio-cultural complexes to be able to exert
increased social and political power in relation to their more pastoralist and
tribally-based cousins. During periods of decline, we should similarly expect

                                                                                                               
30
FUJITA ET AL., THE SPATIAL ECONOMY, supra note 25, Ch. 13 (“Ports, Transportation
Hubs, and City Location”).
31
Id.
32
See, e.g., Basilov, supra note 29, at 5 (“Migratory herding was not humankind’s most ancient occupation.
As archaeological excavations have shown, it was preceded by a complex livestock-raising and
agricultural economy with a relatively sedentary way of life; only husbandry had a more
pastoral character.”).

to see some collapses in larger-scale forms of political authority and some
reductions in urbanism—which would tend to give these same pastoralist
groups increased political and military power relative to their heavily
weakened agriculturalist cousins. Quite often, many of these more realistic
dynamics have nevertheless involved shifting associations of people who have
spoken closely related dialects of the same language family.33 Hence, many
of these more realistic cycles can be understood as involving a more
continuous pattern of linguistic expansion.
Together, these discussions thus provide us with the basis for the
following theoretical predictions. During the early to mid parts of the
Holocene, major river systems, when combined with the development of
agriculture, should have played a key role in the production of especially large
and interdependent populations—along with the special cultural traditions
(including incipient legal traditions) that tend to make large scale social
complexity possible. These processes should have also involved some of the
first developments toward urbanism and the creation of robust urban networks
along these major river systems. Given the frequency-dependent value of
language, these special riverine topographies should have therefore—and
simultaneously—played a critical role in the earliest prehistoric expansions of
the world’s very first major language families. It should therefore come as
little surprise that two of the four largest language families in the world
(namely, Sino-Tibetan and Afro-Asiatic) originated from some of the earliest
riverine civilizations in the world (namely, from the ancient Chinese
civilizations that originally formed around the Yellow and Yangtze Rivers
and from the ancient Egyptian and Mesopotamian Civilizations, which
originally formed around the Nile, the Tigris and the Euphrates Rivers,
respectively). There is only one major language family that is larger than
both Sino-Tibetan and Afro-Asiatic, and it is the Indo-European language
family. One way to understand the central thesis of this article is to see it as
proposing a similarly intimate connection between this—the largest—major
language family and the only other comparable seat of ancient human
civilization—which, as it turns out, was located in the Indus Valley and was
the largest of them all.
So much, then, for theoretical motivation. The next point I want to make
is that this model can be empirically tested by comparing the geographic
distributions of the world’s major language families to the locations of the
world’s major river systems and checking for predicted correlations. In my
recent work, I have done just this, and what I have found provides strong
confirmation for the riverine-agricultural model of linguistic expansion.
In order to be systematic, I essentially began with a list of the thirty-six
longest rivers in the world.34 From these, I (originally) eliminated those that
flow either in or around Siberia or primarily through the Tibetan plateaus, on
the ground that these areas have typically proven particularly inhospitable to
agriculture.35 I also eliminated all rivers located in the New World (including
                                                                                                               
33
For a discussion of the many examples of this phenomenon in the world historical
record, see Section 3, infra.
34
See WORLD ALMANAC, supra note 15, at 466–67 (listing longest rivers in the world).
35
The following rivers were eliminated on this ground: the Yenisei-Angara, the Amur,
the Lena, the Brahmaputra (which runs largely through the Tibetan Plateaus before emptying

Australia) on two grounds. First, the present distributions of languages in
these areas have been so severely altered by recent colonialism (and by
modern dynamics of linguistic replacement that could not have occurred
before the recent industrial and colonial revolutions) that they tell us very
little about the prehistoric dynamics in these regions. Second, these same
colonial events have made it extremely difficult to reconstruct the relevant
pre-colonial patterns with enough confidence to shed sufficient light on the
current proposal.36 (I will nevertheless discuss both the New World and the
Siberian rivers further below.) This left me with the following relatively short
list of rivers (set forth in order of decreasing length): the Nile, the Yangtze,
the Ob, the Yellow, the Irtysh, the Congo, the Niger, the Mekong, the Volga,
the Salween, the Indus, the Danube, the Euphrates, the Zambezi, the Ganges,
and the Dnieper. To this list, I added another monsoon-based river, which we
now know to have flowed through the northwestern regions of the Indian
Subcontinent during the entirety of the life of the Indus Valley Civilization
but is no longer in existence.37 For reasons I will explain below, I associate
this monsoon-fed river with the Vedic “Sarasvati,” and I have therefore
labeled it as such. These major rivers and their locations are depicted in
Figure 2.
With regard to the major language families, I then started with the
eleven largest language families currently in existence. From this list, I
originally eliminated the Altaic language family (which includes Turkic and
Mongolian) on the ground that its primary expansions occurred not during
prehistoric times but rather in a series of waves between the fifth and fifteenth
centuries AD, at a time when very different methods of expansion (by
conquest and migration) had become available.38 (I will, however, return to

                                                                                                                                                                                                                                                                       
into the ocean near the Ganges delta area), and the Ural (which runs largely through the
western regions of Siberia). See WORLD ALMANAC, supra note 15, at 466 (noting longest rivers
in the world along with their locations and outflow). I should also note that the Oxus is listed
as just shorter than the Dnieper on some lists and just longer on others, whereas the Jaxartes
would have made this list in prior decades but has been shrinking so that it is presently listed as
just shorter than the Dnieper on most recent surveys.
36
The following rivers were eliminated on this ground: the Amazon, the Mississippi-
Missouri, the Rio de la Plata-Paraná, the Mackenzie, the Murray-Darling (in Australia), the
Madeira, the São Francisco, the Yukon, the Rio Grande, the Purus, the Tocantins-Pará, the
Saskatchewan, the Colorado, and the Arkansas. See WORLD ALMANAC, supra note 15, at 466–
67 (listing these American and Australian rivers as among the thirty-six longest).
37
Liviu Giosan et al., Fluvial Landscapes of the Harappan Civilization, PROCEEDINGS
OF THE NATIONAL ACADEMY OF SCIENCES, EARLY EDITION, available at
www.pnas.org/cgi/doi/10.1073/pnas.1112743109.
38
For a thorough description of the Turkic and Mongolian invasions that created these
expansions of Altaic languages, see generally RENÉ GROUSSET, THE EMPIRE OF THE STEPPES: A
HISTORY OF CENTRAL ASIA xxviii (Naomi Walford trans., 2002) (1970). Colin Renfrew has
suggested that “the creation of an entire spread zone through elite dominance is rare,” but he
mentions that “the most prominent case is that of the Altaic language family.” Renfrew, supra
note 20, at 7. It is also worth noting that, while Siberia is not an area that is particularly
hospitable to large-scale agricultural production, modern Altaic speakers in Siberia are still
clustered around the major riverine valleys of the region. See ETHNOLOGUE, supra note 12, at
824–25 (showing the “Map of Western Asian Russian Federation,” which shows a pattern of
Altaic speakers clustered primarily around the major rivers running through Siberia); see also
id. at 826–27 (showing the “Map of Eastern Asian Russian Federation,” which shows similar
pattern of Altaic speakers).

the Altaic language family a bit later on.) I was therefore left with the
following list of major language families: Indo-European, Sino-Tibetan, Afro-
Asiatic, Niger-Congo, Austronesian, Dravidian, Austro-Asiatic, Tai-Kadai,
Uralic, and Nilo-Saharan.

Next, I engaged in a comprehensive comparison of these two data sets,
and what I found is quite striking. If we bracket the Indo-European language
family for a moment, then every one of the major language families from this
list can be correlated with either a major river system from our list or the
functional analogue of a major river system.39 Moreover, every one of our
major river systems—with only two exceptions, which will be discussed
momentarily—can be associated with the early expansions of one of the major
language families from our list. I have collected a series of maps that
establish these correlations in Appendix A.
The only two exceptions are the Danube-Dnieper river system and the
Indus-Sarasvati-Ganges river system. If, however, we bring the Indo-
European language family back into the equation, then we will see that these
                                                                                                               
39
I count two topographies as functional analogues of major river systems: either (1) long
chains of tropical islands, which allow for robust agricultural activity and are separated by
relatively short expanses of sea that therefore function like a major river system (as in the case
of Oceania); or (2) tropical regions that support robust agricultural activity and are dominated
by an interconnected chain of large rivers that do not quite reach the length of major rivers on
their own but easily do so together (as in the case of Southern India).

last two remaining major river systems from our list have, in fact, played
critical roles in the early expansions of this one remaining major language
family. In particular, we know that, from at least approximately 1500 BC
to 300 BC, the Danube played an important role in some of the most
dramatic recorded expansions of the Celtic branch of the Indo-European
language family, and that, in the period leading up to about the 4th century AD
(and then continuing through at least the 7th century AD), the Dnieper
similarly played an important role in some of the most dramatic early
expansions of the Slavic branch. Since at least about 1500 BC, it is,
moreover, uncontroversial that the Indus-Ganges river system played an
important role in the expansion of the Indo-Iranian branch of this family,
whereas the Sarasvati was no longer a prominent river during this time period.
The correlations between our two data sets are therefore perfect: every
major language family from our list can be associated with early expansions
around at least one of the major river systems from our list or a functional
analogue, and every major river system from our list has apparently played an
important role in the early expansions of one of these major language
families. These facts provide strong empirical evidence for the riverine-
agricultural model of prehistoric linguistic expansion, and Figure 3 illustrates
the relevant correlations.
I would now like to extend my prior work to include the Altaic language
family, which is the one major language family that I originally excluded
from the analysis. As it turns out, this language family appears to have first
expanded around the handful of major Siberian rivers that I originally
excluded from the analysis as well due to their compromised capacities to
sustain agriculture: the Yenisei, the Lena, the Amur and the Vilyuy rivers.40
Although I believe there were plausible reasons for these two exclusions,
reintroducing these phenomena would therefore yield yet another correlation
that supports the riverine-agricultural model of prehistoric linguistic
expansion. (It should be noted that there is nothing about the riverine-
agricultural model of prehistoric linguistic expansion that would prevent other
dynamics from becoming more important for linguistic expansions in later
historical periods—as clearly occurred with the Altaic language family.)
In the next section, I will build upon the riverine-agricultural model of
linguistic expansion to argue that the Indus Valley Civilization almost
certainly spoke dialects of Proto-Indo-European. Before turning to that
argument, I would, however, like to make an important observation about the
Indo-European language family itself: it is the only major language family in
Figure 3 that is split between two major Old World river systems that are
separated by immense geographic distances (i.e., of over 2,000 miles). To get
from the Indus Valley to the lower Danube and Dnieper rivers (or the
other way around) during the relevant periods of human prehistory, one would
have had to take one of two basic routes. Either one would have had to take a
                                                                                                               
40
See, e.g., ETHNOLOGUE, supra note 12, at 826-27 (Map of Eastern Asian Russian
Federation, showing predominance of Altaic languages around the Lena and Vilyuy rivers,
along with a number of other smaller rivers in the region); id. at 824-25 (Map of Western Asian
Russian Federation, showing predominance of Altaic languages in the northeastern regions of
Siberia beginning just east of the Yenisei, and in regions surrounding the Kotuy, Kheta,
Nizhnyaya Tunguska, Olenek, Markha, Anga, Aldan, Yana, and Indigirka rivers).

northern route, which proceeds through those portions of the Eurasian Steppes
that directly connect ancient Bactria to the relevant parts of eastern Europe.
Or one would have needed to take a more southern route, which connects the
eastern parts of Iran to the Balkans by way of western and central Iran and
Anatolia. Hence, it is by one or the other (or both) of these routes that Indo-
European languages must have spread between these two very distant riverine
regions.

We also know that both riverine regions began to play important roles in
the early expansions of some branches of the Indo-European language family,
and the riverine-agricultural model of expansion predicts that one of these two
regions would have almost certainly played a key role in the earliest
expansions of Proto-Indo-European. Before turning to our affirmative
arguments, we might therefore pause for a moment to ask which of these two
major riverine topographies would have more plausibly played the earlier role
of the two in helping to produce the largest major language family in the
world.

The archaeological record suggests that in the period leading up to 3500
BC, early agricultural settlements had begun to develop in both the Indus-
Sarasvati and the Danube-Dnieper regions.41 Both sets of settlements were also
beginning to show some signs of increased social complexity,42 and we should
therefore expect that some linguistic expansions were beginning to take place
in both regions. By 3500 BC, however, these two regions had diverged quite
sharply. The groups in the Indus Valley had just undergone about a
millennium of continuous and unbroken development and were beginning to
enter into their first phase of incipient urbanism.43 These developments would
then continue for another millennium before leading (also continuously) into
the mature urban phase of the Indus Valley Civilization. At this point, the
Indus Valley Civilization began to transform into one of the very first and
only large-scale urban civilizations ever to emerge within our natural history
as a species, and it was by far one of the most extensive sociocultural
phenomena anywhere in the world.44 Its mature period would then last from
roughly 2600 BC until roughly 1900 BC,45 and there are even some signs of
this Civilization’s material culture lasting as late as 1000 BC in some parts of
the Indian subcontinent.46
Beginning in around 2300 BC, the archaeological record also shows the
emergence of another complex urban (or at least proto-urban) civilization in
ancient Bactria: the Bactria-Margiana Archaeological Complex (or BMAC
Civilization).47 This civilization was centered along the Oxus river (which is
another major river in the region that begins very near the upper portions of
the Indus River system), and there is extensive evidence of close links
between the BMAC and Indus Valley Civilizations.48 The riverine-
agricultural model of linguistic expansion thus predicts that, by 1900 BC,
the languages of the Indus Valley Civilization would have been part of an
extraordinarily large and continuous set of linguistic developments with close
links to ancient Bactria and a very ancient history in the larger region. These
linguistic developments would have also represented one of the two most
important and expansive linguistic developments ever to have arisen within
our natural history as a species (the other being the Afro-Asiatic languages
                                                                                                               
41
See ALLCHIN & ALLCHIN, supra note 3, at 113-83 (describing early urban period of the
Indus Valley Civilization as beginning around 4500 BC); David W. Anthony, The Rise and
Fall of Old Europe, in THE LOST WORLD OF OLD EUROPE: THE DANUBE VALLEY, 5000–3500
BC, at 29, 35–54 (David W. Anthony & Jennifer Y. Chi eds., 2010) (describing early
developments of Old Europe).
42
Id.
43
These descriptions of the Indus Valley Civilization’s archaeological record are based on
ALLCHIN & ALLCHIN, supra note 3, at 113–83 (describing three periods: a period of agricultural
expansion into the Indus Valley beginning in 4500 BC; a period of Early Harappan Incipient
Urbanism (3500 BC–2600 BC); and the mature Harappan period (2600 BC–1900 BC)). See also id.
at 183–205 (describing the people and culture of the Harappan period).
44
ALLCHIN & ALLCHIN, supra note 3, at 153–54; JANE R. MCINTOSH, THE ANCIENT INDUS
VALLEY: NEW PERSPECTIVES 4 (2008) (“In the third millennium [the Harappan] civilization
flourished over an area far larger than those of its contemporaries in Mesopotamia and
Egypt.”).
45
ALLCHIN & ALLCHIN, supra note 3, at 153–82 (describing the Mature Harappan Period).
46
Giosan et al., supra note 37.
47
GREGORY POSSEHL, THE INDUS CIVILIZATION: A CONTEMPORARY PERSPECTIVE 231-32
(Oxford University Press, 2002) (describing the BMAC Civilization).
48
Id.

that had been emerging from ancient Egypt and Mesopotamia, since Chinese
urban civilization began only a bit later).
The contemporaneous developments around the Danube and Dnieper
rivers, on the other hand, provide a stark contrast. In these regions, there is
extensive archaeological evidence to suggest that—whatever their earlier
history—the earliest complex settlements in these regions (which are often
referred to as “Old Europe”) were all destroyed and burned to the ground in a
rather spectacular fashion by about 3500 BC.49 Several centuries then passed
before there is evidence of any substantial resettlement in these regions.50
Beginning in about 3300 BC, certain pastoralist groups from the steppes
began to resettle in these regions (in relatively small numbers at first) and
move up the Danube River, where they eventually formed the basis for the
Celtic-speaking branch of Indo-European.51 It would nevertheless be at least
another two millennia (and arguably four, i.e., until after the end of the
European Dark Ages) before these regions would begin to exhibit anything
like the level of social complexity and urbanism exhibited by the Indus Valley
Civilization.52
Sometime after 3300 BC, other nomadic groups from the steppes then
began to settle along the Dnieper, where they formed the basis for the Balto-

                                                                                                               
49
For a good discussion of the developments of this culture, and the ensuing
archaeological record of its destruction, see David W. Anthony, The Rise and Fall of Old
Europe, in THE LOST WORLD OF OLD EUROPE: THE DANUBE VALLEY, 5000–3500 BC, at 29, 35–
54 (David W. Anthony & Jennifer Y. Chi eds., 2010).
50
Id.
51
Id.
52
See BAROCH, supra note 19, at 86-89 (suggesting that the roots of urbanization had
spread from Rome into the interior of Western Europe only in about 400 BC).

Slavic branch of the Indo-European family.53 Once again, however, this
region did not show anything like the kind of emergent social complexity that
we see in the Sindhu-Sarasvati Valley until at least the rise of the Kiev
Empire in the 9th century AD.54 In the period leading up to about 1900 BC,
there would have thus been no real comparison between these two riverine
regions. The Indus-Sarasvati river system should have clearly been producing
a much more important and monumental linguistic phenomenon—and one
with a much greater likelihood to emerge as the largest language family in the
world during the historical period. Figure 3-B depicts these comparisons
between these two major riverine topographies.
Figure 3-B still remains officially agnostic as to where to place the
earliest expansions of Proto-Indo-European dialects. It nevertheless identifies
the major languages of the Indus Valley Civilization as preceding the well-
attested Indo-Iranian dialects of northern India before about 1500 BC, and it
identifies the more minor languages of Old Europe (which are most plausibly
extinct now) as preceding the Celtic and then Slavic languages that eventually
began to dominate along the Danube and Dnieper rivers. So what should we
make of all this? If we had to choose a plausible riverine source for the
earliest expansions of the largest language family in the world from among
these two major river systems (and the riverine-agricultural model of
linguistic expansion suggests that we probably must), then I think the
contrasting archaeological records from these two regions suggest that the
Indus Valley Civilization is the more plausible candidate. If, moreover, the
Indus Valley Civilization spoke dialects of Proto-Indo-European, then the
contrasting archaeological records from these two regions might help explain
why there are currently more than six times as many native Indo-Aryan
speakers on the Indian subcontinent alone as there are native Celtic and
Slavic speakers combined anywhere in the world.55 (This contrast will remain
if we compare Indo-Iranian speakers to Balto-Slavic and Celtic speakers as
well—as shown in Figure 3-B).56
If, on the other hand, one were to propose that the Indo-European
languages first began to expand around the Danube-Dnieper river system and
were only later introduced to the Indian subcontinent beginning in around
1500 BC, then we should probably wonder why this single and final branch of
the Indo-European family would have produced 310 different languages with
approximately 1.7 billion native speakers in such a short time, when the Celtic
and Balto-Slavic branches combined were only able to produce 28 different

                                                                                                               
53
ANTHONY, supra note 22, at 367-68 (identifying the Corded Ware Horizon, which
begins in about 2900 B.C., as the “archaeological manifestation of the cultures that introduced
the northern Indo-European languages to Europe: Germanic, Baltic, and Slavic”).
54
See BAROCH, supra note 19, at 89 (“[I]n Russia, . . . the onset of urbanization only took
place after the eighth century A.D. One should note that for Kiev, which can be considered as
the first Russian town, recent research indicates an earlier date of birth. Kiev emerged as a city
somewhere between the fifth and sixth centuries.”); id. at 89-91 (discussing the more general
late urbanization of non-Romanized Europe).
55
These figures were calculated by adding up the number of native speakers listed in
Ethnologue in each of the relevant countries where these language families exist. See
ETHNOLOGUE, supra note 12.
56
Id.

languages and about 280 million native speakers.57 (It should be remembered
that on the more traditional view, the Celtic and Slavic branches would have
also separated and begun to expand around their respective riverine regions
much earlier than the Indo-Iranian branch within the Indian subcontinent—
with the Celtic groups beginning in around 3300 BC, and the Indo-Aryan ones
only around 1500 BC.) I would not, however, want to rest too heavily on
plausibility considerations like these, and so the next section will begin to
develop a much stronger and more affirmative set of arguments for these
claims.
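
The arithmetic behind this comparison can be checked directly. The short Python sketch below takes only the rounded figures quoted above (310 languages and roughly 1.7 billion native speakers for the Indo-Iranian branch; 28 languages and roughly 280 million native speakers for the Celtic and Balto-Slavic branches combined) and computes the two ratios. The variable names are purely illustrative, and the inputs are the rounded totals stated in the text rather than fresh counts taken from Ethnologue.

```python
# Rounded totals as quoted in the text (assumed here as inputs, not re-derived).
indo_iranian = {"languages": 310, "speakers": 1_700_000_000}
celtic_balto_slavic = {"languages": 28, "speakers": 280_000_000}

# Ratio of native speakers: Indo-Iranian vs. Celtic and Balto-Slavic combined.
speaker_ratio = indo_iranian["speakers"] / celtic_balto_slavic["speakers"]

# Ratio of the number of distinct languages in each grouping.
language_ratio = indo_iranian["languages"] / celtic_balto_slavic["languages"]

print(f"speaker ratio:  {speaker_ratio:.2f}")
print(f"language ratio: {language_ratio:.2f}")
```

On these rounded inputs, the speaker ratio comes out at a little over six and the ratio of language counts at roughly eleven, which is in line with the roughly sixfold contrast in speakers described above.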

2. The Affirmative Argument for a Proto-Indo-European Speaking
Indus Valley Civilization

In this section, I will now present a more affirmative argument, based
in the riverine-agricultural model of linguistic expansion, for the claim that
the Indus Valley Civilization spoke dialects of Proto-Indo-European. As
already noted, the archaeological record establishes that in the period leading
up to about 1900 BC, the Indus Valley region produced one of the very first
and most significant large-scale civilizations ever to have existed within our
natural history as a species.58 This civilization developed around one of the
most important major river systems of its time.59 Hence, the riverine-
agricultural model of linguistic expansion predicts that the languages spoken
by the Indus Valley Civilization would have been part of one of the most
monumental and influential linguistic phenomena ever to have existed within
our natural history as a species.
Notice that this first prediction is strictly one about the relative size and
importance of the language family in question, and not yet a claim about what
language family it was. So construed, this first prediction is further supported
by the fact that the Indus river, which is a glacier-fed Himalayan river, is itself
fed by five other important glacier-fed rivers that also emerge from the
Himalayas: the Jhelum, the Chenab, the Ravi, the Beas, and the Sutlej. At one
time, there was apparently another glacier-fed river that flowed through the
now dry Ghaggar-Hakra river bed (between the Sutlej and the Yamuna) but
this river was diverted from its course by tectonic shifts long before the Indus
Valley Civilization arose.60 A recent 2012 study has nevertheless established

                                                                                                               
57
These figures were calculated by counting the individual languages that the web-based
version of Ethnologue lists as parts of these different branches of the Indo-European language
family, and then adding up the number of native speakers that Ethnologue associates with each
of these languages. See Ethnologue, Web Version, Language Family Trees, Indo-European, at
http://www.ethnologue.com/show_family.asp?subid=2-16.
58
See ENCYCLOPEDIA OF DESERTS 293 (Michael A. Mares ed., 1999) (“The Indus and
Punjnad River valleys are fertile valleys within an arid region. The Indus Valley served as the
site of one of the earliest human civilizations, the Indus Civilization, or Harappan . . . .”); cf.
GLYN DANIEL, THE FIRST CIVILIZATIONS: THE ARCHAEOLOGY OF THEIR ORIGINS 93–118 (1968)
(discussing the history and rise of civilization in the Indus Valley).
59
See WORLD ALMANAC, supra note 15, at 466–67 (listing major rivers of the world and
indicating that the Indus River is tied for twenty-fourth longest river in the world).
60
Giosan et al., supra note 37.

that—during the entire period of the Indus Valley Civilization—there was a
distinct and important monsoon-fed river that flowed just to the east of (and
roughly parallel to) its six glacier-fed sisters to the west.61 This river also
flowed through the upper Ghaggar-Hakra riverbed, but it was less prone to
disruptive flooding than its six glacier-fed sisters.62 It therefore provided a
particularly reliable basis for robust agricultural production during this period,
and the greatest density of Indus Valley Civilization sites has been found
along this seventh river.63 Figure 4 below depicts these seven rivers, and also
identifies, in dark grey, the locations where the greatest density of Indus
Valley sites has been found.
Because this seventh river corresponds precisely to the detailed and
comprehensive description of the Vedic “Sarasvati” that Ashok Aklujkar has
recently derived (independently) from the ancient Vedic texts,64 I believe we
should associate this seventh river with the Vedic “Sarasvati.” The
archaeological record suggests that this river lay at the epicenter of the Indus
Valley Civilization and was as central to it as it was to early Vedic culture.65
Together, these seven major rivers therefore plausibly constitute the famous
“Sapta Sindhu” of the Vedic texts as well.66 In any event, the confluence of
these seven major rivers lends further support to my first prediction, which is
that the language or languages of the Indus Valley Civilization would have
reflected a linguistic phenomenon of extraordinary proportions for its time.
Although we are primarily concerned with the Indus Valley
Civilization in this article, Figure 4 also shows that this region is situated very
near four other major rivers. The first two—the Oxus (or Amu Darya) and the
Jaxartes (or Syr Darya)—flow from the Hindu Kush and Tien Shan
mountains, respectively, toward the Aral Sea.67 As noted before, we know
that—from approximately 2300 BC until approximately 1700 BC—the
BMAC Civilization emerged in ancient Bactria, and these developments are
therefore also depicted in Figure 4. The third river, the Helmand, flows
from the Hindu Kush into eastern Iran, and the fourth, the Murghab, flows
just a bit to the northwest of the Helmand. Especially at their upper regions,
these four additional rivers come very close to the upper parts of the Indus,
                                                                                                               
61
Id.
62
Id.
63
Id.
64
See Ashok Aklujkar, Sarasvati Drowned: Rescuing Her from Scholarly Whirlpools, in
THE SINDHU-SARASVATI CIVILIZATION: NEW PERSPECTIVES (2012).
65
Giosan et al., supra note 37.
66
See PARMESHWARANAND (SWAMI), ENCYCLOPAEDIC DICTIONARY OF VEDIC TERMS vol.
1, at 584-85 (2000) (describing the use of the phrase “sapta sindhu” in the Vedas, and
suggesting that “modern interpreters . . . agree as far as six rivers viz. the five rivers of the
Punjab . . . and the Indus are concerned but differ on the seventh.”). I am suggesting that this
monsoon-based river plausibly constituted the seventh.
67
During much of our prehistory, these two rivers would have been long enough to make
our list of the top thirty-six, but the Jaxartes has been shrinking over the last few centuries, due
to aggressive modern irrigation practices, and it no longer consistently flows through the entire
river bank illustrated in Figure 4. The Oxus, on the other hand, is still long enough to just make
some lists, depending on how one measures the length of the relevant rivers. There is
obviously some arbitrariness to the ways in which river lengths are measured, especially when
they have a number of tributaries, but there is no doubt that the Oxus has been a very important
river in these regions during all of the relevant time periods.

and there is extensive evidence of interconnections between the many groups
who have settled along these four river systems and those in the Indus
Valley.68

Given these facts, the riverine-agricultural model of linguistic expansion
predicts that this larger region (which I have called the “Eastern-Iran-
Bactria-Indus-Valley” region) would have most plausibly been generating a
highly coordinated set of dialects of a single language family prior to about
1900 BC. This language family would have also been one of the most
expansive and important linguistic phenomena anywhere in the world by far.
Even if this language family did not extend quite as far as Bactria and Eastern

                                                                                                               
68
GREGORY POSSEHL, THE INDUS CIVILIZATION: A CONTEMPORARY PERSPECTIVE 231-32
(Oxford University Press, 2002) (describing archaeological evidence of interactions between
the BMAC Civilization and the Indus Valley Civilization); ALLCHIN & ALLCHIN, supra note 3,
at 137 (describing long-term relations between regions like Mehrgarh and the Indus Valley
Civilization).

Iran, it is clear that a major language family would have been emerging from
the Indus Valley region itself.
Turning to logic, this language family was either Proto-Indo-European
or it was not. Let us assume the second possibility first, to see where it leads
us. If the language family in question was something other than Proto-Indo-
European, then we must assume that Indo-European speaking groups
displaced this pre-existing language family sometime after the demise of the
Indus Valley Civilization—viz., sometime after 1900 BC. This is, in fact, the
standard assumption.69 Given the arguments developed in this article,
however, this view could only be true if a relatively small group of Indo-
European nomads were capable of displacing a linguistic phenomenon with
incredible temporal and geographic reach. Although the historical record
provides us with a number of known examples of nomadic groups invading
and conquering indigenous agricultural populations, they have never done so
while completely eradicating a linguistic phenomenon of this scope.70 Indeed,

                                                                                                               
69
See, e.g., THE ARYAN DEBATE at xiii (Thomas R. Trautmann ed., 2005) (“The first
position, the immigrant Aryan position that the Aryans came to India from outside in about
1500 BC, I will call the standard view because it is the interpretation that has prevailed in
school and university history textbooks and in academic journals and books.”); J.P. MALLORY
& D.Q. ADAMS, THE OXFORD INTRODUCTION TO PROTO-INDO-EUROPEAN AND THE PROTO-INDO-
EUROPEAN WORLD 443 (2006).
70
The most successful and notorious set of invasions of agricultural populations by
nomadic groups are those that were led by Altaic-speaking groups, which include the Turks, the
Huns, and the Mongols. See GROUSSET, supra note 38, at vii–xi, xxiii–xxiv (“It is certain,
however, that from the beginning of the Christian era the flow was from east to west. It was no
longer the Indo-European dialects that prevailed—‘East Iranian,’ Kuchean, or Tokharian—in
the oases of the future Chinese Turkestan; it was rather the Hsiung-nu who, under the name of
Huns, came to establish a proto-Turkic empire in southern Russia and in Hungary. (The
Hungarian steppe is a continuation of the Russian steppe, as the Russian steppe is of the Asian.)
After the Huns came the Avars, a Mongol horde which had fled from Central Asia under
pressure from the T’u-chueh in the sixth century, and which was to dominate the same regions,
first Russia and later Hungary. In the seventh century came the Khazar Turks, in the eleventh
the Petcheneg Turks, and in the twelfth the Cuman Turks, all following the same trail. Lastly,
in the thirteenth century, the Mongols of Jenghiz Khan integrated the steppe, so to speak, and
became the steppe incarnate, from Peking to Kiev.”).
Even in those areas where these conquering groups were successful, however—which
include many regions of modern-day Turkey, the Middle East, Pakistan, Afghanistan,
Northwestern India, and parts of Eastern Europe—the pre-existing (non-Altaic) language
families still persist as dominant languages in the regions. See ETHNOLOGUE, supra note 12, at
774 (showing Afghanistan as dominated by Iranian and some Indo-Aryan dialects, with only a
small pocket of Altaic dialects in the extreme northern regions); id. at 785 (showing Iraq as
dominated by Arabic languages, along with one Indo-European language (Kurdish)); id. at
816–17 (showing Pakistan as dominated by Indo-Aryan and Iranian languages, with no
significant pockets of Altaic-speaking communities); id. at 824–27 (showing Western Asian
Russian Federation states as containing pockets of Indo-European speaking groups, especially
in the western and southern regions); id. at 848–49 (showing European Russian Federation
States as dominated by Indo-European speakers, with only small pockets of Altaic-speaking
groups); id. at 832 (showing Turkmenistan and Uzbekistan as containing pockets of Indo-
European speakers); id. at 805 (showing Tajikistan as dominated by Indo-European languages,
with only small pockets of Altaic speaking groups); id. at 847 (showing Ukraine as dominated
by Slavic language speakers); id. at 533–35 (listing languages spoken in Turkey, which include
Turkish (an Altaic language) as the national language, but also a number of remaining pockets
of Arabic and Indo-European speaking groups); id. at 452–57 (listing languages spoken in Iran,
which are predominantly Indo-European). See infra Figure 5 (showing Northwestern India as
dominated by Indo-European languages).

prior to the age of modern colonialism, the more typical pattern is that the pre-
existing languages have persisted as the dominant languages of the people in
these conquered regions.71
To get a rough sense of just how difficult it would be to displace a major
language family in these particular regions, one might consider the many
difficulties that the U.S. military—with all of its advanced military and
technological equipment—has faced in even penetrating some of the
mountainous regions of Afghanistan.72 One should also remember that
various nomadic Altaic groups (such as the Turks and Mongols) have invaded
and conquered some of these regions during historical times (and with
technology far superior to what would have been available in 1500 BC), and yet
Altaic languages have only replaced pre-existing Indo-European dialects in a
few highly isolated parts of the Indian subcontinent.73 Indeed, Indo-European
(and, more specifically, Indo-Iranian) dialects still predominate throughout
the Eastern-Iran-Bactria-Indus-Valley region, despite these regions having
been conquered several times by Altaic or other external forces.74 Hence, I
submit that a minimal criterion for the plausibility of any theory that posits a
non-Proto-Indo-European language family for the Indus Valley Civilization is
that some significant pockets of this language family should still remain in the
northwestern regions of the Indian subcontinent. I will call this the
“significant remaining pocket” criterion (or “significant pocket” criterion for
short).
Let us assume that the significant pocket criterion is valid for a moment,
and see what it would mean for the language (or languages) of the Indus
Valley Civilization. In the next section, I will then return to the significant
pocket criterion to address any lingering concerns about its validity.

                                                                                                               
71
See GROUSSET, supra note 38, at xxix (“But there was another, opposing law, which
brought about the slow absorption of the nomad invaders by ancient civilized lands. This
phenomenon was twofold in character. First, there was the demographic aspect. Established as
a widely dispersed aristocracy, the barbarian horsemen became submerged in these dense
populations, these immemorial anthills. Second, there was the cultural aspect. The
civilizations of China and Persia, though conquered, in turn vanquished their wild and savage
victors, intoxicating them, lulling them to sleep, and annihilating them. Often, only fifty years
after a conquest, life went on as if nothing had happened. The Sinicized or Iranized barbarian
was the first to stand guard over civilization against fresh onslaughts from barbarian lands.”).
72
See, e.g., C.J. Chivers, Vantage Point: The Challenges of Small-Unit Patrolling in
Afghanistan, N.Y. TIMES (Jan. 13, 2011, 9:47 AM),
http://atwar.blogs.nytimes.com/2011/01/13/vantage-point-the-challenges-of-small-unit-
patrolling-in-afghanistan (“Whatever one thinks of counterinsurgency theory, not many people
would dispute that home terrain favors a local insurgent force.”). Cf. Robert D. Kaplan,
Actually, It’s Mountains, FOREIGN POL’Y, July/Aug. 2010, at 105 (discussing the relationship
between the presence of geographic barriers and failed states).
73
See supra note 324.
74
See ETHNOLOGUE, supra note 12, at 365–407 (listing languages spoken in India);
THE ATLAS OF LANGUAGES at 59 (Bernard Comrie et al. eds., rev. ed. 2010) (displaying map
that notes languages spoken in India); see also EDWIN BRYANT, THE QUEST FOR THE ORIGINS OF
VEDIC CULTURE: THE INDO-ARYAN MIGRATION DEBATE at 237 (2001) (“India was invaded nine
times in one millennium by Achemenides, Macedonians, Bactrians, Greeks, Sākas, Kusāns,
Sassanides, Yuezi, and Hephtalite Huns . . . [and also by the] Turks, Mongols, Afghans,
Portuguese, French, and British . . . [yet] none of these groups eradicated the preexisting
languages on the subcontinent as the Indo-Aryans are assumed to have done.”).

a. Ruling Out All Language Families Except Dravidian, Munda,
and Indo-European

The significant pocket criterion can be used to eliminate a wide range
of possible language families from the start. For example, we can eliminate,
with a very high degree of confidence, any contention that the Indus Valley
Civilization spoke a language that is part of some unknown or extinct
language family.75 This group includes all of the extinct ancient isolates from
neighboring regions, such as Sumerian (which was once spoken in lower
Mesopotamia) and Elamite (which was once spoken in the western regions of
modern day Iran along the eastern shores of the Persian Gulf).76 We can also
eliminate all of the linguistic substrates that have been identified in the
ancient Vedic texts but that no longer survive as spoken languages in their
own right. These would include Masica’s “language X,”77 as well as Michael
Witzel’s proposal that the Indus Valley Civilization spoke an extinct language
that he has sometimes called “Para-Munda” (due to its many resemblances
with Proto-Munda).78 The significant pocket criterion can, finally, be used to
eliminate a number of more minor linguistic phenomena—such as Burushaski
and Tibeto-Burmese—which are still found in or near the relevant parts of the
Indian subcontinent but not in significant enough numbers to meet our
criterion.
As a first cut, we must therefore focus on the small handful of language
families that show up in significant numbers somewhere within the Indian
subcontinent. Apart from the Indo-European language family (which we are
bracketing for the time being by assumption), this leaves only two real
possibilities: Dravidian, which currently dominates in South India, and
Munda, which is part of the western branch of the Austro-Asiatic language
family and appears primarily in some northeastern parts of India. Let us
consider each of these possibilities in turn. For ease of reference, Figure 5
depicts the Indian subcontinent with its contemporary patterns of linguistic
diversity and some of its most relevant geographic features.

                                                                                                               
75
ALLCHIN & ALLCHIN, supra note 3, at 185 (“We can leave aside postulation of an
unknown language, since there is no reason to believe that so major a language would have
disappeared without trace.”).
76
See id. at 184–85 (noting a second reason to eliminate Sumerian and Elamite in
particular: the speakers of both of these languages had developed scripts prior to the Harappan
script, and it therefore “seems reasonable to assume that had any one of these been introduced
to the Indus they would have brought with them their own already existing scripts.”).
77
See Colin P. Masica, Aryan and Non-Aryan Elements in North Indian Agriculture, in
ARYAN AND NON-ARYAN IN INDIA 55-151 (M. M. Deshpande & P. E. Hook eds., 1979)
(finding evidence of a non-Aryan substrate language in Sanskrit).
78
Michael Witzel, Central Asian Roots and Acculturation in South Asia: Linguistic and
Archaeological Evidence from Western Central Asia, the Hindukush and Northwestern South
Asia for Early Indo-Aryan Languages and Religion, in LINGUISTICS, ARCHAEOLOGY AND THE
HUMAN PAST 175-180 (ed. OSADA TOSHIKI 2005).

b. Ruling Out the Dravidian Hypothesis

Within the secondary literature, the Dravidian hypothesis (or the
hypothesis that the Indus Valley Civilization spoke dialects of proto-
Dravidian) has for many years “been the most frequently and strongly
supported hypothesis since its adoption by Marshall . . . and Hunter” in the
1930s.79 On its face, the Dravidian hypothesis might also seem to have quite
a bit of plausibility to it. As Figure 5 shows, Dravidian speakers—which are
indicated in light grey—currently dominate a large portion of the Indian
subcontinent, which is almost adjacent to the Indus Valley region, but
extends primarily through the southern end of the Indian Peninsula. Because
these Dravidian speakers are located in the South, they cannot on their own
render the Dravidian hypothesis consistent with the significant pocket
                                                                                                               
79
Id. at 185; see generally G. R. HUNTER, THE SCRIPT OF THE HARAPPA AND MOHENJO
DARO AND ITS CONNECTION WITH OTHER SCRIPTS (1934); SIR JOHN MARSHALL, MOHENJO-DARO
AND THE INDUS CIVILIZATION (photo. reprint 1996) (1931).

criterion. One might nevertheless imagine this pattern to be the plausible
result of Indo-European invasions from the northwest, which would have
forced any indigenous Dravidian populations to migrate to the southeast
beginning in around 1500 BC.
There is, moreover, an important pocket of Dravidian speakers who
do live in the northwestern regions of the Indian subcontinent. These are the
Brahui, who live in modern-day Baluchistan, and who are shown in Figure 5
as a small pocket of light grey near the top left corner of the diagram. At first
glance, the existence of the Brahui might therefore seem to render the
Dravidian hypothesis consistent with the significant pocket criterion.80
Upon closer examination, however, the assumptions needed to render
the Dravidian hypothesis plausible are not at all tenable. As shown in Figure
5, the northwestern parts of the Indian subcontinent are separated from the
southern peninsula by significant geographical barriers, which include the
Thar Desert, the Vindhya Mountains, and the Deccan Plateau. These
geographic facts render it highly implausible that the indigenous people of the
Indus Valley would have moved southeast (through a vast desert, and then
over a formidable mountain range) rather than east (along the Gangetic Plain)
when facing any invading populations. In any event, the archaeological
record shows no evidence of any significant migrations to the southeast
during this time period, and no such migrations are remembered as part of the
local history or oral traditions in the south.81 By contrast, there is significant
evidence of indigenous migrations toward the Gangetic plain (i.e., toward the
northeast) during this time period, and then of further expansions eastward
along the Ganges River continuing through and then well beyond 1500 BC.82
But none of these areas currently has any significant pockets of indigenous
Dravidian speakers, and Indo-European speakers instead dominate all of these
                                                                                                               
80
Indeed, the existence of the Brahui is one of the most commonly cited facts in favor of
the Dravidian hypothesis. See, e.g., Michel Danino, A Dravido-Harappan Connection? The
Issue of Methodology 8–9 (Feb. 15-16, 2007) (paper presented at International Symposium on
Indus Civilization and Tamil Language, University of Madras, Chennai), available at
http://www.omilosmeleton.gr/pdf/en/indology/A_Dravido-Harappan_connection.pdf.
81
Id. at 10 (“There is no archaeological evidence of a southward migration through the
Deccan after the end of the urban phase of the Indus-Sarasvati civilization.”); id. (“Migration
apart, there is a complete absence of Harappan artefacts and features south of the Vindhyas: no
Harappan designs on pottery, no Harappan seals, crafts and ornaments, no trace of Harappan
urbanism . . . , no civic organization, no extensive bronze technology, no set of weights, etc.”);
id. (“The Sangam literature is completely silent on a large-scale migration from the North-
West, and of course on a clash with invading Aryans.”); see also id. at 1 (“Further, the absence
of any Harappan artefacts and features south of the Vindhyas as well as recent findings on the
Central Indian origin of Brahui, on the beginnings of Indian agriculture, on anthropology and
genetics, together make it very unlikely that Harappans could have migrated to South India
after the end of the urban phase, reverting from an advanced Bronze Age culture to a Neolithic
one, forgetting all their typical crafts and sophisticated techniques, pottery designs, ornaments,
and urbanism.”); id. at 11 (“Without bringing in other strong circumstantial evidence such as
that of the Sarasvati river, we can see that several disciplines—archaeology, anthropology,
genetics and tradition—agree on the impossibility of a Late Harappan southward ‘Long
March.’”).
82
Id. at 10 (“The only actual evidence of movements at that period is of Late Harappans
migrating toward the Ganges plains and towards Gujarat.”); id. (“Cultural continuity from
Harappan to historical times has been increasingly documented in North India, but not in the
South.”).

areas. 83 These facts render the migrations that must be presupposed by the
Dravidian hypothesis highly implausible.
As Figure 5 shows, the Indian subcontinent also contains two separate
river complexes: the Godavari-Krishna-Kaveri complex in the south and the
Indus-Sarasvati-Ganges complex in the north. The Dravidian and Indo-
European language families are, moreover, distributed very precisely around
these two major river systems. When combined with the riverine-agricultural
model of linguistic expansion, these geographical facts are thus sufficient to
explain the current patterns of linguistic variation within India, and there is no
need to posit the highly implausible pattern of migration that is currently
under discussion.
Notice, moreover, that the Dravidian hypothesis presupposes that small
bands of nomadic, horse-riding Indo-Europeans were able to completely
eradicate almost all of the previous languages in the entire central and
northwestern regions of the Indian subcontinent in just a few centuries, even
though they were never able to do this in the entire south of India—despite
several subsequent millennia of massive efforts at brahmanization and
sanskritization.84 We should therefore ask: what plausible model of linguistic
replacement could explain the complete and rapid replacement of this
language family in one region but not the other? 85
Turning to the existence of the Brahui in the northwestern regions of the
Indian subcontinent, these groups are indeed significant, and they do indeed
speak Dravidian languages. As noted earlier, one might therefore think that
these facts render the Dravidian hypothesis consistent with the significant
pocket criterion. There is, however, now a growing body of evidence to
suggest that these Dravidian-speaking people are much more recent
transplants to the area, rather than descendants of the people from the early
Indus Valley Civilization:

[I]n the 1920s, French linguist Jules Bloch demonstrated,
through an analysis of Brahui vocabulary, that the language
reached Baluchistan recently, perhaps at the time of the Islamic
invasions and probably from central India. This thesis was
more recently endorsed by Murray Emeneau, and still more
recently by H. H. Hock. Finally, the linguist and mathematician
Josef Elfenbein confirmed it using a different approach.
According to the French Indo-Europeanist Bernard Sergent,
“the conclusion is radical . . . Brahui reached Baluchistan late,
and can therefore no longer provide proof or even a clue of the
Dravidian-speaking character of the people who lived along the
Indus.”86

                                                                                                               
83
In addition, there have been no significant findings of Indus Valley Civilization artifacts
or settlements in the southern regions of India.
84
See Figure 5 (showing predominance of Dravidian speakers in southern India).
85
I am indebted to discussions with Edwin Bryant for this point.
86
Danino, supra note 80, at 8–9.

Hence, the Dravidian hypothesis ultimately fails to meet the significant pocket
criterion, and can be ruled out on that additional ground.
Studies of the early development of Sanskrit cast even further doubt on
the Dravidian hypothesis. These studies are relevant to the present inquiry for
the familiar reason that Vedic Sanskrit was the Indo-European language
spoken by the ruling groups of northwestern India in or around 1500 BC.87
Various forms of this language were encoded in their Vedic texts, which are
the oldest recorded texts in any Indo-European language, 88 and we can
therefore study these texts to see how early Sanskrit evolved over time.89
Michael Witzel has done just that, and his analyses suggest that the
earliest forms of documented Sanskrit reflect very little, if any, Dravidian
influence.90 Some Dravidian influence begins to appear over time, as the
center of gravity of north Indian civilization moved further to the east and
then south after 1500 BC, but Witzel’s analyses suggest that the earliest
influences on Sanskrit were from a language resembling Munda rather than
Dravidian. 91 When small incoming populations dominate a large local
population linguistically, the typical result is that the dominant language
begins to exhibit significant substratum effects that reflect those early
encounters.92 Hence, this particular pattern of Dravidian influence over time
makes it highly implausible (if not impossible) that Indo-European speakers
first came into contact with large numbers of indigenous Dravidian speakers
in the Indus Valley in or around 1500 BC.93 Indeed, Witzel has rejected the

                                                                                                               
87
See, e.g., BRYANT, supra note 74, at 63-67.
88
RAJEEV VERMA, FAITH & PHILOSOPHY OF HINDUISM 98 (2009) (“The Vedas are . . . the
most ancient wide texts in an Indo-European language, and as such are invaluable in the study
of relative linguistics.”).
89
See, e.g., BRYANT, supra note 74, at 63–67, 101–02.
90
See id. at 101 (“Of substantial importance is Witzel’s discovery . . . that there was no
Dravidian influence in the early Rgveda. He divides the Rgveda corpus into three distinct
chronological layers on linguistic grounds and finds that Dravidian loans surface only in layer
II and III, and not in the earliest level at all . . . .”). Importantly, Witzel also finds very little
influence from Proto-Burushaski, Tibeto-Burmese or Masica’s “Language X.” Witzel, supra
note 78, at 181-183.
91
See id. (noting that in the earliest Sanskrit texts, “[i]nstead, ‘we find more than one
hundred words from an unknown prefixing language’ that is neither Dravidian, Burushaski, nor
Tibeto-Burmese. On the basis of certain linguistic evidences, such as Munda-type prefixes . . .
, [Witzel] prefers to consider the pre-Aryan language an early form of Munda” (citation
omitted)).
92
Gillian Sankoff, Linguistic Outcomes of Language Contact, in THE HANDBOOK OF
LANGUAGE VARIATION AND CHANGE 638, 642 (J. K. Chambers et al. eds., 2004) (“In the case of
a local linguistic group that has been conquered or surrounded by a larger group, slow language
shift may mean many generations of bilinguals, providing ample opportunity for substratum
influence to become established in the language towards which the community is shifting.”);
see also BRYANT, supra note 74, at 76–107 (discussing linguistic substrata in Sanskrit texts).
93
It should be noted that there is still significant controversy among linguists over the
precise amount of Dravidian effects that can be identified in the earliest Sanskrit texts. See
BRYANT, supra note 74, at 76–101. Still, no one has detected any significant early Dravidian
effects, and hence, this level of controversy is itself inconsistent with the kind of large-scale
substratum effects that one would expect if there had been a major displacement of Dravidian
speakers at or around 1500 BC.

Dravidian hypothesis for just this reason.94 (It is worth noting that Witzel also
finds insufficient evidence of languages like Tibeto-Burmese, Proto-
Burushaski, and Masica’s “Language X” to attribute these languages to the
Indus Valley Civilization.)95
There is, finally, an important point about major river names, which
should be familiar from the literature. For some time now, linguists have
observed that the terms for certain major geographic landmarks, such as major
rivers, tend to remain extraordinarily resistant to change, with the result that
they tend to retain the names given to them by their original settled
inhabitants even in the face of massive invasions by new populations who
speak different languages.96 This is why so many of the major rivers in the
United States (such as the Mississippi, the Ohio, the Tennessee, the Arkansas,
and the Missouri) still have Native American names, rather than English or
European names,97 and why so many major rivers in England have names that
derive from pre-Indo-European languages.98 Quite often, major river names
are therefore some of the best indicators of the original languages spoken by
significant populations in an area.99 Significantly, almost all thirty-seven of
the major rivers in the northwestern regions of the Indian subcontinent have
names that are either clearly Indo-European (thirty-four of the thirty-seven) or
at least plausibly reconstructible as Indo-Aryan (two more of the thirty-

                                                                                                               
94
See Michael M. Witzel, Aryan and Non-Aryan Names in Vedic India: Data for the
Linguistic Situation, c. 1900–500 B.C., in 3 ARYAN AND NON-ARYAN IN SOUTH ASIA:
EVIDENCE, INTERPRETATION AND IDEOLOGY 337, 388, 392–94 (Johannes Bronkhorst & Madhav
Deshpande eds., 1999); see also BRYANT, supra note 74, at 101 (citing Witzel as concluding
that, “[c]onsequently, all linguistic and cultural deliberations based on the early presence of
the Drav. in the area of speakers of IA, are void”).
95
Witzel, supra note 78, at 175-76.
96
See Edwin F. Bryant, Concluding Remarks, in THE INDO-ARYAN CONTROVERSY:
EVIDENCE AND INFERENCE IN INDIAN HISTORY 468, 479 (Edwin F. Bryant & Laurie L. Patton
eds., 2005) (“Place and river names are, to my mind, the singlemost important element in
considering the existence of a substratum. Unlike people, tribes, material items, flora and
fauna, they cannot relocate or be introduced by trade, etc. (although their names can be
transferred by immigrants). Place names tend to be among the most conservative elements in a
language. Moreover, it is a widely attested fact that intruders into a geographical region often
adopt many of the names of rivers and places that are current among the peoples that preexisted
them, even if they change the names of others (i.e. the Mississippi river compared to the
Hudson, Missouri state compared to New England).” (emphasis added)).
97
Danino, supra note 80, at 9 (“[I]n America pre-colonial river names remain common.”);
see also JAMES PEOPLES & GARRICK BAILEY, HUMANITY: AN INTRODUCTION TO CULTURAL
ANTHROPOLOGY 52 (8th ed. 2009) (“Native American peoples had their own names for places
and landscape features, and often these names were the ones that endured and appear on
modern maps.”).
98
See BRYANT, supra note 74, at 98 (“In the hydronomy of England, Celtic names are
fewer in the east, but they are preserved in major rivers. On the other hand, they become more
frequent in the center, and more numerous in the west, a pattern that can be correlated with the
historical data on Saxon settlement, which would have been densest in the east, thereby
explaining the fewer Celtic names in that area.” (citation omitted)); see also Danino, supra note
80, at 9 (“[I]n Europe, many pre-Roman river names have subsisted . . . .”).
99
Witzel, supra note 94, at 368–69 (“Such names tend to be very archaic in many parts
of the world and they often reflect the languages spoken before the influx of later populations.”
(internal citations omitted)).

seven), whereas not one has a Dravidian or clearly Munda name.100 These
facts cast even further doubt on the Dravidian hypothesis.

c. Ruling Out the Munda Hypothesis

We are thus forced to consider the second main possibility: that the early
inhabitants of the Indus Valley spoke dialects of Proto-Munda (the “Munda
Hypothesis”) rather than Proto-Dravidian. The same evidence about river
names should, however, cast some initial doubt on this proposal, and very few
people have argued for it.101 (It should be noted that Witzel’s current proposal
is not that the Indus Valley Civilization spoke dialects of Proto-Munda but
rather that they spoke a now extinct language that resembled Proto-Munda
and that he calls Para-Munda. This possibility has, however, already been
ruled out of the present analysis by the significant pocket criterion, and
the next section will present further evidence in support of the significant
pocket criterion.)
The Munda hypothesis also fails the significant pocket criterion,
because—as Figure 5 shows—there are no significant pockets of Munda-
speaking groups in the northwestern portions of the Indian subcontinent. In
addition, all of our current evidence suggests that the Austro-Asiatic language
family, which includes Munda, originated much further to the east, where it
was originally distributed primarily around the Mekong, the Salween, the
Irrawaddy and (most likely a bit later) perhaps also around the Brahmaputra
and eastern parts of the Ganges. This proposal about the early distributions of
the Austro-Asiatic languages is entirely consistent with the opinion of leading
                                                                                                               
100
See BRYANT, supra note 74, at 99–100 (“[I]n ‘the “homeland” of Rgvedic Indians, the
Northwest, ‘we find [sic] ‘most Rgvedic river names . . . are Indo-Aryan, with the possible
exception of the Kubhá, Satrudrí, and perhaps the Sindhu.’ These latter, according to Witzel,
‘prove a local non-IA substrate [sic]. In view of the fact that Witzel has provided a list of
thirty-seven different Vedic river names, these two or three possible exceptions do not make as
strong a case as one might have hoped. All the rest can indeed be derived from Indo-European
roots. Moreover, other scholars have even assigned Indo-Aryan etymologies to two of these
three possible exceptions.” (citations omitted)); id. at 99 (“None of the river terms [in the
Northwest] are Dravidian. . . . Later texts, however, mention rivers farther east and south from
the Rgvedic homeland that show signs of Munda and Tibeto-Burmese influence in the
northeast, and Dravidian influence toward central India.”). Caldwell has similarly argued that
“the Dravidian loanwords . . . did not consist of the essential aspects of a vocabulary—the
primary words such as verb roots denoting basic actions, pronouns, body parts, and so on.
Such basic terms are the most durable aspect of a language, even when exposed to major
influences from an alien language family. Caldwell argued that had the pre-Aryan population
of North India been Dravidian, it would have preserved at least some of its own primary
Dravidian terms, which would have resurfaced in at least one or two of the northern
vernaculars. This would especially have been the case in the hypothetical scenario involving a
relatively tiny intrusion of Indo-Aryan speakers superimposed upon a massive population of
Dravidian speakers.” Id. at 84.
101
E.g., id. at 78 (“In the case of Sanskrit . . . syntactical innovations were generally held
by most scholars to be due to a local substratum of Dravidian, which triggered this linguistic
subversion (most recently Emeneau 1980; Kuiper 1991).”); WALTER A. FAIRSERVIS, THE
HARAPPAN CIVILIZATION AND ITS WRITING: A MODEL FOR THE DECIPHERMENT OF THE INDUS
SCRIPT 14 (1992) (“We have four language possibilities [for the Harappan civilization]: Munda
an Austro-Asiatic family, largely spoken by tribal people in the eastern portion of the
subcontinent (but note Korku in Central India). Reconstructions of proto-Munda indicate
nothing as complex as the Harappan civilization.” (emphasis omitted)).

experts, who have observed that “[t]he evidence as it is so far established
would suggest that these languages in ancient times as well as now were
situated only in eastern India.” 102 Hence, there are equally compelling
reasons to reject the Munda hypothesis.

d. Some Reasons to Accept the Proto-Indo-European Hypothesis

Up until this point, we have been assuming that the Indus Valley
Civilization spoke a language that fell into some non-Indo-European language
family, and that Indo-European languages and cultures were therefore
introduced into the Indian subcontinent for the first time only sometime after
the demise of the Indus Valley Civilization (viz., sometime after 1900 BC).
This assumption implied the existence of some significant pockets of people
in or around the northwestern regions of the Indian subcontinent who still
speak languages that fall into this non-Indo-European language family. We
have, however, now eliminated every such possible language family. Hence,
we must revise our initial assumption and consider the very real possibility
that the Indus Valley Civilization spoke dialects of Proto-Indo-European
instead.
This first argument is a straightforward argument from elimination, but
it is distinctive insofar as it draws upon the riverine-agricultural model of
linguistic expansion to amplify the probability of the relevant eliminations.
Before we can draw any affirmative conclusions from an argument of this
form, we must, however, consider whether the claim that the Indus Valley
Civilization spoke dialects of Proto-Indo-European suffers from any of the
defects identified with the alternatives. As it turns out, it suffers from none of
them.
For example, not only are there significant pockets of Indo-European
languages in the northwestern region of the Indian subcontinent, but—as
Figure 5 clearly shows—Indo-European languages currently dominate in this
region,103 as they have for all of known history.104 Indeed, this language
family is distributed in a band that precisely follows the Indus-Sarasvati-
Ganges riverbeds—as the riverine-agricultural model would have predicted
for the first major language family to emerge from these regions. Hence, the
Indo-European language family not only meets the significant pocket criterion
but also meets it in spades. This language family is also—and significantly—
the only language family that meets this criterion. These facts provide a new
set of grounds to favor the Proto-Indo-European hypothesis over all of the
other possible alternatives.
                                                                                                               
102
T. Burrow, Sanskrit and the Pre-Aryan Tribes and Languages, in COLLECTED PAPERS
ON DRAVIDIAN LINGUISTICS 319, 328 (1968); see also BRYANT, supra note 74, at 83.
103
See THE ATLAS OF LANGUAGES, supra note 74, at 59 (“The Indo-Iranian branch of
Indo-European is the dominant language family of South Asia.”); see also id. (showing a map
demonstrating Indo-Iranian branch of Indo-European as the dominant language family in much
of India, including the northwest region).
104
See, e.g., BURJOR AVARI, INDIA: THE ANCIENT PAST: A HISTORY OF THE INDIAN SUB-
CONTINENT FROM C. 7000 BC TO AD 1200, at 61–63 (2007).

If we limit our attention to the Indian subcontinent during the height of
the Indus Valley Civilization, the riverine-agricultural model of linguistic
expansion also suggests that Proto-Indo-European dialects would have been
expanding primarily from the Indus and Sarasvati rivers. Proto-Munda
languages would have been spoken primarily along the eastern part of the
Ganges—but it is important to note that the Ganges was still heavily forested
at this time, and had not yet transformed into a robust center for agricultural
production.105 There is some controversy over when to date the first Proto-
Dravidian speakers in southern India, but—once they arrived—these groups
clearly began to expand around the Godavari, the Krishna, and the Kaveri
Rivers. These three regions would have therefore been geographically
separated from one another, and prior to 1500 BC, it is highly plausible that
these three linguistic groups would have had relatively few early contacts. It
follows that the present proposal is not only consistent with, but could also be
used to explain, Witzel’s findings that neither Dravidian nor Munda had any
really significant effects on the earliest recorded forms of Sanskrit.106
As noted above, the archaeological record suggests that, beginning in
around 1500 BC, the center of northern Indian civilization then began to move
slowly to the east along the Gangetic Plain107 —where, on the present view,
the Indo-European speaking groups would have begun to come into more
significant contact with Munda-speaking groups. The present proposal would
thus explain Witzel’s otherwise puzzling findings that a language resembling
Proto-Munda—rather than Proto-Dravidian—exerted the earliest influences
on Vedic Sanskrit over time.108 The present proposal would also predict an
eastward spread of Indo-European languages along the Gangetic Plain, with
some significant remaining pockets of Munda speakers in the northeastern
regions of the Indian subcontinent. This is precisely what we see in Figure 5,
where various existing pockets of Munda speakers are represented in white
near the eastern portions of the Ganges. Northern Indian civilization
subsequently began to expand toward the south, and we know that these
events resulted in a number of increased patterns of reciprocal influence
between the north and the south.109 Hence, the current proposal would also
                                                                                                               
105
See, e.g., ALLCHIN & ALLCHIN, supra note 3, at 225 (“The modern landscape as we
move eastward [along the Ganges] is increasingly manmade. Anciently, much of the land was
under forest, and initially there must have been a need for large-scale clearance and
deforestation . . . . It would be a mistake, however, to think that the forest clearance was ever
wholesale. It is probable that originally there would have been only limited clearance of the
land required for cultivation around any settlement . . . . What we see today in the Ganges
plains is very much the final stage, when continuing population growth and pressure has led to
almost complete extinction of the forests.”).
106
Id. (finding “no Dravidian loan words at all” in the early Rig Vedic period and only
“some three hundred words” that are non-Indo-Aryan and derive “from one or more unknown
languages . . . .”).
107
See generally ALLCHIN & ALLCHIN, supra note 3, at 206–22 (“Changing Scenes: Indus
to Ganges”); id. at 223–61 (“The Second Urbanization”); id. at 215 (“The major change [in the
post-urban period in the Sarasvati valley] is the great increase in the number of settlements
spreading out across the plains in the eastern part of the area, in a belt between 100–200 km in
width, running from north-west to south-east, following the edge of the Himalayan foothills.”).
108
Id. at 175-80.
109
B.K. Thapar, The Harappan Civilization: Some Reflections on Its Environments and
Resources and Their Exploitation, in HARAPPAN CIVILIZATION: A RECENT PERSPECTIVE
(Gregory L. Possehl ed., 2d rev. ed. 1993); see also ALLCHIN & ALLCHIN, supra note 3, at 245

explain why the early Sanskrit texts exhibit a subsequent set of influences
from the Dravidian language family—but only after these early effects from a
language resembling Proto-Munda.
Finally, the evidence from major river names—which was discussed
above—strongly supports the claim that the Indus Valley Civilization spoke
dialects of Proto-Indo-European, because nearly all of the major river names
from the northwest parts of the Indian subcontinent are Indo-European. These
facts therefore provide the current linguistic proposal with yet another layer of
justification.
When viewed as a whole, this argument from elimination thus
encapsulates a highly interconnected set of reasons that strongly favor the
claim that the Indus Valley Civilization spoke dialects of Proto-Indo-
European over all of the other possibilities. The basic structure of this
argument, along with the new contributions made by the significant pocket
criterion, is laid out in Figure 6 below.

                                                                                                                                                                                                                                                                       
(“[W]e believe that, in terms of the rise of Indian Civilization, the Ganges valley played a
distinct and special role. By this we mean that the Ganges valley appears to have been the seat
of the composition of a large part of the voluminous literature of the late Vedic period, and of
the subsequent period of the Brahmanas and Upanishads, not to mention its having been the
homeland of Buddhism and Jainism; and the region which produced the two great epics, the
Mahabharata and the Ramayana. Each one of these elements later became disseminated
throughout South Asia, and it could be argued that each was a facet of the spread of Indian
civilization as a whole. We do not, however, mean that the process was confined to the Ganges
plains. The newly emerging urban societies were in themselves responsible for the creation of
all sorts of outward thrusts and stimulations which led to the spread of cities in all directions,
and eventually to almost every part of South Asia.”).

3. An Objection Based on the Possibility of Complete Linguistic
Replacement via Conversion

The last section argued that if one assumes that the significant pocket
criterion is valid, then the Indus Valley Civilization almost certainly spoke
dialects of Proto-Indo-European. The arguments for this conditional
linguistic proposition are—I believe—quite solid, but there is still the
possibility that one might try to reject this linguistic conclusion by rejecting
the significant pocket criterion.
As the reader will remember, the significant pocket criterion states that
whatever invasions or migrations might have taken place in northwestern
India in the second millennium BC, the major river systems of the Indus
Valley should have already produced a sufficiently monumental and highly
coordinated language family in the region that this language family could not
have plausibly been completely replaced. Instead, significant pockets of this
language family should still exist in the northwestern portions of the Indian
subcontinent. The Indo-Aryan invasion or migration theorist needs to reject
both of these propositions and is therefore assuming, in effect, that a relatively
small group of Indo-Aryan invaders or migrants were able to completely
eradicate the pre-existing language (or languages) of the Indus Valley
Civilization in a very short time beginning in or around 1500 BC.
In order to assess the plausibility of the kind of complete linguistic
replacement that is being presupposed by the Indo-Aryan invasion or
migration theorist, this section will engage in a comprehensive review of the
historical record from around the world and over the course of known history.
This examination will show that there is not one case anywhere in this
extraordinarily lengthy record in which a foreign group has been able to
completely eradicate a major language family in its original region of riverine
expansion through linguistic conversion once that major language family has
reached equilibrium in accordance with the riverine-agricultural model of
linguistic expansion. (I will say that a language family has “reached
equilibrium” within a particular riverine setting in accordance with the
riverine-agricultural model of linguistic expansion once a single language
family has grown to dominate in one of these riverine regions through a
process that involves not only agricultural forms of subsistence but also the
development of some urban or incipient urban trading centers and networks
on that basis.) This is true even though the world historical record exhibits
numerous (and sometimes even near constant) invasions of, and migrations
into, these same riverine regions by foreign groups who have spoken foreign
languages and acted as political and cultural elites.
As shown below, complete linguistic replacement has only occurred in
these particular riverine settings in a narrower set of circumstances, which
neither apply to our present topic nor provide genuine exceptions to the rule
just stated. These are circumstances where either (1) the replacing language is
a closely related member of the very same language family or (2) the

replacement is part of the initial process whereby an indigenous language
family first reaches equilibrium in accordance with the riverine-agricultural
model of linguistic expansion. This second class of cases obviously includes
those in which (2´) the replaced languages were spoken by primarily nomadic
and/or non-agriculturalist groups who would not yet fall within the ambit of
the riverine-agriculturalist model of linguistic expansion and would have
therefore tended to exhibit much greater amounts of linguistic diversity. In
addition, there may be some poorly recorded cases involving semi-
agriculturalist populations in the New World where the processes of linguistic
replacement were (3) part of a larger process of near complete population
replacement. These last cases are not directly relevant to our present topic,
however, because there do not appear to have been any large-scale population
replacements in northwestern India in the second millennium BC,110 and because
the modern Indo-Aryan invasion or migration theorist does not claim
otherwise.111
This section will therefore establish that there are incredibly strong
empirical reasons to accept the significant pocket criterion, and—along with
it—the linguistic claim that the Indus Valley Civilization most likely spoke
dialects of Proto-Indo-European. Toward the end of this section, I will also
briefly discuss why the Indo-Aryan invasion or migration theorist’s
assumptions might have once seemed more plausible than I believe they really
are.

a. On the Historical Persistence of the Major Riverine
Languages in and around the Indian Subcontinent

Perhaps the best place to begin the present analysis is by looking at the
Indus Valley region itself. For all of known history, Indo-Iranian languages
have predominated in this region, with Iranian languages tending to dominate
to the west of the Indus and Indo-Aryan languages tending to dominate to the
east.112 If we focus our attention on the regions to the east of the Indus, we
will see that this region has been invaded on numerous occasions and over
many millennia by foreign groups who have spoken either non-Indo-European
languages or Indo-European languages that are distinct from Indo-Aryan.
Often these groups have come from the northwest, and the relevant list would
have to include incursions by at least the following empires: the
Persian/Achaemenid Empire (IE: Iranian Branch), the Hellenistic Empire (IE:
Greek Branch), the Greco-Bactrian Empire (IE: Greek), the Indo-Scythian
Empire (IE: Iranian), the Kushana Empire (IE: most likely Iranian), the
Sassanid Empire (IE: Iranian), the Hephthalite Empire (IE: most likely
                                                                                                               
110
See BRYANT, supra note 74, at 231-236.
111
See, e.g., Witzel, supra note 78, at 168-69 (citing with approval the position of
Lamberg-Karlovsky to the effect that “the indigenous people, although in the majority, adopted
[the] language” of incoming Indo-Aryans from the steppes (emphasis added)).
112
See, e.g., ETHNOLOGUE, supra note 12, at 816 (Map of Northern Pakistan) (showing
Iranian languages dominating the regions to the west of the Indus and Indo-Aryan languages
dominating to the east); id. at 817 (Map of Southern Pakistan) (showing Iranian languages
dominating the regions to the west of the Indus and Indo-Aryan languages dominating to the
east); id. at 365-407 (listing languages of India along with native population counts).

Iranian), the Ghaznavid Empire (Turkic peoples speaking predominantly
Arabic and Iranian languages), the Delhi Sultanate (Turkic peoples speaking
Turkish, Arabic and Iranian languages), the Moghul Empire (Mongolian
groups speaking Mongolian, Turkic and Iranian languages), and the British
Empire (IE: English (Germanic)). Significantly, however, not one of these
events has changed the basic predominance of Indo-Aryan languages among
the native speakers of northern India east of the Indus.113
To the west of the Indus, only those invading groups that spoke non-
Iranian languages will be relevant to the present analysis, and so the above list
would have to be shortened. Still, not one member of this shorter list has
altered the basic predominance of Iranian languages on the Indian
subcontinent in the regions west of the Indus.114
Turning to the Ganges, the archaeological record suggests that large-
scale agricultural production began much later around this river than in the
Indus Valley (due in part to the fact that the Ganges was heavily forested for a
much longer period and needed to be cleared before it could support large-
scale agricultural activities).115 The first major language family that shows up
in the historical record as having reached equilibrium along the Ganges was,
however, the Indo-Aryan branch of the Indo-European language family,
and—significantly—no subsequent invasions or migrations have altered the
basic predominance of Indo-Aryan languages in these regions either.116
These facts are, moreover, part of a broader pattern that has persisted for all
of known history in northern India. J.M. Roberts—who is one of the leading
experts in world history—has reviewed the long record of invasions of
Northern India and has identified two patterns that stand out most clearly. He
describes, first, “the importance of the north-western frontier [of the Indian
Subcontinent] as a cultural conduit” (as reflected in the series of invasions
themselves), but also, second, what he calls “the digestive power of Hindu
civilization.”117 Roberts explains this second pattern as follows: “None of the
invading peoples could in the end resist the assimilative power India always
showed. New rulers were before long ruling Hindu kingdoms . . . , and
adopting Indian ways.”118 The linguistic facts under discussion here would
therefore appear to be part of a much broader socio-cultural feature of
northwestern India (at least for all of recorded history). These facts thus cast
doubt on the type of complete and unrecorded linguistic replacement that is
being presupposed by the Indo-Aryan invasion or migration theorist.
The first major language family to show up in the historical record as
having reached equilibrium around the Krishna, the Kaveri, and the Godavari
rivers in the South is Dravidian.119 We also know that for millennia
(beginning in the first millennium BC), Indo-Aryan groups from the North
                                                                                                               
113
See, e.g., ETHNOLOGUE, supra note 12, at 816, 817, 365-407.
114
Id. at 816-17.
115
ALLCHIN & ALLCHIN, supra note 3, at 225.
116
See ETHNOLOGUE, supra note 12, at 365-407; Figure 5, supra.
117
J.M. ROBERTS, THE NEW HISTORY OF THE WORLD 429 (Oxford University Press 2003).
118
Id.
119
See, e.g., FRAN SOUTHWORTH, LINGUISTIC ARCHAEOLOGY OF SOUTH ASIA 325-26
(2005) (suggesting that Proto-Dravidian languages are associated with the predominant South
Indian speech communities of the mid-third millennium BC, which subsequently underwent
Neolithic expansions that helped to expand this language family).

have been engaging in massive and well-attested efforts to brahminize and
sanskritize the South.120 Once again, however, none of these efforts has been
able to change the basic predominance of Dravidian languages among the
native speakers in the South.121 Whatever extraordinary process of linguistic
replacement the Indo-Aryan invasion or migration theorist is assuming to
have taken place in the North is thus one that must, for some reason, have
proven ineffective in the South. But then we should ask:
why would there be this kind of difference between these two regions, and
why would this difference track the major riverine topographies depicted in
Figure 5 so perfectly? (As Figure 5 shows, the Indo-European and Dravidian
language families are distributed precisely along the Indus-Ganges and
Krishna-Godavari-Kaveri river systems, respectively—just as the riverine-
agricultural model of linguistic expansion would have predicted if these
language families had originally expanded from these regions.) These facts
cast additional doubt on the assumptions of the Indo-Aryan invasion or
migration theorist.
Hence, all of the historical evidence that we have from the Indian
subcontinent suggests that, once a major language family has emerged
and reached equilibrium around one of its major river systems, that language
family has proven remarkably resilient to replacement. These facts provide
strong support for the significant pocket criterion, and they are especially
significant because they come from the very regions that produced the
language (or languages) of the Indus Valley Civilization.
Before moving on, we should notice that a related point can be made
about neighboring Iran and the regions that once constituted ancient Bactria.
Although it is true that we lack direct evidence of the languages that might
have been spoken in these regions during most of human prehistory, we do
know that, during all historical periods, these regions have been dominated by
Iranian speaking groups.122 The regions that make up modern day Iran (along
with modern Tajikistan and northern Afghanistan) have also been invaded and
conquered by numerous groups speaking non-Iranian languages, among which the
following must surely be included: the Hellenistic Empire (IE: Greek),
the Seleucid Empire (IE: Greek), the Greco-Bactrian Empire (IE: Greek), the
Umayyad Caliphate (Arabic), the Abbasid Caliphate (Arabic), the Tahirid
Dynasty (Arabic), the Buyid Dynasty (Arabic), the Ghaznavid Empire
(Turkic), the Seljuq Empire (Turkic), the Il-Khanate (Mongolian), the
Timurid Dynasty (Turkic) and the Qajar Dynasty (Turkic). Once again,
however, not one of these events has changed the basic predominance of the
Iranian branch of Indo-European among the native speakers in these
regions—which include the core of ancient Bactria.123
                                                                                                               
120
See, e.g., P.R. SRINIVASAN & S. SANKARANARAYANAN, THE COHESIVE ROLE OF
SANSKRITIZATION (Oxford University Press 1989); M. Srimanarayana Murti, Linguistic
Convergence: Indo-Aryanization of Dravidian, 53 LINGUA 199-220 (1981).
121
See ETHNOLOGUE, supra note 12, at 365-407; Figure 5, supra.
122
See THE ENCYCLOPEDIA OF INDO-EUROPEAN CULTURE 307-09 (J.P. Mallory & D.Q.
Adams eds., 1997).
123
See ETHNOLOGUE, supra note 12, at 452-459 (listing languages of Iran); id. at 774 (Map
of Afghanistan showing the country dominated by Iranian speakers, with the exception of a
small pocket of Altaic speakers in the North, and even smaller pockets of Brahui and Indo-
Aryan speakers in the south and northeast, respectively); id. at 805 (Maps of Kyrgyzstan and

b. On the Historical Persistence of the Major Language Families
Produced by All of the Other Ancient Riverine Civilizations (in
Egypt, Mesopotamia and China)

If we look beyond the Indian subcontinent, we know of three other
important riverine systems that produced some of the very first large-scale
human civilizations in the world. These are the Nile (which gave rise to the
ancient Egyptian Kingdoms); the Tigris and the Euphrates (which gave rise to a
series of ancient Mesopotamian civilizations); and the Yellow and Yangtze
Rivers (which gave rise to ancient Chinese civilization). Given that these
regions produced the only developments in large-scale social complexity and
urbanism in early human prehistory that are comparable to those of the Indus
Valley Civilization, these regions provide us with the next closest analogue to
our present topic.
As it turns out, all three of these regions can also be associated with the
early expansions of some major contemporary language family around their
major rivers. All three have also seen their fair share of foreign invasion and
foreign rule. Another way to test the plausibility of the significant pocket
criterion is thus to examine the historical record from these other regions. As
I will now show, this record ultimately provides solid and uniform support for
the significant pocket criterion.
Starting with ancient Egypt, we have hieroglyphic evidence that goes back
at least as far as 3200 BC and shows that Afro-Asiatic languages were
spoken in this region.124 Afro-Asiatic languages are
also the first that show up in the historical record as dominating the larger
Egyptian regions along the Nile, and the ancient Egyptian Dynasties were all
Afro-Asiatic speaking.125 These native dynasties lasted fairly continuously
from about the 32d to the 18th centuries BC, and then rose again in power
from the 16th to the 11th century BC.126 For large portions of the last three and
a half millennia, however, some or all of Egypt has been conquered and ruled
by foreign groups who have often spoken non-Afro-Asiatic languages. The
relevant list would have to include at least: the Hyksos Empire (disputed but
possibly non-Afro-Asiatic language), the Hittite Empire (IE: Anatolian), the
Achaemenid Empire (IE: Iranian), the Hellenistic Empire (IE: Greek), the
Ptolemaic Empire (IE: Greek), the Roman Empire (IE: Celtic-Italic), the
Byzantine Empire (IE: Greek), the Sassanid Empire (IE: Iranian), the
                                                                                                                                                                                                                                                                       
Tajikistan, showing predominance of Indo-European speakers along the major rivers of
Tajikistan); id. at 832 (Map of Turkmenistan and Uzbekistan, showing significant pockets of
Indo-European speakers in the relevant regions of Turkmenistan).
124
AFRICAN INTELLECTUAL HERITAGE: A BOOK OF SOURCES 266 (Molefi K. Asante & Abu
Shardow Abarry eds., 1996) (noting that “the spoken Egyptian language lasted about 5000 years, since
the oldest texts in hieroglyphics date back to around 3000 BC, and the most recent to the year
393.”); ENCYCLOPEDIA OF THE ARCHAEOLOGY OF ANCIENT EGYPT 325-28 (Kathryn Bard ed.,
1999); JAMES P. ALLEN, MIDDLE EGYPTIAN: AN INTRODUCTION TO THE LANGUAGE AND
CULTURE OF HIEROGLYPHS 1 (identifying earlier writing going back to 3200 BC).
125
ENCYCLOPEDIA OF THE ARCHAEOLOGY OF ANCIENT EGYPT, supra note __, at 325-28.
126
THE OXFORD HISTORY OF ANCIENT EGYPT 57-363 (Ian Shaw, ed., 2003) (describing rise
of the first Egyptian states around 3200 BC until the Third Intermediate Period, which began in
1069 BC).

Ottoman Empire (Turkish), the French Occupation of 1798-1801 (IE:
Celtic-Italic), the Rulership of Muhammad Ali (IE: Albanian), and the British
Empire (IE: English (Germanic)). Importantly, however, not one of these
events has altered the basic predominance of Afro-Asiatic languages among
the native speakers of Egypt around the Nile.127
Interestingly enough, the only invasions in the historical record that have
generated anything like a complete linguistic replacement were the Arab
invasions that began in the 7th century AD. These invasions eventually led to
the replacement of ancient Egyptian with an early ancestor of modern
Egyptian Arabic.128 It needs to be emphasized, however, that these two
languages were closely related members of the very same language
family,129 a fact that would have greatly aided mutual intelligibility and
linguistic conversion. Hence, this single seeming exception only helps to
prove the strength of the general rule, which—based on the entire historical
record that we have looked at so far—does not appear to allow for complete
linguistic replacements by distinct language families through linguistic
conversion once a prior language family has reached equilibrium in a major
riverine setting.
The record from Egypt is also significant for another reason. It shows
that, between about 3200 BC and about 672 BC, the native Afro-Asiatic
speaking groups in this region underwent several developments towards social
complexity with large-scale political unification (often called the periods of
the “Kingdoms”).130 These periods were nevertheless interspersed, from time
to time, with others that were marked by significant declines in social
complexity, increased forms of regionalization and the collapse of large-scale
political authority (often called “Intermediate Periods”).131 After the final
Intermediate Period, which ended in about 672 BC, there was one more brief
flowering of native Egyptian rule, but, beginning in about 525 BC, Egypt
then came under almost continuous foreign rule for millennia.132 I mention
these points because the various collapses of the Egyptian Kingdoms are in
many ways comparable to the collapse of the Indus Valley Civilization. It is
therefore important to notice that the Afro-Asiatic language family (which is
the first to have reached equilibrium around the Nile in accordance with the

                                                                                                               
127
See ETHNOLOGUE, supra note 12, at 728-729 (showing predominance of Afro-Asiatic
languages in Egypt); id. at 728-29 (Map of Sudan, showing Afro-Asiatic languages (versions of
Arabic) as the only widespread and official languages in the region).
128
JAMES P. ALLEN, MIDDLE EGYPTIAN: AN INTRODUCTION TO THE LANGUAGE AND
CULTURE OF HIEROGLYPHS 1 (2d ed., Cambridge University Press 2010) (“Egyptian first
appeared in writing shortly before 3200 BC and remained in active use until the eleventh
century AD. This lifespan of more than four thousand years makes it the longest continually
attested language in the world. Beginning with the Muslim conquest of Egypt in AD 641,
Arabic gradually replaced Egyptian as the dominant language in Egypt. Today, the language of
Egypt is Arabic.”).
129
See CONCISE ENCYCLOPEDIA OF LANGUAGES OF THE WORLD 12 (Keith Brown & Sarah
Ogilvie eds. 2009) (listing six branches of the Afro-Asiatic language family, and noting that
Arabic and spoken Egyptian are members of a single branch).
130
THE OXFORD HISTORY OF ANCIENT EGYPT, supra note 126, at 57-324.
131
Id. at 108-136 (“The First Intermediate Period (c.2160-2055 BC)”), 172-206 (“The
Second Intermediate Period (c.1650-1550 BC)”), 324-63 (“The Third Intermediate Period
(1069-664 BC)”).
132
Id. at 364-436 (describing periods of foreign rule in Egypt).

riverine-agricultural model of linguistic expansion) has proven far more
persistent than any of the political regimes in this region. This language
family has also persisted and grown through a series of ebbs and flows in the
social complexity of the region. Hence, there is no reason to think that the
collapse of the Indus Valley Civilization would have necessarily led to a
replacement of its languages.
Turning to Mesopotamia, our earliest written records go very far back
(into the 4th millennium BC)133 and suggest that there were—at this very early
time—two main linguistic groups that had begun to emerge from different
parts of the Tigris and the Euphrates. In the northern regions, Afro-Asiatic
languages predominated (among important groups like the Akkadians),
whereas in the southern regions, Sumerian (which is an extinct linguistic
isolate) was spoken among the ancient Sumerians.134 Because our written
records go so far back in time in this region, we have a fairly good sense
of how this larger region eventually coalesced around a single major
language family—which turned out to be Afro-Asiatic. This occurred during
the 3d millennium BC via a long process that involved increasingly close
interactions and widespread cultural integration between these two indigenous
linguistic groups around the Tigris and the Euphrates.135 These two groups
shared a very deep and intimate history of cultural interpenetration, and the
process itself included almost a millennium of widespread bilingualism—after
which time Sumerian was still used in many liturgical settings.136 Although
Afro-Asiatic languages eventually replaced Sumerian by the end of the 3d
millennium BC, this transformation should therefore be understood as part of
the initial process whereby a single language family first reached equilibrium
around the Tigris and the Euphrates in accordance with the riverine-
agricultural model of linguistic expansion. In any event, this process was
decidedly not one in which a foreign group was able to completely replace a
major language family that had already reached equilibrium around one of
these major riverine settings.
Once these Afro-Asiatic languages reached equilibrium within
Mesopotamia, they have, moreover, shown precisely the same formidable
resistance to replacement that we have seen in all of the other major riverine
regions discussed so far. The historical record suggests Afro-Asiatic speaking
groups in Mesopotamia have been invaded and/or ruled by numerous non-
Afro-Asiatic speaking groups. The relevant list of invaders would have to
include at least: the Hittite Empire (IE: Anatolian), the Mitanni Empire (IE:
Indo-Aryan), the Kassite Empire (IE: Iranian), the Achaemenid Empire (IE:
Iranian), the Hellenistic Empire (IE: Greek), the Seleucid Empire (IE: Greek),
the Parthian Empire (IE: Iranian), the Sassanid Empire (IE: Iranian), the
Ottoman Empire (Turkish), and the American Empire (IE: Germanic
(English)). Once again, however, not one of these events has changed the

                                                                                                               
133
JEAN-JACQUES GLASSNER, THE INVENTION OF CUNEIFORM: WRITING IN SUMER xi (2003).
134
See NICHOLAS POSTGATE, EARLY MESOPOTAMIA: SOCIETY AND ECONOMY AT THE DAWN
OF HISTORY 14-18 (2d ed. 1994).
135
Edward Y. Odisho, Bilingualism: A Salient and Dynamic Feature of Ancient
Civilizations, 14 Mediterranean Language Review 71-97 (2002).
136
Id.

basic predominance of Afro-Asiatic languages among the native speakers in
these regions.137
We should also pause for a moment to take special notice of the events
relating to the Hittite and Mitanni empires. Both of these groups were Indo-
European speaking groups, and both conquered parts of Mesopotamia at about
the same time that Indo-Aryan groups from the steppes are said to have first
conquered the northwestern parts of the Indian subcontinent (viz., somewhere
between the early and the middle of the 2d millennium BC).138 The Hittites and the
Mitanni are also thought to have employed many of the same military and
other technologies that the new Indo-Aryan groups in northwestern India are
said to have employed, and the Mitanni even shared nearly identical religious
views and social structures.139 It is therefore noteworthy that neither the
Hittites nor the Mitanni were able to eradicate the pre-existing (Afro-Asiatic)
languages that had emerged from around the Tigris or the Euphrates. In fact,
Hittite is now an extinct language,140 and there are no Indo-Aryan speaking
descendants of the Mitanni left in this region.141 Hence, complete linguistic
replacement does not seem to be part of the model of these other early Indo-
European conquerors.
Turning, finally, to ancient China, we know that—as far back as
historical records go—Sino-Tibetan languages have always dominated the
regions around the Yellow and Yangtze rivers.142 Over the long course of
Chinese history, these indigenous Sino-Tibetan groups have also had a series
of complex and sometimes antagonistic relations with various Altaic-speaking
groups to the North—such as the Turks, the Mongols and the Manchurians.143
Some of these Altaic-speaking groups have also conquered and ruled over
significant portions of China from time to time (and especially in the second
millennium AD).144 A list of the most significant such events would have to
include at least the following: the Jin and Liao kingdoms (in northeast China
in the 10th and 11th centuries), the Mongolian “Yuan” dynasty (which ruled
over the greater part of China during much of the 13th century), and the
Manchurian “Qing” Dynasty (which ruled over the greater part of China from
the 17th century into the 20th century). Once again, however, not one of these
events has altered the basic predominance of Sino-Tibetan languages among
the native speakers in China around the Yellow and Yangtze rivers.145

                                                                                                               
137
See ETHNOLOGUE, supra note 12, at 785 (Map of Iraq, showing the region dominated
by Afro-Asiatic languages (dialects of Arabic) with some pockets of Indo-European languages
(like Kurdish) in the northeast); id. at 803 (Map of Jordan and Syria, showing these regions
dominated by Afro-Asiatic languages (mostly dialects of Arabic or Aramaic) with some
pockets of Indo-European languages (like Kurdish and Armenian) in the extreme north).
138
See BRYANT, supra note 74, at 68-72, 135-138; J.M. ROBERTS, supra note 117, at 83-
84, 106-107, 225.
139
Id.
140
See BRYANT, supra note 74, at 68-75.
141
ETHNOLOGUE, supra note 12, at 457-59 (listing only one Indo-Aryan language—
namely, Middle Eastern Romani—as still existing as a native language in Iraq).
142
THE CAMBRIDGE HISTORY OF ANCIENT CHINA: FROM THE ORIGINS OF CIVILIZATION TO
221 B.C., at 74-81 (1999).
143
J.M. ROBERTS, supra note 117, at 444-465.
144
Id.
145
See ETHNOLOGUE, supra note 12, at 779-81 (Map of China, showing Sino-Tibetan
languages as dominating the regions through which the Yellow and Yangtze rivers flow,

We have now examined all four of the regions that gave rise to the very
first ancient human civilizations, and which were also—incidentally—major
riverine topographies. Two of these regions played a critical role in the
expansion of the Afro-Asiatic language family, which is the fourth largest in
the world,146 and a third played a critical role in the expansion of the Sino-
Tibetan language family, which is the second largest in the world.147 The
relatively large sizes of the Afro-Asiatic and Sino-Tibetan language families
are very plausibly attributable—at least in part—to the great time depth of
their expansions within their respective riverine locales, along with the
extraordinary records of these regions in producing some of the earliest large-
scale human civilizations in the world. As noted earlier, there is only one
language family that is bigger than these two (and is, in fact,
bigger than both combined): the Indo-European language family.148 There is
also only one other remaining major seat of ancient civilization that is
comparable to these others (and was, in fact, able to produce a civilization that
was bigger than them all): the Indus Valley region.149 If the Indus Valley
Civilization spoke dialects of Proto-Indo-European, then this region would
therefore align perfectly with the linguistic and archaeological records of all
of the other major seats of ancient civilization.
The Indo-Aryan invasion or migration theorist is, by contrast, effectively
assuming that the Indus Valley Civilization deviated from all of these other
earliest civilizations in a rather extraordinary way. On the traditional view,
the Indus Valley Civilization would have been the only one of the ancient
seats of human civilization to fail to leave a major linguistic phenomenon in
its original location of riverine expansion. In fact, the Indus Valley
Civilization would have failed to leave any surviving trace of its language at
all in its original region of riverine expansion. Given the historical record of
persistence of the major language families that have expanded in accordance
with the riverine-agricultural model of linguistic expansion, I do not believe
this assumption is very plausible.

c. On the Historical Persistence of the Remaining Major Language
Families in their Respective Regions of Riverine Origin

None of the other major language families from our list of the top eleven
is nearly as large as the Indo-European language family,150 and none can be
traced to developments with as formidable an ancient history of social
complexity as in the Indus Valley. Once having reached equilibrium within a
                                                                                                                                                                                                                                                                       
although some Altaic languages appear further to the North, and some other language
families, like Tai-Kadai and Austronesian, appear in the far south).
146
Id. at 27.
147
Id.
148
Id.
149
ALLCHIN & ALLCHIN, supra note 3, at 153–54; MCINTOSH, supra note 319, at 4 (“In the
third millennium [the Harappan] civilization flourished over an area far larger than those of its
contemporaries in Mesopotamia and Egypt.”).
150
See ETHNOLOGUE, supra note 12, tbls. 4 & 5, at 27-32 (showing that the next largest
language family, Sino-Tibetan, is less than ½ the size of Indo-European, whereas
Afro-Asiatic is about 1/7th its size. Uralic, which is the 11th largest language
family, is about 1/13th the size of the Indo-European language family).

major riverine setting, all of these major language families have nevertheless
shown a very deep resistance to replacement. These facts provide yet another
layer of support for the proposal that the Indus Valley Civilization’s language
should have proven deeply resistant to displacement.
We know, for example, that the Niger-Congo language family’s major
expansions are associated with the first significant spread of agricultural
activities around the Niger, the Congo and the Zambezi rivers.151 These
expansions did lead to the displacement of many pre-existing languages in
these regions (including, most prominently, many Khoisan languages that
were once spread throughout much more of Africa but are now limited to a
much smaller region).152 We should, however, remember that these pre-
existing groups were primarily nomadic and/or non-agriculturalist.153 Hence,
this process should be understood as part of the original dynamics by which a
single language family first reached equilibrium around these major river
systems in accordance with the riverine-agricultural model of linguistic
expansion.
After the Niger-Congo language family had reached this equilibrium, the
record shows that many parts of Sub-Saharan Africa have been subjected to
long periods of European colonial rule.154 Importantly, however, not one of
these events has altered the basic predominance of Niger-Congo languages
among the native speakers throughout most of Sub-Saharan Africa.155 This is
true even though the official or national languages in many of these regions are
now Indo-European.156
The Uralic and Altaic language families can be handled more quickly.
The earliest expansions of the Uralic language family are typically traced to

                                                                                                               
151
CHRISTOPHER EHRET, THE CIVILIZATIONS OF AFRICA: A HISTORY to 1800, at 89 (2002).
152
RICHARD J. REID, A HISTORY OF MODERN AFRICA 13 (2d ed. 2012).
153
Id. (“Khoisan, perhaps the oldest of all the continent’s language families, is associated
with the pastoralist and nomadic hunter-gatherers of southern Africa.”).
154
Id. at 183-244 (discussing long history of colonialism in Africa).
155
See ETHNOLOGUE, supra note 12, at 678 (Angola), 679 (Benin), 680 (Botswana), 681
(Burkina Faso), 682-686 (Cameroon), 691 (Congo), 692 (Côte D’Ivoire), 694-697 (Northern
and Southern Democratic Republic of Congo), 700 (Equatorial Guinea and Gabon), 701
(Ghana), 702 (Guinea and Guinea-Bissau), 703 (Kenya), 704 (Lesotho and Swaziland), 705
(Liberia), 706-707 (Mali), 708 (Malawi), 710 (Mozambique), 711 (Namibia), 712-713 (Niger),
714-724 (Nigeria), 725 (Senegal and Gambia), 726 (Sierra Leone), 730-731 (Tanzania), 732
(Togo), 733 (Uganda), 734 (Zambia), 725 (Zimbabwe). The closest thing to an exception
would appear to be South Africa, which—while presently dominated by Indo-European
languages—still has a number of indigenous languages in the region. See id. at 704.
156
See id. at 678 (Angola’s official language is Portuguese), 679 (Benin: French), 680
(Botswana: English), 681 (Burkina Faso: French among others), 682-686 (Cameroon: English
and French), 691 (Congo: French), 692 (Côte D’Ivoire: French), 694-697 (Northern and
Southern Democratic Republic of Congo: French), 700 (Equatorial Guinea and Gabon: Spanish
and French, and French, respectively), 701 (Ghana: English), 702 (Guinea and Guinea-Bissau:
French and Portuguese, respectively), 703 (Kenya: English among others), 704 (Lesotho and
Swaziland: English, among others), 705 (Liberia: English), 706-707 (Mali: French), 708
(Malawi: English among others), 710 (Mozambique: Portuguese), 711 (Namibia: English),
712-713 (Niger: French), 714-724 (Nigeria: English among many others), 725 (Senegal and
Gambia: French and English, respectively), 726 (Sierra Leone: English), 730-731 (Tanzania:
English, among others), 732 (Togo: French), 733 (Uganda: English), 734 (Zambia: English),
725 (Zimbabwe: English).

locations around the Volga, the Ob and the Irtysh rivers.157 Despite the
well-known and near-constant barrage of Indo-European and Altaic groups who
have swept through the steppes and brought foreign rule and foreign
languages with them,158 there are still significant pockets of Uralic speaking
groups around the Volga, the Ob and the Irtysh.159 With regard to the Altaic
language family, the earliest expansions can be traced
to the major river systems of Siberia and Mongolia (viz., near the Yenisei,
the Lena, the Amur and the Vilyuy).160 Once having emerged in these
regions, this language family has persisted there despite
significant periods of Russian rule.161
In Oceania, the Austronesian language family is typically associated
with the first spread of large-scale agriculture through this region.162 Once
established, this language family—like all of the others discussed thus far—
has resisted displacement by a long series of colonial invasions.163 It
should be acknowledged, however, that in some parts of Oceania, these
Austronesian-speaking groups arrived after a somewhat earlier spread of
smaller scale agricultural activities (which was based primarily on taro and
banana) by groups who spoke languages that fall into the Trans New Guinea
Continuum.164 Significantly, however, speakers of these Trans New Guinea

                                                                                                               
157
THE ENCYCLOPEDIA OF INDO-EUROPEAN CULTURE, supra note 122, at 309 (The origins
of the Uralic language family “are variously set to either the regions immediately west or east
of the Southern Urals.”).
158
NICHOLS, supra note 11, at 275.
159
See ETHNOLOGUE, supra note 12, at 824 (Map of Western Asian Russian Federation,
showing significant pockets of Uralic speakers clustered around the northern parts of the Ob
and Irtysh rivers, as well as the far northern parts of the Yenisei); id. at 848 (Map of European
Russian Federation, showing significant pockets of native Uralic speakers at several locations
near the upper Volga river, as well as along numerous other more minor riverine regions to the
North).
160
Peter B. Golden, Some Thoughts on the Origins of the Turks and the Shaping of Turkic
Peoples, in VICTOR MAIR, CONTACT AND EXCHANGE IN THE ANCIENT WORLD 136, 139 (2006)
(describing theories that locate the origins of the Altaic language family in either western
Siberia or the Mongolo-Manchurian region).
161
ETHNOLOGUE, supra note 12, at 824-825 (Map of Western Asian Russian Federation, showing significant portions of these
rivers as dominated by Altaic speakers, even though Russian speakers have engulfed many of
them); id. at 826 (Map of Eastern Russian Federation, showing almost all of the major rivers as
dominated by Altaic speakers, although Russian speakers have begun to dominate some regions
in the southeast along the coast).
162
This wave of migrations into Oceania began around 3500 BC. See Malcolm Ross,
Clues to the Linguistic Situation in Near Oceania Before Agriculture, in THE LANGUAGES OF
HUNTER-GATHERERS: GLOBAL AND HISTORICAL PERSPECTIVES (forthcoming) (manuscript at 4).
163
See, e.g., ETHNOLOGUE, supra note 12, at 782 (Taiwan still dominated by Austronesian
languages in the east, despite Chinese invasions and Chinese rule); id. at 786 (showing
Austronesian languages as dominating most of Indonesia); id. at 808-811 (showing many parts
of Malaysia as dominated by Austronesian languages); id. at 819-822 (showing the Philippines
as dominated by Austronesian speakers); id. at 854-882 (showing Austronesian languages as
still dominating most of the Pacific Islands).
164
Around 9000 BC, the descendants of one of the earliest hunter-gatherer groups to
have migrated into Near Oceania (around 40,000 BC), who spoke early languages of the
Trans New Guinea continuum, learned to grow taro and banana (thereby making a first
transition to a primitive form of agricultural subsistence), and then spread through large parts of
Near Oceania (but not northwest Melanesia) by about 6000 BC. Ross, supra note 162 (manuscript at 2-3).

languages still show up in significant numbers in many of the regions where
they first began to develop these more minor agricultural activities.165
Turning to the Austro-Asiatic language family, this language family
appears to have first expanded primarily from the Salween, the Mekong and
the Irrawaddy rivers in Southeast Asia.166 The most significant displacement
of these languages then occurred when Tai-Kadai speakers moved from
southern China into some of the regions between the Salween and Mekong.167
Various Austro-Asiatic speaking groups have also come under foreign rule in
the historical period by, for example, both the French and the Chinese, and yet
none of these events has altered the basic predominance of Austro-Asiatic
languages along the outer banks of the Mekong and Salween.168 Nor has any
been able to eliminate significant pockets of Austro-Asiatic languages
between these two rivers.169 The Tai-Kadai language family has, for its part,
remained linguistically persistent for as long as its speakers have been in this
region.170 The Tai-Kadai language family appears to have begun to expand
somewhat earlier, however, around some smaller rivers in southeastern
China.171 Some of these groups appear to have been partly displaced by Sino-
Tibetan speakers, but—importantly—there are still significant pockets of Tai-
Kadai speakers in these previous locales.172
I have already discussed the Dravidian language family above. This
language family first shows up in the historical record around the Krishna, the
Godavari and the Kaveri rivers, and it has resisted replacement by Indo-
                                                                                                               
165
See, e.g., ETHNOLOGUE, supra note 12, at 784 (showing Trans-New Guinea languages
as dominating most of the east and some other parts of East Timor); id. at 786 (showing Trans-
New Guinea languages dominating Eastern Papua); id. at 866-878 (showing Trans-New Guinea
languages as dominating many parts of Papua New Guinea, despite some encroachments from
Austronesian speakers from the West).
166
THE CAMBRIDGE HISTORY OF SOUTHEAST ASIA vol. 1, at 109 (Nicholas Tarling, ed.)
(1999).
167
Id. (“[I]t is probable that many of the prehistoric sites of northeast Thailand . . . were
inhabited by speakers of Austroasiatic languages which eventually fell victim to the
assimilating tendencies of the historical Thai kingdoms after the thirteenth century CE.”) (“It is
quite apparent that a once-continuous distribution of Austroasiatic languages over most of
Southeast Asia, and even areas in the Nicobars and possibly northern Sumatra, has been broken
up by the historical expansions of the Chinese, Tai, Vietnamese, Burman and Austronesian
(Malay and Cham) peoples.”).
168
See, e.g., ETHNOLOGUE, supra note 12, at 833 (Map of Vietnam, showing Austro-
Asiatic speakers as dominating most regions to the East of the Mekong, with some incursions
by Tai-Kadai speakers in the far North and Austronesian speakers in the far Southeast); id. at
778 (Map of Cambodia, showing Austro-Asiatic speakers as dominating most of the country
along the Mekong); id. at 829-831 (Maps of Thailand, showing Austro-Asiatic speakers to the
West of the Salween); id. at 806 (Map of Laos, showing significant regions along the Mekong
as dominated by Austro-Asiatic speakers, even though Tai-Kadai speakers currently dominate
many riverfront regions).
169
Id. at 829-831 (Maps of Thailand, showing significant pockets of Austro-Asiatic
speakers despite large-scale encroachments by Tai-Kadai speakers).
170
Id. (Map of Thailand, showing Tai-Kadai speakers as dominating in these regions).
171
Id. at 110 (“The distribution of this language family allows one to infer an original
zone in southeastern China; it was presumably once much more widespread
172
Id. at 780-81 (Maps of Southwestern and Southern China, showing a number of Tai-
Kadai languages—such as Ai-Cham, Biao, Cao Miao, Chadong, Northern and Southern Dong,
Kang, Mak, Maonan, Mulam, Sui, T’en and Lakkia—remaining in the southern parts of China);
id. at 781 (Map of Southern China, showing a number of Tai-Kadai languages—such as
Lingao, Cun, Ge-Yang (Gelao), Hlai, and Jimao—remaining on the Hainan island of China).

European languages from the North ever since.173 The final language family
is Nilo-Saharan, which is one of the smallest (second only to Uralic). Our
historical records are sparser with regard to this last language family, but
it is significant that its languages first show up in the historical record along the upper
Nile, where they still appear in very significant numbers.174
We have now taken a comprehensive look at the historical record from
around the world and over the course of all of world history and the results are
quite striking: there is not one known case in this entire lengthy record in
which a major language family that has reached equilibrium around a major
riverine topography in accordance with the riverine-agricultural model of
linguistic expansion has been completely replaced by a foreign language
family through conversion. These facts provide strong empirical support for
the significant pocket criterion.

d. On the Irrelevance of Certain Well-Known Processes of Linguistic
Replacement by Rome, European Colonialism and the Steppe
Nomads

Given the historical evidence discussed above, one might wonder why
the orthodox view of Indo-European prehistory, which presupposes a process
of complete linguistic replacement in northwestern India that has no known
analogue in world history, might have seemed so plausible to so many people
for so long. Part of the answer to this question surely lies in the fact that we
have not previously been in possession of the riverine-agricultural model of
linguistic expansion. Hence, its role in producing especially persistent
linguistic phenomena could not even have been hypothesized, let alone tested.
Another important part of the answer likely lies in a different fact: we are
all familiar with several very important processes of linguistic replacement,
which have left a particularly striking impression on the modern imagination,
and which can seem to lend more credibility than they should to the type of
linguistic replacement that the traditional view is presupposing. Here I am
thinking of (1) the early expansions of Latin that accompanied the rise of the
Roman Empire and ultimately produced a wider array of Romance languages
in parts of Europe; (2) the more recent displacement of many of the
indigenous languages of the Americas, Australia and New Zealand by
European languages during the period of European colonialism; and (3) the
long series of linguistic displacements that were generated in and around the
steppes by the Turkish and Mongolian nomads who were able to conquer so
many parts of Eurasia beginning in about the 6th century AD. I therefore want
to address these three phenomena and explain why I do not believe they raise
genuine counterexamples to the earlier findings of this article.
Beginning with the Roman Empire, it is not uncommon to picture this
Empire as having produced a fairly straightforward and large-scale
replacement of many different languages with Latin, but a closer look at the

                                                                                                               
173
See Figure 5.
174
See, e.g., ETHNOLOGUE, supra note 12, at 728-729 (Maps of Sudan, showing a broad
array of Nilo-Saharan languages distributed through the Southern regions, even though the
official language of Sudan is standard Arabic).

facts reveals a more interesting and nuanced pattern. At its height, the Roman
Empire spanned an enormous region centered around the Mediterranean, and
the Romans ruled over groups who spoke quite a number of different non-
Latin languages.175 These other groups included Greeks; numerous Afro-
Asiatic speaking groups in Egypt, the Levant and Northern Africa; a range of
Celtic speaking groups in western Europe, mostly south of the Danube and
west of the Rhone; Albanian speaking groups on the coast of the Adriatic Sea;
a number of primarily Iberian groups who spoke pre-Celtic languages; and
(though to a much lesser degree) some Germanic speaking groups as well.176
Of these many different groups, it was the Celts who spoke the
Indo-European languages most closely related to Latin, and some have even
suggested that the Celtic languages and Latin are part of a single “Celtic-
Italic” branch of this language family.177 The ancient Celtic languages of the
time would have also been much closer to Vulgar Latin (which is what
ultimately replaced many of them) than modern Celtic languages are today to
either modern Italian or classical Latin.178 As in the case of the replacement
of Egyptian with Egyptian Arabic, these facts would have therefore greatly
aided mutual intelligibility and linguistic conversion. In
fact, the French language, which is derived from Vulgar Latin, has in some
ways remained phonetically closer to the pre-existing Celtic languages of
northern France than to Vulgar Latin.179
It is therefore highly significant that, of all the linguistic groups that it
conquered, the Roman Empire was only able to displace a range of Celtic
languages and some of the less unified pre-Celtic languages (mostly in Iberia)
through linguistic conversion. Roman rule did not, on the other hand, lead to
the displacement of Afro-Asiatic languages in the Levant, in Egypt or in any
other part of northern Africa.180 Nor did it lead to the displacement of Greek,
Albanian, or Germanic languages, in regions where those more highly
differentiated Indo-European languages had been dominant.181 A closer look
at the patterns of Roman linguistic replacement therefore suggests that these
patterns were limited to cases where either (1) the pre-existing language was a
closely related member of the same branch of the same Indo-European
language family or (2) the pre-existing language was not part of any major

                                                                                                               
175
See BARRY CUNLIFFE, EUROPE BETWEEN THE OCEANS: 9000 BC—AD 1000, at 407
(Yale University Press, 2008).
176
Id. at 364-406.
177
See, e.g., THE ENCYCLOPEDIA OF INDO-EUROPEAN CULTURE, supra note 122, at 100
(“[M]ost attempts to reconstruct the interrelationships of the IE languages tend to group Italic
and Celtic closer to one another than to most other IE languages and there are some today who
wish to resurrect some form of the Italo-Celtic hypothesis.”).
178
STEVEN ROGER FISCHER, HISTORY OF LANGUAGE 120-21 (1999) (describing
replacement of older Celtic languages by Vulgar Latin in northern France, and the emergence
of French from this process).
179
Id. at 121 (“French emerged from Vulgar Latin on a Gaulish substrate, retaining several
Celtic pronunciations . . . .”).
180
See, e.g., ETHNOLOGUE, supra note 12, at 693 (Linguistic Maps of Egypt and Libya,
showing that Afro-Asiatic languages still predominate in the region); id. at 803 (Linguistic
Maps of Jordan and Syria, showing that Afro-Asiatic languages still predominate in the region).
181
Id. at 843 (Linguistic Maps of Greece and Macedonia); id. at 533 (Turkey, showing a
broad range of pre-existing non-romance languages predominating in the region); id. at 545
(Albania, showing that the Albanian language still predominates in this region).

linguistic phenomenon that had emerged in accordance with the riverine-
agricultural model of linguistic expansion. Hence, nowhere do these patterns
show the kind of rapid and complete replacement of one major linguistic
phenomenon by another that is being presupposed by the Indo-Aryan invasion
or migration theorist.
Turning to the New World, it is sometimes hard to reconstruct the pre-
Columbian linguistic situation with confidence.182 What we do know,
however, is the following. There appear to have been two main sources for
the early development of agriculture in the New World. The first was in
Central America, and these developments ultimately led to the emergence of
the Mayan Empire—which spoke languages in the Mayan family.183 The
second main source of agricultural production was centered in the
northwestern parts of South America (including most of modern day Peru).184
These developments led to the emergence of the Inca Empire, which spoke
Quechuan languages.185 The Mayans were ultimately superseded by foreign-
speaking Aztecs in the pre-Columbian period, and the Aztecs were later
conquered by foreign-speaking Spaniards.186 Still, there are significant
pockets of native Mayan speakers in Central America today. Indeed, this
language family presently has at least 6 million native speakers, and is
currently one of the two largest Native American language families.187 With
regard to the Incas, these people were also conquered by foreign-speaking
Spaniards, but, once again, there are still significant pockets of Quechuan
speakers in these former Inca regions.188 Quechuan currently has
approximately 10 million native speakers, and it is the other of the two largest
Native American language families today.189 These facts thus further support
the view that linguistic persistence should have been expected in the Indus
Valley.
There are no other parts of Australia, New Zealand, or the Americas that
show the same kinds of developments in social complexity that we see with
the Mayas, the Aztecs or the Incas, and—at the point of first modern
European contact—many of these regions were inhabited either by primarily
nomadic groups or groups who practiced much more minimal forms of

                                                                                                               
182
See LYLE CAMPBELL, AMERICAN INDIAN LANGUAGES: THE HISTORICAL LINGUISTICS OF
NATIVE AMERICA 4 (1997) (“It is often assumed that masses of [Native American] languages
have disappeared without a trace, and indeed many have become extinct since European
contact . . . .” (citation omitted)); id. at 170 (describing special difficulties in reconstructing the
pre-Columbian linguistic situation in South America).
183
See MARCEL MAZOYER & LAURENCE ROUDART, A HISTORY OF WORLD AGRICULTURE:
FROM THE NEOLITHIC AGE TO THE CURRENT CRISIS 74-75 (2006) (listing Central America as
one of the four main expanding centers of Neolithic agricultural transitions); OXFORD
COMPANION TO ARCHAEOLOGY 406 (Brian M. Fagan ed., 1997) (describing development of
Mayan Civilization from early agricultural settlements in Central America).
184
See MAZOYER & ROUDART, supra note 183, at 74, 189-216 (discussing the Incan
Agrarian system, and listing this as an additional but largely non-expanding center for the
development of agriculture).
185
Id.
186
J.M. ROBERTS, supra note 117, at 485.
187
ETHNOLOGUE, supra note 12, tbl. 5 at 30 (listing 6,038,182 native Mayan speakers).
188
See id. at 741, 748-50, 762-63.
189
Id. tbl. 5, at 31 (listing 10,127,900 native Quechuan speakers).

agriculture.190 Hence, there are no other regions that are historically attested
to have produced linguistic phenomena that are directly relevant to our
present analysis. Still, this fact may be the product of our relatively poor
historical records relating to these regions. It is therefore worth noting that,
while European languages have undoubtedly displaced a large number of
indigenous languages in many of these other regions, these linguistic
replacements have most often taken place—tragically enough—as parts of
larger processes that have included the wholesale or near wholesale
replacements of pre-existing populations. Even given this tragic history, there
are, moreover, still significant pockets of many of these other indigenous
languages in many of these other regions.
Stepping back, what we can therefore safely say about the New World is
the following: there is not one historically recorded case in these regions in
which a foreign group has been able to completely replace through conversion
a major language family that has been produced and reached equilibrium in
accordance with the riverine-agricultural model of linguistic expansion.
Although this fact may reflect a weakness in our historical records, we can
also say that there is not one case in which the replacement of a major
linguistic phenomenon could have taken place without the near complete
replacement of pre-existing populations, and there is a good chance that the
language family in question still remains in significant pockets.191 The Indo-
Aryan invasion or migration theorist no longer claims a process of near
complete population replacement, however, and so there is nothing in the
evidence from the New World that will support the Indo-Aryan invasion or
migration theorist’s assumptions.
The final case that I want to discuss relates to the well-known series of
Turkish and Mongolian nomads who began to sweep through the Eurasian
steppes beginning in about the 6th century AD and were ultimately able to
conquer large parts of Eurasia. The record of conquests by these groups is
nothing less than spectacular, and these events generated a major expansion of

                                                                                                               
190
The closest exception would appear to be the development of certain agricultural
societies around the Mississippi. See THE OXFORD COMPANION TO ARCHAEOLOGY, supra note
183, at 475-76 (entry on “Mississippian Culture”). The archaeological record suggests that,
from about 1000 to 1540 AD, a number of sedentary farming groups emerged in the interior
riverine regions in eastern North America, primarily around the Mississippi River and its
tributaries. Id. Some of these settlements developed notable forms of social complexity,
though most were relatively short-lived, and none produced any urban civilizations comparable
to the Incas or Mayans. Id. In addition, although “when Hernandez de Soto first entered the
Mississippi Valley [and encountered some of these groups], his chroniclers described highly
populated centers,” “[w]hen French explorers entered the Mississippi Valley some 100 years
later, the region was virtually vacant.” Id. at 476. The most common explanation for this is
epidemiological: “Like all Native Americans, Mississippian people had no immunity to
European infectious diseases. Because they were living in such high population densities, they
were particularly vulnerable to recurring epidemics.” Id.
191
See ETHNOLOGUE, supra note 12, tbl. 5 at 28-32 (listing a broad range of Native
American language families that still exist in significant pockets in the Americas, including
Algic, Arauan, Araucanian, Arawakan, Aymaran, Barbacoan, Chibchan, Choco, Eskimo-Aleut,
Guahiban, Huavean, Iroquoian, Jivaroan, Keres, Kiowa Tanoan, Lakes Plain, Macro-Ge,
Mascoian, Mataco-Guaicuru, Misumalpan, Mixe-Zoque, Muskogean, Na-Dene, Oto-
Manguean, Panoan, Peba-Yaguan, Penutian, Salishan, Salivan, Siouan, Tacanan, Tarascan,
Tequistlatecan, Totonacan, Tucanoan, Tupi, Uto-Aztecan, Witotoan and Yanomam).

Altaic languages throughout many parts of Eurasia.192 These events also
undoubtedly led to the displacement of many pre-existing languages from
many regions, and so Indo-Aryan invasion or migration theorists might try to
look to these events for a model of the type of linguistic replacement they are
assuming.
Here too, however, a closer look at the record reveals a more complex
and nuanced pattern of linguistic replacement. Although Altaic speaking
groups have sometimes been able to completely displace the pre-existing
languages in certain sparsely-populated regions like the Eurasian steppes,193
these groups have almost always ultimately adopted the languages (and many
of the social and cultural traditions) of the more populous and settled
agricultural groups that they have conquered. For example, the Altaic rulers
of Iran all eventually became Persianized and left the pre-existing Iranian
languages dominant in the region;194 the Altaic rulers of India all eventually
became Aryanized and left the pre-existing Indo-Aryan languages dominant
in the region;195 the Altaic rulers of China all became Sinicized and left the
pre-existing Sino-Tibetan languages dominant in the region;196 and most of
the Altaic rulers of Egypt and Mesopotamia ended up becoming Arabicized
and all left the pre-existing Afro-Asiatic languages dominant in these
regions.197
The closest exception is Turkey, where Turkish (rather than one of
the pre-existing major languages) is now the dominant language spoken in the
region owing to the reign of the Ottomans.198 Even here, however, Turkish
co-exists with a number of pre-existing major languages from the
region (including Greek and Arabic), and has not altogether displaced them.199
Hence, these well-known historical events cannot be used as a model for the
type of linguistic replacement that the Indo-Aryan invasion or migration
theorist is presupposing.
We can now see that, in fact, the entire world historical record speaks
against the type of rapid and complete linguistic replacement by conversion
that the Indo-Aryan invasion or migration theorist is presupposing. The
record shows that the major linguistic phenomena that have reached
                                                                                                               
192
For a thorough description of the Turkic and Mongolian invasions that created these
expansions of Altaic languages, see generally GROUSSET, supra note 38.
193
ETHNOLOGUE, supra note 12, at 774-75 (Linguistic Maps of Afghanistan and Azerbaijan,
showing predominance of Altaic languages only in Azerbaijan and the far northern regions of
Afghanistan), 804-804 (Linguistic Maps of Kazakhstan and Kyrgyzstan), 824-26 (Linguistic Maps
of Asian Russian Federation, showing Altaic linguistic predominance primarily in Siberia and
far northern regions that are very sparsely populated), 832 (Linguistic Maps of Turkmenistan
and Uzbekistan), 848-49 (Linguistic Map of European Russian Federation, showing Altaic
predominance only in some small regions near the upper Volga).
194
GROUSSET, supra note 38, at xxviii (“[T]he process of Islamization and Iranization
among the Turkish conquerors of Iran and Anatolia forms an exact counterpart to the Sinicizing
noted among the Turkic, Mongol, or Tungus conquerors of the Celestial Empire.”).
195
J.M. ROBERTS, supra note 117, at 429.
196
GROUSSET, supra note 38, at xxviii.
197
J.M. ROBERTS, supra note 117, at 317-32, 372-92; ETHNOLOGUE, supra note 12, at 693
(Linguistic Maps of Egypt and Libya, showing continued predominance of Afro-Asiatic
languages), 803 (Syria and Jordan, showing continued predominance of Afro-Asiatic
languages), 457-61 (showing Iraq and Israel as still dominated by Afro-Asiatic languages).
198
ETHNOLOGUE, supra note 12, at 533-35, 577-78.
199
Id.

equilibrium in accordance with the riverine-agricultural model of linguistic
expansion have never once in known history been completely replaced by a
distinct major language family through a process of linguistic conversion in
one of their original regions of major riverine expansion. This is true even
though the world historical record also contains numerous examples of
invasions by foreign groups in these same regions.

e. A Final Aside on the Somewhat Singular Case of Old Europe

Thus far, I have been limiting my attention to the historical record.
Before continuing, I would nevertheless like to acknowledge that there may
be one single unrecorded event in prehistory that—depending on how one
interprets it—might provide a single counterexample to the patterns of
linguistic persistence currently under discussion. Here, I am speaking of the
destruction of the early settlements of Old Europe, which first began to
emerge around the lower Danube and Dnieper rivers, but were decimated by
about 3500 BC.200 If we assume that the people of Old Europe spoke non-
Indo-European languages, then the subsequent immigration of Celtic-Italic
groups into these regions beginning in around 3300 BC might be construed as
a case in which the language (or languages) of Old Europe had been
developing around a major riverine topography but were completely replaced
by a foreign language family. As the reader will remember, Figure 3-B
depicts these relevant events, and contrasts them with contemporaneous
events in the Indus Valley.
Even if we were to assume that the people of Old Europe spoke non-
Indo-European languages, the radically different archaeological records in the
Danube-Dnieper and Indus-Sarasvati Valleys during these early periods (as
discussed above and depicted in Figure 3-B) suggest that this case cannot
actually lend much support to the Indo-Aryan invasion or migration theorist’s
assumptions. The settlements of Old Europe had only begun to emerge into
something that might have represented an incipient major language family
when, beginning in 4000 BC and then ending in 3500 BC, these settlements
were all destroyed.201 At most, this case was therefore one in which Celtic-
Italic speaking groups replaced a relatively minor linguistic phenomenon
through a process of near complete population replacement. For reasons
already discussed, processes like these are very unlike the one the Indo-Aryan
invasion or migration theorist is assuming.
In addition, there is another way to interpret the prehistoric events in
Old Europe, which would render them perfectly consistent with the world
historical record from every other relevant region and time period. In
particular, one might propose that certain western branches of Indo-Hittite
speaking groups split off from their Proto-Indo-European brethren to the east
and began to move into Anatolia and the Balkans at a very early point in time.
These groups would have then formed the basis for the Anatolian branch of
the Indo-European language family—and perhaps some other distinctive

                                                                                                               
200
See David W. Anthony, The Rise and Fall of Old Europe, in THE LOST WORLD OF OLD
EUROPE: THE DANUBE VALLEY, 5000–3500 BC, at 29, 35–54 (David W. Anthony & Jennifer Y.
Chi eds., 2010).
201
Id.

branches. If so, then the inhabitants of Old Europe may have spoken
languages that were not so distant from the early Proto-Indo-European
languages that began to replace them in about 3300 BC. On this view, these
new Indo-European groups would have therefore been displacing a set of
closely related languages that were part of the same major language family,
and in a process that would have resembled the replacement of both ancient
Egyptian with Egyptian Arabic and Celtic with Vulgar Latin.
One benefit of this latter interpretation is that it would help to explain
why not only Anatolian languages but also a number of other seemingly
distinctive (but extinct) Indo-European languages have been found in and
around the Balkans and Anatolia during ancient times.202 A second benefit is
that it would render the prehistoric events in Old Europe consistent with the
very uniform patterns of linguistic replacement that we have seen in every
other part of the world and in every other portion of world history. I therefore
favor this latter interpretation, but—even if it were rejected—this single case
does not provide very much real support for the plausibility of the Indo-Aryan
invasion or migration theorist’s assumptions for all of the other reasons
described above. This is especially true when one remembers just how minor
this single (ambiguous, unrecorded and prehistoric) case is in comparison to
the extensive historical record discussed in this section; and how minor a
social phenomenon Old Europe was in comparison to the Indus Valley
Civilization.

4. Telling a More Complete Story about Indo-European Prehistory

On the view I have been arguing for here, the Indus Valley Civilization
should therefore be understood as having spoken dialects of Proto-Indo-
European that are directly ancestral to the Indo-Iranian languages that
currently dominate the northern parts of the Indian subcontinent. The Indo-
Iranian branch of Indo-European is, however, only one of its many branches,
and historical linguists have been able to reconstruct what they believe to be
the most plausible phylogenetic relations between this branch and the rest.
These relations have a well-known tree-like, branching structure, which is
represented in Figure 7. Because of this tree-like structure, claims about
specific branches of the Indo-European language family will have
implications for the rest. In order to help visualize some of these
implications, I have therefore integrated the claim that Proto-Indo-European
dialects were spoken in the Eastern-Iran-Bactria-Indus-Valley region into this
same diagram.

                                                                                                               
202
See, e.g., THE ENCYCLOPEDIA OF INDO-EUROPEAN CULTURE, supra note 122, at 145-46
(Dacian), 287-89 (Illyrian), 361 (Macedonian), 378-79 (Messapic), 418-20 (Phrygian), 423-24
(Picene), 575-77 (Thracian), 620-22 (Venetic).

Given these facts, one of the most common sources of resistance to the
possibility that the Indus Valley Civilization might have spoken dialects of
Proto-Indo-European arises from the perceived implications of this claim for
various other branches of this family. Put most simply, the worry is that
acceptance of this claim will bring with it certain necessary implications about
Indo-European prehistory that will prove difficult to square with an extensive
body of evidence relevant to this larger topic. In this section, I would
therefore like to address this common worry.
The most direct way to respond to a worry like this would be to
articulate a broader story about Indo-European prehistory, which not only
contains the claim that the Indus Valley Civilization spoke dialects of Proto-
Indo-European but is also consistent with the much broader body of evidence
relevant to this larger topic. In my recent work, I have been developing just
such a story. I have also been arguing, at some length, that this new story is
actually better able to explain (or at least render coherent) this larger body of
evidence than the current alternatives.203 In what follows, I will therefore
present a brief sketch of this new story, and describe some of its main

                                                                                                               
203
See Robin Bradley Kar, On the Early Eastern Origins of Western Law and Western
Civilization: New Arguments for a Changed Understanding of Our Earliest Legal and Cultural
Origins, University of Illinois Law Review vol. 2012 (forthcoming 2012), available at
http://ssrn.com/abstract=2039500 (Part 1), http://ssrn.com/abstract=2039502 (Part 2),
http://ssrn.com/abstract=2039504 (Part 3).

explanatory advantages and potential. These arguments will reveal that, once
embedded in the right narrative, the main linguistic claim of this article can be
understood as the beneficiary of a much broader and more extensive form of
evidentiary support than has been discussed thus far.
I have broken this new story into four basic stages, which I call the
“Primal Age,” the “Age of Expansion,” the “Age of Dissolution,” and the
“Historical Age,” respectively.

a. The Primal Age (75,000 BC – 3500 BC)

The Primal Age begins with the rise of behaviorally modern humans
(most likely in east Africa in or around 75,000 BC)204 and then lasts until
about 3500 BC. For the greater part of this long period, human beings had not
yet developed agriculture, and our best evidence suggests that humans tended
to live in relatively small, nomadic social formations, at very low population
densities.205 These conditions tend to produce extreme
linguistic diversity,206 and we should therefore expect that for most of the
Primal Age, all humans would have spoken languages that were relatively
minor in scale and would have tended to diverge from others outside a fairly
small geographic area over time. As a result, there would have been a great
many language families, no one of which would have represented a very
sizeable percentage of the world’s population. The groups who spoke
languages directly ancestral to Proto-Indo-European would have fallen into
this same category, and we should therefore expect that these languages—like
all others—would have represented only minor linguistic phenomena for most
of this time.
Shortly after the first development of agriculture in the Fertile Crescent,
however, some groups who spoke languages directly ancestral to early Proto-
Indo-European began to absorb these agricultural technologies—on the
present view.207 This most likely took place through an intermediary, and it
was from this intermediary that the earliest Indo-European groups would have
absorbed both a great number of terms for agricultural technologies and some

                                                                                                               
204
Ian Tattersall, Human Origins: Out of Africa, PROCEEDINGS OF THE NATIONAL
ACADEMY OF SCIENCES USA vol. 106, at 16020 (2009).
205
Adam Powell, Late Pleistocene Demography and the Appearance of Modern Human
Behavior, SCIENCE vol. 324, at 1298 (2009).
206
Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion
from Africa, SCIENCE vol. 332, at 346 (2011).
207
I say this in part because the first agricultural activities that spread into the Indus
Valley from Mehrgarh appear to have represented an early eastward expansion of practices
developed in the Fertile Crescent into the region. ALLCHIN & ALLCHIN, supra note 3, at 137
(“We should not forget that the spread of agriculture we are discussing is essentially the spread
of the highly successful pattern of wheat and barley, cattle, sheep, and goat, which we have
seen emerge in the piedmont zone at Mehrgarh; and that this pattern appears to have been
underlying the whole process of expansion, leading up to the emergence of the Mature
Harappan civilization.”); ENCYCLOPEDIA OF HUMAN EVOLUTION AND PREHISTORY 98 (Eric
Delson et al. eds., 2000) (mentioning wheat and barley as the primary plants domesticated in
the Fertile Crescent region, and also mentioning the domestication of the sheep, goat, and
cattle).

other Semitic and Sumerian loan words.208 Johanna Nichols has presented
linguistic evidence to suggest that these people were most plausibly located
just south of the Caspian Sea at this early time,209 and I have followed her
suggestion in this regard.
Some of these pre-Proto-Indo-European speaking groups would have
then begun to migrate westwards through Anatolia and into the Balkans, on
the present view, where they would have formed the basis for the Anatolian
branch of the Indo-European language family—and possibly some other
related branches that are now extinct. (I have discussed some of my reasons
for this belief at the end of the last section.) Other members of these pre-
Proto-Indo-European speaking groups would have then begun to move
eastwards into Bactria, eastern Iran, Mehrgarh, and, ultimately, into the Indus
Valley, where they would have formed the basis for the Proto-Indo-
European dialects.210
Once agriculture began to develop around the major riverine systems
that connect the larger Eastern-Iran-Bactria-Indus-Valley region, this entire
region would have then begun to turn these early Proto-Indo-European
dialects into a much larger and more highly coordinated linguistic
phenomenon. Beginning in about 4500 BC, and lasting until about 1900 BC,
these river systems would have thus begun to serve as the most central, the
most significant, and the most enduring focal point for the prehistoric coordination
and expansion of the Indo-European language family (and, incidentally, for
the development of several key Indo-European cultural innovations that have
made subsequent Indo-European groups particularly well adapted to
transitioning into and sustaining large-scale societies with the rule of law).
These predictions were depicted in Figure 4 above.
In the period leading up to 3500 BC, the Indus Valley
Civilization had not yet entered into its first period of incipient urbanism.211
Even at this early time, however, the riverine-agricultural model of linguistic
expansion predicts that Proto-Indo-European speaking groups would have
begun to exhibit a range of different material cultures and forms of life.
Some, for example, would have almost certainly begun to engage in primarily
pastoralist and more nomadic forms of subsistence—particularly in certain
hilly regions like the Hindu Kush and in the regions bordering (or entering
into) the Eurasian steppes and the central Iranian plateaus. Other Proto-Indo-
European speaking groups would have begun to engage in more sedentary and
agricultural forms of life—especially closer to the major riverine centers of
these expanding socio-cultural complexes.

                                                                                                               
208
See Johanna Nichols, The Eurasian Spread Zone and the Indo-European Dispersal, in
ARCHAEOLOGY AND LANGUAGE II: CORRELATING ARCHAEOLOGICAL AND LINGUISTIC
HYPOTHESES 220–266 (R. Blench & M. Spriggs eds., 1998).
209
Id.
210
The archaeological record suggests that some seasonal agricultural activities began in
Mehrgarh by around 7000 BC, but that there was a biological break in the record around 4500
BC, after which time more persistent agricultural activities began to spread from Mehrgarh
through the Sindhu-Sarasvati Valley. See ALLCHIN & ALLCHIN, supra note 3, at 113–83. I
associate these later groups with Proto-Indo-European speakers, though the earlier groups in
Mehrgarh may have spoken other languages.
211
Territorial expansion and primary state formation, PROCEEDINGS OF THE NATIONAL
ACADEMY OF SCIENCES USA vol. 107, at 7124 (2010).

Because there is a reconstructible Proto-Indo-European term for “horse,”
many have thought to locate the Proto-Indo-European homeland in the
steppes, where the horse was first domesticated.212 During this early period,
however, a Proto-Indo-European term for “horse” would have plausibly
spread throughout the larger socio-cultural complex that was developing in
the Eastern-Iran-Bactria-Indus-Valley region, because the Proto-Indo-
European groups who lived directly adjacent to the steppes (such as in ancient
Bactria) would have been familiar with the animal. On the present view, it
was, moreover, during this very early period that some of the groups near
Bactria would have begun to separate and migrate to the northeast along the
Tien Shan mountain range, where they would have formed the basis for the
Tocharian branch. On the present view (unlike many others), the far eastern
location of the Tocharians is therefore easy to understand. The Primal Age is
depicted in Figure 8 below.

b. The Age of Expansion (3500 BC to 1900 BC)

The second major stage—or the “Age of Expansion”—began in around
3500 BC and lasted until about 1900 BC. There are three reasons why this
period differed from the last one. First, in about 3500 BC, the Indus Valley
Civilization began to enter into its first period of incipient urbanism, which
would have greatly increased its political, economic, cultural and linguistic
influence within the larger Eastern-Iran-Bactria-Indus-Valley region.213

                                                                                                               
212
ANTHONY, supra note 22, at 91.
213
ALLCHIN & ALLCHIN, supra note 3, at 113-83.

Second, by 3500 BC, the early proto-urban settlements of Old Europe around
the lower Danube and Dnieper had all been destroyed.214 Third, in around
3400 BC, the wheeled wagon first began to show up robustly in the
archaeological record of the steppes.215 Before the invention of the wheeled
wagon, the steppes would have been very sparsely populated, and would have
almost certainly displayed a very great deal of linguistic diversity.216
Wheeled wagons can, however, be pulled by horses (which had been
domesticated in the steppes a bit earlier—probably in about 4200 BC), and so
the invention of the wheeled wagon meant that much larger groups (including
families) could begin to migrate through the steppes and engage in forms of
pastoralism that extended much further away from major riverine valleys.217
These developments essentially connected up the western parts of the steppes
(viz., in Eastern Europe near the mouth of the Danube and Dnieper rivers)
with those portions of the steppes that are directly adjacent to Bactria—thus
opening up the “northern route” for migrations between the Indus-Sarasvati
and Danube-Dnieper Valleys that was depicted in Figure 3-B. From this time
on, the steppes were no longer sparsely populated regions that could only be
traversed by horseback. They instead became potential conveyor belts for
larger-scale pastoralist populations and migrations.
I call this second age the “Age of Expansion,” because it also marked the
beginning of the period when the steppes first transformed into what Johanna
Nichols has called a “linguistic spread zone.”218 Linguistic spread zones are
special regions, where many populations interact over extremely wide
geographic ranges219 and which tend to produce a succession of equilibrium
languages, each of which replaces its predecessor as it begins to reach
equilibrium in the region.220 The languages spoken in linguistic spread zones
therefore tend to have shallow historical roots and display very little structural
diversity.221 At any given time, these equilibrium languages also tend to
follow some “center of cultural, political, and/or economic influence,”222 and
these languages often serve “as a lingua franca for the entire area or a large

                                                                                                               
214
Anthony, supra note 200, at 45–51 (discussing first wave of destruction that occurred
around 4300–4100 BC); id. at 51–53 (discussing gradual abandonment between 4000–3500
BC).
215
ANTHONY, supra note 22, at 74–75 (“Regardless of where the wheel-and-axle principle
was invented, the technology spread rapidly over much of Europe and the Near East between
3400 and 3000 BCE. Proto-Indo-European speakers talked about wagons and wheels using
their own words, created from Indo-European roots.”).
216
Id. at 282–85 (discussing early trade in Steppe culture); see also id. at 336 (“The
Yamnaya horizon developed in the . . . steppes largely because an innovation in land transport,
wagons, was added to horseback riding to make a new kind of herding economy possible. At
the same time, an innovation in sea transport . . . probably was responsible for the . . . initial
development of the northwest Anatolian trading communities.”).
217
Id. at 300.
218
NICHOLS, supra note 11, at 22-24.
219
Id. at 16–17.
220
Id. at 17–20.
221
Id. at 17.
222
Id.

part of it.”223 Still, these centers of influence “may shift as political and
economic fortunes shift.”224
It should be clear from this description that the major riverine
topographies that have helped produce the world’s major language families
are not linguistic spread zones in Nichols’s sense, because all of the major
language families that have emerged from these regions have proven
extraordinarily resistant to replacement. The Eurasian Steppes, on the other
hand, have functioned as a paradigmatic linguistic spread zone for
millennia.225 Nichols notes that “[t]hroughout the entirety of traceable
linguistic history . . . all or most of the steppe has been dominated by a single
language family, and often a single language has covered most of it.”226 She
observes that Iranian languages dominated the Steppes for approximately two
millennia beginning in about 2000 BC; Turkic languages then succeeded
Iranian languages in the Steppes beginning with the Turkish invasions of the
6th century AD; and Mongolian languages then succeeded Turkic languages
in the Steppes beginning in early medieval times.227 Hence, the historical
record from the Steppes shows that, periodically, “a new linguistic group
sweeps westward from the vicinity of Mongolia, rapidly attains military and
cultural hegemony on the Steppe (and simultaneously also in the deserts of
Central Asia and the plains of northern Mesopotamia and Anatolia), and
replaces the previous language or language famil[ies]” in these regions.228
When the Eurasian steppes first began to transform into a linguistic
spread zone, there was, however, only one major center of political and
economic power adjacent to the relevant steppe regions (i.e., west of the Altai
mountains). This was the major socio-cultural complex that had been
expanding, on the present view, to produce a highly coordinated set of
dialects throughout the larger Eastern-Iran-Bactria-Indus-Valley region. As I
have said earlier, I associate this language family with Proto-Indo-European,
and I therefore believe that, beginning in about 3300 BC, Proto-Indo-
European languages most plausibly began to spread through these western
steppe regions. These developments are depicted in Figure 9.
From this period on (until about 1900 BC), we should think of this
newly combined region—which includes not only the steppes west of the
Altai mountains but also the Eastern-Iran-Bactria-Indus-Valley region—as
containing a closely related set of Indo-European dialects that were
nevertheless spoken by groups that displayed a much more diverse range of
material cultures and subsistence patterns. This is, in fact, precisely how
things looked when the Indo-European language family first came into
historical focus, because—beginning from the earliest time that we have
written records until about the 6th century AD—all of these regions were
dominated by groups who spoke closely related Indo-Iranian languages. At
the same time, however, some of these Iranian groups, such as the Scythians,
                                                                                                               
223
Id.
224
Id.
225
Id. at 15.
226
Id.
227
Id.
228
Id.

tended to display more nomadic and pastoralist forms of life while others,
such as the Medes and Persians, tended toward more settled urban
empires.229 The present story merely suggests that we project a similar pattern
back into prehistory, when—beginning in about 3300 BC—it would have
been closely related dialects of Proto-Indo-European rather than Indo-Iranian
that dominated these regions.

David Anthony has also presented a wealth of archaeological evidence to
suggest that some of these early groups in the steppes (which he associates
with the “Yamnaya horizon”) then began to travel into the Balkans and up the
Danube River, where he believes they formed the basis for the Celtic-Italic
branch.230 Beginning in about 2900 BC, a later group from the steppes—
which shows up as the Corded Ware culture—then began to settle in some of
the regions to the north of the Celtic-Italic speakers, where they eventually
formed the basis for the Germanic branch on his view; and then later groups,
who began to settle along the Dnieper River, ultimately formed the basis for
the Balto-Slavic branch.231 I accept all of Anthony’s archaeological
arguments in favor of these claims, though—on the present view—the various
stages of Proto-Indo-European that produced these branches, and that
dominated the steppes during these successive periods of time, would have

                                                                                                               
229
CUNLIFFE, supra note 175, at 302-09 (describing Scythians); Peter Turchin, A Theory
for Formation of Large Empires, 4 J. GLOBAL HIST. 191, at 202–03 tbl.2 (2009).
230
ANTHONY, supra note 22, at 225–458.
231
Id.

initially remained highly coordinated with a larger set of Proto-Indo-European
dialects that were centered in the Eastern-Iran-Bactria-Indus-Valley region.
Anthony himself is, however, an Indo-Aryan invasion or migration
theorist. Let me therefore pause for a moment to discuss an important
methodological point, which will help to explain why I am willing to rely on
Anthony’s archaeological evidence up until this point but am inclined to reject
his (further) view that the first Proto-Indo-European dialects to expand
through the steppes also originated in the steppes. Anthony is an
archaeologist, and it is a well-known methodological fact that changes in
material culture, which can be discerned in the archaeological record, do not
necessarily reflect changes in language (or vice versa).232 Hence, in order to
draw valid linguistic inferences from material findings in archaeology, one
needs some further theory that is empirically well supported and would
license the inference in question.
Anthony is well aware of this fact, and he has therefore developed the
very helpful notion of a “persistent material culture frontier,” which is a
geographical boundary (which is sometimes fluid) that can show any number
of material-culture changes or continuities on either side, but that nevertheless
maintains consistent material-cultural oppositions between both sides.233
Based on empirical evidence from a number of known material culture
frontiers, Anthony has observed that “[l]anguage is strongly associated with
persistent material-culture frontiers.”234 Hence, if archaeologists can find
evidence of persistent material-culture frontiers in a given region, then that
fact should provide some evidence that the people within the frontiers spoke a
common language. I accept this as an extraordinarily helpful methodological
point, and it is by reference to this theory that Anthony is able to infer that the
so-called Yamnaya horizon (which first formed in the Eurasian steppes
around 3300 BC and reflected the first widespread material culture frontier to
develop in the steppes) most plausibly spoke a related set of dialects.235 By
tracing continuous archaeological developments that show that some of these
Yamnaya groups then began to migrate up the Danube River, where they show
up in precisely the same locations as the first historically attested Celtic-
speaking groups, Anthony is also able to provide strong empirical evidence
that the Yamnaya horizon spoke Proto-Indo-European dialects ancestral to the
Celtic branch.236 As noted above, he has provided similar accounts of the
origins of the Germanic and Balto-Slavic groups,237 and I have accepted all of
this evidence and all of these arguments.
At the same time, however, it is a further claim to assert that the
languages of the Yamnaya horizon originated in the steppes. Despite
Anthony’s highly illuminating theory of persistent material culture frontiers,
he has, moreover, offered no comparable theory that would license this further
                                                                                                               
232
Philip Kohl, Perils of Carts before Horses: Linguistic Models and the Undermined
Archaeological Record, AMERICAN ANTHROPOLOGIST vol. 111, at 109 (2009).
233
Id. at 102–121 (developing theory of persistent material culture frontiers).
234
Id. at 319–39 (applying the theory of persistent material culture frontiers to the
archaeological record to suggest Yamnaya horizon was a material culture frontier, and that they
therefore spoke common languages).
235
Id. at 223, 311.
236
Id. at 344, 367.
237
Id. at 367–68, 375–85.

claim, and this claim is therefore both methodologically and empirically
unsound. The riverine-agricultural model of linguistic expansion, by contrast,
is both methodologically sound and empirically well supported, and it
identifies the Eastern-Iran-Bactria-Indus-Valley region as the most plausible
location for the earliest expansions of Proto-Indo-European. Nichols’s theory
of linguistic spread zones is also methodologically sound and empirically well
supported, and it suggests that languages have tended to spread through the
steppes from contemporaneous centers of economic and political power.
Nichols’s theory thus further supports a Bactrian origin for the Proto-Indo-
European dialects that first spread through the steppes.
If we reject Anthony’s assumption that Proto-Indo-European dialects
began in the steppes, and replace it with the more methodologically sound and
empirically well supported view that Proto-Indo-European dialects spread into
the steppes from the Eastern-Iran-Bactria-Indus-Valley region beginning in
around 3300 BC, then it should also be noted that all of Anthony’s extensive
archaeological evidence is actually supportive of the new story I have been
describing here.238 This includes evidence he has found to suggest that
certain archaeological findings in the steppes—such as the Andronovo and
Sintashta sites—may have been Proto-Indo-Iranian speaking, based on the
similarities between their material cultures and some described in the
Vedas.239 I say this because this evidence cannot rule out the possibility that
Proto-Indo-Iranian languages were already spoken much more widely by this
time, among groups with a range of more or less related material cultures.
Hence, I view the current story as capable of absorbing all of Anthony’s very
extensive archaeological evidence relating to Indo-European prehistory.
Turning to those regions south of the steppes and south of the Black and
Caspian seas, some of the earliest Proto-Indo-European dialects would have
presumably tended to form a dialect continuum stretching from eastern
Iran to northwestern India. At some point, perhaps around 2400 BC,240 some
of the western groups in this continuum would have then begun to split off
from the Eastern-Iran-Bactria-Indus-Valley complex and move further to the
west, where they would have laid the foundations for the Greco-Armenian
branch. Some of these groups would have probably settled near Armenia at
first, where they would have formed the basis for the Armenian sub-branch.
Others would have then moved even further west into Asia Minor, along the
southern and western Anatolian coast and, ultimately, onto the Greek
mainland. These last groups would have formed the basis for the Greek-
speaking Mycenaean Civilization, which shows up in the archaeological
record of Greece in about the middle of the second millennium BC.241
Up until this time, the Proto-Indo-European language family and its
branches would have most plausibly exhibited “centum” features on the

                                                                                                               
238
For a more detailed elaboration of this argument, see Kar, supra note 203, at Section
4(E)(2)(ii).
239
See ANTHONY, supra note 22, at 371–411.
240
I use this date as an approximation in order to place this branching event
somewhere between the Germanic and Balto-Slavic branches, as one would expect if these
events were to track the branching structure of the Indo-European language family as depicted
in Figure 7.
241
See CUNLIFFE, supra note 175, at 196-200.

present view.242 Shortly after the Greek separation from the Armenian
branch, however, the main stalk of Indo-European dialects (which was
distributed primarily in the steppes and in the Eastern-Iran-Bactria-Indus-
Valley region) very likely began to undergo a process of “satemization” on
the present view. This process would have therefore left unique linguistic
effects on this main stalk, which would have included the ancestors of the
Balto-Slavic branch and the Indo-Iranian branch, along with the Armenian
branch (due to its continuing geographic proximity with Iranian dialects). All
of the other Indo-European branches—despite their diverse geographical
locations—would have nevertheless continued to exhibit centum features.
The present view would thus explain why the Tocharian, Celtic, Italic, Germanic
and Greek branches of Indo-European are “centum” languages, despite their
diverse geographic locations, whereas the Baltic, Slavic, Indo-Iranian and
Armenian branches are “satem” languages.243 This entire set of developments
is pictured in Figure 10 below.

                                                                                                               
242
The division between so-called “Centum” and “Satem” branches of the Indo-European
language family refers to the different ways that three dorsal consonant rows from the reconstructed
Proto-Indo-European dialects ultimately evolved. MALLORY & ADAMS, supra note 69, at 46–
48. All of the following branches fall within the “Centum” category: Tocharian, Celtic, Italic,
Germanic, and Greek. All of these other branches fall into the “Satem” category: Iranian, Indo-
Aryan, Baltic, Slavic, and Armenian.
243
Many theorists have found it difficult to account for the geographic locations of the
centum and satem groups. See, e.g., ANTHONY, supra note 22, at 264, 308 (expressing
difficulty in accounting for these geographic divisions); BRYANT, supra note 74, at 147 (“This
neat east-west division, however, was short-lived. A centum Indo-European language called
Tocharian was found as far east as Chinese Turkestan (Xinjiang)”).

c. The Age of Dissolution (1900 BC through ~800 BC)

Let us now turn to the period beginning in about 1900 BC. We now
have highly credible evidence that, beginning in about 1900 BC, the ancient
monsoon-fed river that formed the main lifeline for the Indus Valley
Civilization (and which I earlier associated with the Vedic “Sarasvati”) began
to shorten and become less and less conducive to agricultural production.244
As this happened, some of the urban Indus Valley sites were abandoned, and
new settlements began to appear and cluster further and further toward the
upper regions of this monsoon-fed river, which remained conducive to
agricultural productivity.245 The upper reaches of this river system also ran
very close to the upper portions of the Yamuna and Ganges rivers, and—as
the Sarasvati began to shrink even further—the archaeological record shows
further changes in settlement patterns as groups from the Indus Valley began
to move even further eastwards along the Gangetic Plain.246 These events also
appear to have involved an initial decline in the social complexity and
urbanization in these regions, and to have incentivized some increased
reliance on pastoralist forms of subsistence.247
When people speak of the “collapse” of the Indus Valley Civilization,
they are referring to this period. The traditional view essentially construes
this period as one in which small groups of pastoralist tribes from the steppes
were able to enter into northwestern India and completely replace the
indigenous languages that had been developing there for millennia through a
process of linguistic conversion. For all of the reasons discussed above,
however, we now have strong reasons to deem this assumption implausible.248
I therefore believe this period is better understood as one in which the great
majority of people in northwestern India (whether they engaged primarily in
agriculturalist or pastoralist forms of subsistence) already spoke closely
related Proto-Indo-Iranian dialects. This entire archaeological record would,
in other words, appear to reflect a gradual decline in the social complexity of
the region, which was brought on primarily by changes in monsoon patterns
along with disruptions in the productivity of the Sarasvati and certain
concomitant developments toward regionalization—but with no major
linguistic shifts.
Of course, given the long history of incursions into northwestern India
by groups entering from the northwest,249 this period would have also
plausibly included some intrusions of new groups into the Indian
subcontinent. The collapse of the Indus Valley Civilization would have also

                                                                                                               
244
Liviu Giosan et al., Fluvial Landscapes of the Harappan Civilization, PROCEEDINGS
OF THE NATIONAL ACADEMY OF SCIENCES, EARLY EDITION, available at
www.pnas.org/cgi/doi/10.1073/pnas.1112743109.
245
Id.
246
Id.
247
Id.
248
See Section 3, supra (discussing empirical evidence of the linguistic persistence of
major language families in major riverine settings, once they have reached equilibrium in
accordance with the riverine-agricultural model of linguistic expansion).
249
J.M. ROBERTS, supra note 117, at 429.

left a power vacuum in the region, which would have plausibly allowed
certain pastoralist segments of northern Indian society to begin to exert
increased political power over their heavily weakened agriculturalist cousins.
It is, however, much harder—in my view—to tell whether the specific
pastoralist groups who were able to exert this heightened political authority
would have come primarily from the steppes or from closer regions like the
Hindu Kush, the Shivalik mountain ranges, Bactria, or the Indus Valley itself.
Any such groups would have been in long and protracted contact with a range
of other groups in and around Central Asia, Bactria and the Indus Valley, and
so we should expect that any of them would exhibit a range of distinctive
linguistic and cultural influences from these regions. It is, however, notable
that no one has been able to trace any archaeological developments that
reflect the clear movements of a single group from the steppes into
northwestern India during this period, and that the Vedas—which would have
been composed just as these incursions were taking place on the traditional
view—make no reference to an external homeland.250 Hence, it is probably
most reasonable to think that the bulk of any pastoralist groups who were able
to assume increased political power during this period would have come from
more local regions than the steppes, with—perhaps—some admixture.
In any event, it is these pastoralist groups whose rituals and languages
are often said to be reflected in the earliest Vedic texts.251 As is well known,
these texts are in a very archaic language, which was preserved verbatim for
many generations before these texts were written down, due in large part to
their religious and liturgical significance. 252 In my view, this fact has,
however, often led to an unfortunate tendency to rely far too heavily on these
texts to furnish a complete description of Indo-Aryan speaking groups at the
time. We should therefore remember that we have no reason to think that the
Vedas were the only oral traditions that were being transmitted at the time.
The Indus Valley Civilization exhibited quite a high degree of social
complexity and division of labor,253 and so it is more than likely that a range of
more specialized crafts and skills (or professions along with their systems of
informal norms) were also passed down in some form through various
informal trade groups—much like the “sreni” that show up from the earliest
points of recorded history in India and have served as merchant guilds or
proto-corporations. 254 In historical times, these traditions have proven
                                                                                                               
250
See BRYANT, supra note 74, at 197-223.
251
Witzel, supra note 78, at 175-76; see also THE ENCYCLOPEDIA OF INDO-EUROPEAN
CULTURE, supra note 122, at 309-310 (“Vedic literature also makes it clear that we are dealing
with a largely pastoralist society which employed the horse and chariot; it makes no mention of
towns yet is geographically located where urbanism previously existed.”).
252
Witzel, supra note 78, at 90 (“The language of the RV is an archaic form of Indo-
European. Its 1,028 hymns are addressed to the gods and most of them are used in ritual. They
were orally composed and strictly preserved by exact repetition through rote learning, until
today. It must be underlined that the Vedic texts are ‘tape recordings’ of this archaic period.
Not one word, not a syllable, not even a tonal accent were allowed to be changed.”).
253
THE OXFORD COMPANION TO ARCHAEOLOGY, supra note 183, at 348 (entry on Indus
Valley Civilization, noting that “there is abundant evidence for social stratification and craft
and career specialization”).
254
See Vikramaditya S. Khanna, The Economic History of the Corporate Form in Ancient
India (Working Paper, manuscript available at
http://www.law.yale.edu/documents/pdf/cbl/khanna_ancient_india_informal.pdf) (“The

incredibly persistent and have remained intact and evolved in their own terms
through many changes in political regimes—as Vik Khanna has recently
observed.255 We have no reason to think things would have been any different
during prehistory.
It has sometimes been suggested that the Vedas reflect a more primitive
and less reflective sacrificial religious system than is found in the later
Upanishads.256 It is, however, very plausible that the rituals described in the
Vedas were part of a broader culture, which would have sometimes invited
discussion and reflection on the deeper meanings of the rituals—very much in
the vein of the Upanishads.257 Otherwise it would have been difficult for the
Upanishads to construct such elaborate philosophical discussions on their
basis. Like almost all groups in the larger ethnographic record, the early
groups in northern India would have also almost certainly had a number of
more traditional stories, including cosmological ones and legends about local
heroes, which they would have been passing down orally. We should not
expect to have any surviving versions of these other traditions in language
that is as archaic as the Vedas, however, because there would have been no
reason to preserve these other traditions in the same pristine liturgical form.
Given the expansive geographic regions in which early Indo-Iranian
dialects would have been spoken on the present view, we should also expect
that Indo-Iranian speaking groups would have exhibited a much broader range
of religious practices and customs than are reflected in the Vedic texts—even
during the Vedic period. It would, after all, be an obvious error to infer from
the fact that the Vedic texts provide us with the earliest written attestation of
Indo-Aryan languages that these texts therefore describe the material culture
and social or religious customs of all Indo-Aryan speakers of their era. These
texts were composed in a fairly localized region in the Punjab, and by a specific
group of Indo-Aryan speakers in a specific period of time. Hence, it is
unlikely, on the present theory, that these texts would have reflected the
customs or material cultures of all Indo-Aryan speaking groups throughout
the steppes and in the larger Eastern-Iran-Bactria-Indus-Valley region. Many
later developments in India—such as the seeming re-emergence of Shaivism
as well as some religious traditions like Buddhism and Jainism that do not
treat the Vedas as authoritative—might therefore be understood as reflecting

                                                                                                                                                                                                                                                                       
corporate form (e.g., the sreni) was being used in India from at least 800 B.C., and perhaps even
earlier, and was in more or less continuous use since then until the advent of the Islamic
invasions around 1000 A.D.”) (discussing the sreni and their functional relations to early
corporate forms found in both ancient Rome and Medieval Europe—which relations “urge us
toward a significant revision of the history and development of the corporate form”).
255
Id. at 51-52 (observing that the number, size and complexity of the sreni throughout
most of Indian history has been responsive to functional economic concerns and that these sreni
often evolved fairly autonomously under a variety of different governmental and political
structures).
256
See, e.g., ROBERT BELLAH, RELIGION IN HUMAN EVOLUTION: FROM THE PALEOLITHIC TO
THE AXIAL AGE 508 (“Thus, Vedic thought at the level of the Brahmanas remained archaic in
the terms of the typology of this book. There were, as in other archaic societies, forms of
mythospeculation that verged on axial insights but still remained archaic.”); id. at 509 (“[T]he
Upanisads represent the emergence of an axial breakthrough, or something very like it,” in
Indian religious thought.).
257
Id. at 509-27.

the fact that Indo-Aryan culture was always broader than Vedic culture (even
within the Indian subcontinent) and never simply derived from it.258
The movement of northern Indian civilization eastwards along the
Gangetic plain would have then begun to create an emerging division between
the Indo-Aryan and Iranian sub-branches of the Indo-European language
family, with the Indo-Aryan branch located to the east of the Indus River and the
Iranian branch located to the west, just as we see today.259 The movements
of these Indo-Aryan groups would have also placed them into increased
contact with certain Proto-Munda speaking groups, who would have
originally been concentrated along the lower Ganges. Subsequent
developments would have then placed Indo-Aryan groups into increased
contact with various Dravidian groups in the South. These developments
would have thus begun to leave a distinctive set of linguistic, cultural, and
genetic influences on the single Indo-Aryan branch of the Indo-European
family.260
This period of admixture—which began around 1500 BC and still
continues today—would have then helped to create the complex demographic
structure of modern day India. These events are pictured in Figure 11 below.

                                                                                                               
258
RICHARD KING, INDIAN PHILOSOPHY: AN INTRODUCTION TO BUDDHIST AND HINDU
THOUGHT 43 (1999) (“Sometimes an appeal is made to the classification of schools [of Indian
philosophy or religion] into āstika (affirmer) and nāstika (non-affirmer) traditions as evidence
of an indigenous Hindu notion of ‘orthodoxy.’ This mode of classification is usually taken to
imply that acceptance of the Vedas as revelatory knowledge (śruti) provides the grounds for
distinguishing orthodox Hindu schools from their ‘heterodox’ rivals—the Cārvākas, Buddhists
and Jainas.”). Although both the Rig Veda and Yajur Veda refer to Rudra, who is often
associated with a precursor to Shiva, Shiva is first clearly referred to in the Shvetasvatara
Upanishad. See ROSHEN DALAL, HINDUISM: AN ALPHABETICAL GUIDE 371 (2011) (entry on
Shaivism). At the same time, however, Shiva is often worshipped in the form of the Shiva
Lingam, which is a phallic symbol encompassed by a ring, and the “worship and magical use of
sex symbols can be traced to the period of pre-Harappa and Harappa cultures, if cultic
significance can be attributed to conical objects and ring stones of these cultures. . . . A large
number of phalli and ring stones in various sizes have been found at Mohenjo-daro and
Harappa”—thus suggesting that some elements of Shaivism may have predated the second
millennium BC on the Indian subcontinent. An Encyclopedia of Indian Archaeology vol. 1, at
276 (A. Ghosh ed., 1989).
259
ETHNOLOGUE, supra note 12, at 365-407, 815.
260
For a good discussion of some of the distinctive linguistic and cultural influences that
are found within the Indo-Aryan branch of the Indo-European language family, see Michael
Witzel, Central Asian Roots and Acculturation in South Asia: Linguistic and Archaeological
Evidence from Western Central Asia, the Hindukush and Northwestern South Asia for Early
Indo-Aryan Languages and Religion, in LINGUISTICS, ARCHAEOLOGY AND THE HUMAN PAST
(ED. OSADA TOSHIKI 2005). Recent genetic studies have also indicated that Indian populations
seem to reflect an ancient admixture event between two groups, which have been labeled
“Ancestral North Indians,” and which seem to be closely related to people in Central Asia and
Europe, and “Ancestral South Indians,” who are a distinctive genetic population who are not
closely related to any other major populations outside of the Indian subcontinent. See David
Reich et al., Reconstructing Indian Population History, 461 NATURE 489-94 (Sept. 24, 2009).
There appears to be some level of admixture throughout India, and there are very few traces of
Ancestral South Indian gene flow to Indo-European groups outside of the Indian subcontinent,
id., thus suggesting that Indo-Aryan groups have some distinctive genetic admixture.

d. The Historical Age

The final period is the historical age, which began at different times in
different parts of the world, but which also witnessed the emergence of a
number of distinctive traditions of mega-empires among Indo-European
groups in different parts of Eurasia. These traditions emerged first in ancient
Persia; then in ancient Greece; then in ancient India; then in ancient Rome;
and then—much later—in northwestern Europe and, finally, Russia. 261
Whereas the traditional story would explain each of these developments as the
result of an independent transition from more nomadic and tribal forms of life,
which were rooted in a fairly primitive form of pastoralism, the
present theory suggests that all of them more likely had a much deeper and
more complex social prehistory relevant to the emergence and stability of
Indo-European forms of social complexity.

e. Explaining the Larger Body of Evidence Relevant to Indo-European Prehistory

The foregoing story embeds the claim that the Indus Valley Civilization
spoke dialects of Proto-Indo-European within a larger narrative about Indo-
European prehistory that would explain (or at least render coherent) an
extraordinarily broad range of evidence relevant to this larger topic. Because

                                                                                                               
261
For a description of these six different traditions, see Kar, supra note 203, at Fig. 22
and accompanying text.

I have argued for this claim in more detail elsewhere, I will simply list some
of the relevant bodies of evidence here in order to illustrate the point. In most
cases, my reasons for claiming that the present story would explain a
particular body of evidence should be clear enough from prior discussions,
but readers interested in seeing more detailed developments of these
arguments should consult my larger body of work on this topic.262
The larger body of evidence that the present theory would explain
includes: (1) the most plausible phylogenetic structure of the Indo-European
language family itself as depicted in Figure 7, including (1-a) its well-known
branching structure and (1-b) the identification of a plausible “main stalk” in
the archaeological record that could have generated these many branches; (2)
the well-known centum-satem division (including the diverse geographic
locations of the centum and satem groups); (3) the larger set of dialectal
groupings that Gamkrelidze and Ivanov have identified within the Indo-
European language family;263 (4) the most credible and somewhat conflicting
findings of linguistic paleontology;264 and (5) the far eastern location of the
Tocharian branch.
The present theory would also explain and can thus find support in: (6)
all of David Anthony’s extensive archaeological evidence concerning Indo-
European prehistory;265 (7) all of Joanna Nichols’s extensive linguistic evidence
                                                                                                               
262
I have developed all of these ideas at much greater length in Robin Bradley Kar, On the
Early Eastern Origins of Western Law and Western Civilization: New Arguments for a
Changed Understanding of Our Earliest Legal and Cultural Origins, University of Illinois Law
Review vol. 2012 (forthcoming 2012), available at http://ssrn.com/abstract=2039500 (Part 1),
http://ssrn.com/abstract=2039502 (Part 2), http://ssrn.com/abstract=2039504 (Part 3). This
work also collects more detailed citations to all of the bodies of evidence that are described in
this section.
263
Gamkrelidze and Ivanov have found evidence to suggest that Proto-Indo-European
originally had two basic dialectal groupings, which were in close contact. [CITE]. Anatolian
then split off from this larger grouping very early, just as on the current story. They then find a
Tocharian-Celtic-Italic grouping, which later separated into two groupings, with Celtic-Italic in
one, and Tocharian in the other. On the present view, the initial Celtic-Italic-Tocharian
grouping would have occurred when Celtic-Italic groups first spread through the steppes and
reconnected with the Tocharians. It would have then split when the Celtic-Italic groups began
to move up the Danube and into the Balkans, and the Tocharians moved further to the east.
The remaining dialects (which include the precursors of Aryan, Greek, Armenian, Baltic,
Slavic, and Germanic) would have thus been left in contact within the larger regions that
connected the steppes to the Eastern-Iran-Bactria-Indus Valley region. For a time, there is
evidence that Indo-Aryan dialects formed an intermediary between the Germanic-Baltic-Slavic
groups and the Greek-Armenian ones, which is exactly what the current story would predict.
But the Balto-Slavic-Germanic group would ultimately be split from the Aryan-Greek-
Armenian group once Altaic groups began to spread through the steppes.
264
Linguistic paleontology is not highly credited by experts these days. The present story
suggests that, in trying to reconstruct Proto-Indo-European terms from the existing Indo-
European branches, linguists are likely to reconstruct the geography and worldview not of the
Proto-Indo-Europeans in general but rather of those segments of Proto-Indo-European society
that were most heavily involved with the spread of this language family to many new regions—
which, we should expect, would have often been pastoralist groups in the steppes. But a
number of scholars have, in fact, pointed to findings of linguistic paleontology to suggest a
much broader range of locations and cultural patterns than this—which is precisely what the
present theory would predict.
265
I remind the reader that I have argued that Anthony’s evidence is consistent with the
claim that Proto-Indo-European dialects spread into the steppes beginning in around 3300 BC
from the Eastern-Iran-Bactria-Indus-Valley region. Once this fact has been recognized, all of
his extensive archaeological evidence—including his evidence that certain steppe groups like

favoring an early Bactrian homeland for Proto-Indo-European, which suggests
that (7-a) Proto-Indo-European absorbed a number of early Semitic and
Sumerian loanwords through an intermediary but not directly and that (7-b)
Proto-Indo-European could not have plausibly emerged from the steppes or
the Ukraine but (7-c) does not rule out the possibility that Proto-Indo-
European dialects originally expanded throughout the larger Eastern-Iran-
Bactria-Indus-Valley region and would, in fact, (7-d) be augmented by the
present theory because the present theory would help explain why Bactria
would have played such an important and consistent role in spreading Indo-
European languages through the steppes (i.e., because ancient Bactria would
have been part of a much larger and more powerful expanding socio-cultural
complex centered in the Indus Valley).
The present theory would also explain: (8) the existence of a
reconstructible Proto-Indo-European term for “horse” even though (8-a) the
horse was apparently unimportant to the Indus Valley Civilization and (8-b)
horses are not indigenous to the Indian subcontinent and show up only rarely
if at all in the early archaeological record prior to 1500 BC; (9) the
descriptions of the material and social cultures found in the Vedic texts,
including (9-a) the pastoralist proclivities of these groups, (9-b) their
predominantly tribal social structure and (9-c) the increased importance they
attached to the horse as pastoralist segments of a specific Indo-Aryan society
in the middle of the 2d millennium BC. The present theory would similarly
explain and can thus find further support in: (10) the larger archaeological
record of agricultural production and social complexity from the Indus Valley,
Bactria, and related places like Mehrgarh; (11) our best contemporary
understanding of the hydronomy of the Indus Valley, including the centrality
of a particular monsoon-based river for this Civilization; (12) our best
contemporary understanding of the Vedic “Sarasvati,” as clarified by Ashok
Aklujkar’s recent work in this volume; (13) the incredibly pervasive Indo-
Aryan hydronyms and toponyms that are found in northwestern India; and
(14) an important part of Michael Witzel’s linguistic evidence, which suggests
the early Sanskrit texts exhibit earlier influences from a language that
resembled Proto-Munda than from Dravidian languages.
The present theory would also explain and can thus find some additional
support in: (15) our best understanding of the famous “Harappan” seals,
which (15-a) have not yet been translated in a definitive manner and (15-b)
may not contain a full-blown language at all but for which (15-c) plausible
and internally consistent Indo-Aryan translations have been devised
(including one prominent one by S.R. Rao himself) and which (15-d) show
certain statistical patterns, along with a seeming base ten numbering system,
that increase the plausibility of their reflecting an Indo-European language.
The present theory is also fully consistent with and would explain: (16)
recent genetic evidence, which suggests that a major admixture event took
place between so-called Ancestral North Indians and Ancestral South Indians
and may have begun around 1500 BC; (17) genetic evidence that suggests that
there were nevertheless no major population replacements or genetic
intrusions in northern India at or around 1500 BC (or, indeed, for the entire
                                                                                                                                                                                                                                                                       
the Andronovo or at Sintashta may have spoken Indo-Iranian dialects—is supportive of the
current theory.

time beginning in around 4500 BC until the period when the historical record
begins around 800 BC); and (18) evidence that suggests that the Indian
populations east of the Indus exhibit a number of distinctive linguistic,
cultural, and genetic influences that differentiate them from most of the other
Indo-European branches.
But perhaps most importantly: the current theory is also uniquely
consistent with all of the new arguments developed in this article, including:
(19) the predictions of the riverine-agricultural model of linguistic expansion
and (20) the extensive empirical evidence concerning the persistence of the
major linguistic phenomena to arise in accordance with the riverine-
agricultural model of linguistic expansion within these riverine locations.
Finally, the current theory can also help to explain (21) many of the larger
patterns of Indo-European social complexity that we see in the historical
record.
Depending on how one elaborates the details of an Indo-Aryan invasion
or migration theory, this more traditional theory will, by contrast, have a hard
time explaining at least the following phenomena from the above list: (1-a)
the branching structure of the Indo-European family tree, (1-b) the need to
identify a plausible main stalk in the archaeological record that could have
produced so many important branches, (2) the centum-satem division, (5) the
far eastern location of the Tocharians, (10) the increasing evidence of continuity
in developments from the Indus Valley Civilization to subsequent developments
in northern India, (11-12) the apparent centrality of the Vedic Sarasvati to the
Indus Valley Civilization, (17) the evidence that no major population
replacements occurred in northern India in the 2d millennium BC, and (21)
the larger patterns of Indo-European social complexity that show up in the
historical record.
The traditional view is also literally inconsistent with all of the new
evidence presented in this article, including (19) the predictions of the
riverine-agricultural model of linguistic expansion and (20) the extensive
empirical evidence concerning the persistence of major linguistic phenomena
that have reached equilibrium in accordance with the riverine-agricultural
model of linguistic expansion around a major riverine location. Further doubt
is cast upon most versions of the Indo-Aryan invasion or migration theory by
(7) Joanna Nichols’s linguistic evidence, which suggests that a steppe
homeland for Proto-Indo-European is implausible and by (13) the pervasive
Indo-Aryan hydronyms and toponyms in the northwestern portions of the
Indian subcontinent. When placed in combination with all of the rest of this
evidence, some of the Indo-European translations of the Harappan seals that
have been offered by people like S.R. Rao—and were mentioned in item
(15)—should also be viewed as having increased plausibility.
For all of these reasons, I believe this new story is ultimately better
capable of explaining (or at least rendering coherent) an extraordinarily
extensive body of evidence that is relevant to broader reconstructions of Indo-
European prehistory. These facts should address any threshold worries that a
Proto-Indo-European speaking Indus Valley Civilization would make these
broader reconstructions impossible. This larger body of evidence can,
moreover, now be understood as lending an additional and quite extensive

form of evidentiary support to this article’s main linguistic claim—at least
once the claim is embedded within the right narrative.

5. A Brief Response to Michael Witzel

With this new proposal in hand, I now want to turn to some of Michael
Witzel’s recent work and provide a direct response to his arguments. Michael
Witzel is one of the most thoughtful and vigorous modern proponents of an
Indo-Aryan invasion or migration theory, and he has unearthed a number of
important bodies of linguistic evidence that speak to these issues. His work
has proven enormously influential, and it is particularly helpful for present
purposes because Witzel believes he has found decisive evidence that can not
only establish an Indo-Aryan invasion or migration theory but will also allow
us to locate the more specific movements of the pastoralist groups who (in his
view) first brought Indo-European languages and cultural traditions to the
Indian subcontinent. Many of Witzel’s most important arguments and
evidence are collected in Central Asian Roots and Acculturation in South
Asia: Linguistic and Archaeological Evidence from Western Central Asia, the
Hindukush and Northwestern South Asia for Early Indo-Aryan Languages
and Religion,266 and I will therefore concentrate on his arguments in this
piece.
In my view, the linguistic evidence that Witzel has produced is
incredibly helpful for reconstructing Indo-European prehistory, but, for
reasons I will make clear, this evidence ultimately underdetermines the choice
between his Indo-Aryan invasion or migration theory and the new theory
developed here. By this, I mean that his evidence does not uniquely support
his current theory and could be explained equally well or better by this new
theory. Once our full theoretical options have been clarified, Witzel’s
evidence is, moreover, often better construed as favoring the present theory.
At the end of the day, these competing theories will also need to be
assessed in terms of their overall capacity to explain (or at least render
coherent) not just Witzel’s evidence but also the broadest range of evidence
that is relevant to these underlying topics. This larger body includes all of the
new considerations discussed in this article as well as all of the bodies of
evidence described in the last section. Once Witzel’s evidence has been
combined in this way, we will see that we now have compelling reasons to
reject the traditional view and reorient our basic understanding of Indo-
European prehistory. Or so—at least—I will be arguing.
Let me begin by introducing Witzel’s story itself, which largely tracks
the dominant view in most scholarly circles. Witzel essentially believes that
Proto-Indo-European dialects were widely dispersed through the western
steppe regions by about 3300 BC; that some of these groups then began to
branch off to form the Celtic-Italic, Germanic, Greco-Armenian and Balto-
Slavic groups from these regions; that early Proto-Indo-Iranian groups then
                                                                                                               
266
Michael Witzel, Central Asian Roots and Acculturation in South Asia: Linguistic and
Archaeological Evidence from Western Central Asia, the Hindukush and Northwestern South
Asia for Early Indo-Aryan Languages and Religion, in LINGUISTICS, ARCHAEOLOGY AND THE
HUMAN PAST (ED. OSADA TOSHIKI 2005).

moved eastwards through the steppes where they came into contact with some
Uralic and Yeniseian speaking groups (who lived just north of these steppe
regions); that these Proto-Indo-Iranian groups very plausibly (though not
necessarily) included certain well-known groups from the steppes, like the
Sintashta sites and the Andronovo cultures; that these Proto-Indo-Iranian
groups then moved southward into Bactria after the demise of the BMAC
Civilization; that these Proto-Indo-Iranian groups picked up a number of
distinctive Central Asian linguistic and cultural influences at this time but also
completely replaced the BMAC Civilization’s languages; that some of these
groups subsequently began to move into northwestern India beginning in
around 1500/1200 BC (while others began to move westwards through the
Central Iranian plateaus); that the groups who entered into the Indian
subcontinent absorbed certain other distinctive influences from groups in the
Hindu Kush and from the remnants of the Indus Valley Civilization; and that
these new pastoralist groups were able to subjugate the indigenous people of
northwestern India and completely replace their languages through conversion
using a “newly imported elite kit (Ehret) of Vedic language, ritual, poetry,
horse breeding and pastoralism.”267
If this story were true, then it would obviously contradict the main
claims of this article, and so I want to take a closer look at the evidence that
Witzel cites for these claims. As we will see, Witzel bases many of his claims
on an important body of evidence that he has unearthed in the form of loan
words and other cultural and religious influences in the early Vedic and
Avestan texts. At one point, Witzel describes the basic structure of his
argument as follows: “[t]hese loan words and their inherent concepts, as well
as the earlier ones from the Ural area, the steppes and the high mountains of
Central Asia . . . provide decisive information about the track of the speakers
of Indo-Iranian and pre-OIA before they entered the mountains of the
Hindukush and descended into the plains of the Indian subcontinent.”268 So
let us take a closer look at the evidence that Witzel takes to provide decisive
information for his view and ask whether this evidence really favors his
theory over the newer one developed here.

a. Step One: Proto-Indo-European Groups in the Steppes in 3300 BC

We should begin by bracketing any and all evidence that Witzel might
be relying on for his belief that certain Proto-Indo-European groups, who
ultimately gave rise to the Celtic-Italic, Germanic, and Balto-Slavic groups,
were widely distributed through the western steppes by about 3300 BC. We
should bracket this evidence because the current theory agrees with Witzel on
this claim, and so evidence of this kind cannot distinguish between these two
theories.
Of course, the current theory also proposes that Proto-Indo-European
dialects were distributed much more widely throughout the Eastern-Iran-
Indus-Valley region prior to 3300 BC. With regard to these other regions, we
should therefore notice that Witzel takes as his starting point that “little to
                                                                                                               
267
Id. at 172.
268
Id. at 88 (emphasis added).

nothing is known about the language(s) spoken in the areas east of
Mesopotamia (Hurrite, Akkadian, Sumerian, Elamite), and those west of the
Indus area.”269 He adds that: “The language(s) of the Indus civilization also
are by and large unknown, that is if we neglect the materials that can be
distilled from the materials contained in the earliest texts in Indo-Aryan, the
Vedas, but which have unfortunately been overlooked for that purpose.”270
It is important to understand what Witzel is saying here. When he says
that “little to nothing is known” about these regions, he means that we have
no direct and uncontroversial translational evidence of the languages spoken
in these regions prior to the historical period when Indo-Iranian languages
show up as dominant. Nor do we have any recorded descriptions of these
languages. Witzel’s proposal is thus to try to reconstruct some of these
languages by looking for records of linguistic contacts with other groups that
can be found in the form of loan words or other influences in the earliest
Vedic and Avestan texts. It follows that Witzel’s only reasons for dismissing
the possibility that other people might have been speaking Proto-Indo-
European languages throughout a broader region must be based on these same
linguistic considerations and readings of the early Vedas.
These linguistic considerations will be discussed in more detail below.
At this stage, we need only recognize that nothing about the first class of
evidence under discussion (i.e., the class of evidence suggesting that Proto-
Indo-European languages were most likely spoken in the steppes west of the
Altai mountains by about 3300 BC) can rule out the possibility that Proto-
Indo-European languages were spoken throughout the Eastern-Iran-Bactria-
Indus-Valley region prior to this time as well. Hence, nothing about this first
class of evidence can favor Witzel’s theory over the new one developed here.

b. Step Two: Proto-Indo-Iranian Groups in the Steppes North of Bactria

Witzel’s second claim is that we have decisive linguistic evidence to
place the very first Proto-Indo-Iranian speaking groups in the steppe regions
somewhere north of Bactria and south of the taiga forest belt. Witzel’s
evidence for this claim is, however, notably limited to the existence of certain
early Proto-Indo-European loan words that have been found in both the Uralic
and Yeneseian languages.271 These language families inhabited the regions
just north of Witzel’s posited Proto-Indo-Iranian homeland, and Witzel infers
from these loan words that “the [Indo-Iranian] languages must have come
from the northern steppe areas as the early (Proto-IIr.) loans into Proto-Uralic
(asura, Koivulehto 2001: 247) and Yeneseian (art’a) clearly indicate.”272
This “must” is, however, methodologically unsound because all that
evidence of this kind can actually establish is that early Proto-Indo-Iranian
languages were spoken in the northern steppes from a very early time. This is

                                                                                                               
269
Id.
270
Id.
271
Witzel also produces some evidence that the terms the Uralic and Yeneseian groups
used to refer to Indo-Iranian speakers suggest that those speakers were to the south of the
Uralic and Yeneseian groups, see id. at 163, but this evidence will not distinguish between our
two competing theories.
272
Id. at 166.

something that both Witzel and the current theory agree upon, however, and
so this evidence cannot ultimately distinguish between these two theories. In
order to support his theory over the current one, Witzel would need to cite
some additional evidence to establish that Proto-Indo-Iranian dialects were
not spoken in other regions like Bactria and the Indus Valley before this time.
Has Witzel provided any such evidence?
None of the evidence discussed thus far can help Witzel at this stage of
the argument. There is also another argument that Witzel cannot legitimately
make at this point. In speculating about the time of the first purported Proto-
Indo-Iranian arrivals in Bactria, Witzel suggests, at one point, that these
developments most likely took place after the demise of the BMAC
Civilization and perhaps around 1500 BC. His argument for this proposal
relies heavily on the archaeological record of the BMAC Civilization. At one
particularly critical point, he argues as follows: “Most notable is the absence,
so far, of horse remains, horse furniture, chariots (invented around 2000 BCE)
and clear depictions of horses in stratified BMAC layers.”273 Witzel then
suggests that: “One can hardly imagine the IIr.s [Indo-Iranians] without their
favorite prestige animal, the horse.”274 His conclusion is that Indo-Iranian
speaking groups most likely entered Bactria after the demise of the BMAC
Civilization.
It is, however, not at all hard to imagine Indo-Iranian speaking groups
who did not prize the horse: one need only imagine that the BMAC
Civilization and the Indus Valley Civilization were also Indo-Iranian speaking
groups, as the current theory suggests. What Witzel has essentially done here
is reason from the fact that the ancient Vedic and Avestan texts describe a
particular material culture, which valued the horse as a prestige animal, to the
conclusion that certain other groups, who do not display that same material
culture or those values, are not likely to have spoken Indo-Iranian languages. As
noted before, however, this type of inference is invalid, and so it cannot be
used to establish that the BMAC Civilization spoke non-Indo-European
languages.
Nor can the evidence of Proto-Indo-Iranian loan words in Uralic or
Yeneseian establish this proposition. To the contrary, there is an important
sense in which this particular class of loan words actually favors the current
theory over Witzel’s. If Witzel’s interpretation of this class of evidence were
correct, then the early period of contact that generated the loans from early
Proto-Indo-Iranian into Uralic and Yeneseian should have most plausibly left
reciprocal linguistic effects on the Proto-Indo-Iranian groups who later
descended into Bactria and northern India. No such reciprocal effects have
been found, however, and so the present theory can ultimately offer a better
explanation of this larger body of evidence. On the current theory, certain
early Proto-Indo-Iranian groups in the steppes would have most plausibly
come into contact with Uralic and Yeneseian groups at a time when Proto-
Indo-Iranian languages were widely distributed not only in the steppes but
also throughout the Eastern-Iran-Bactria-Indus-Valley region. These early
contacts between Proto-Indo-Iranian pastoralist groups in the steppes and a
number of neighboring Uralic and Yeneseian groups would have therefore
                                                                                                               
273
Id. at 168.
274
Id. at 168.

begun to create certain reciprocal linguistic effects, which we still see in
Uralic and Yeneseian. None of these effects would have extended to the more
settled Proto-Indo-Iranian groups to the south, however, because the Uralic
and Yeneseian groups were not in direct contact with these more southern
Indo-Iranian speaking groups. On the current view, these more settled Proto-
Indo-Iranian speaking groups are, moreover, the ones who spoke languages
directly ancestral to most of the surviving Indo-Iranian languages, and we
should therefore expect to find loans from Proto-Indo-Iranian into Uralic and
Yeneseian but not the other way around.
In sum, this second class of linguistic evidence does not in fact establish
that “the [Indo-Iranian] languages must have come from the northern steppe
areas,”275 as Witzel has suggested, and instead slightly favors the current
theory.

c. Step Three: Proto-Indo-Iranian into Bactria

Witzel’s third claim is that we have decisive reasons to think that Proto-
Indo-Iranian groups subsequently moved from the steppes into Bactria. In
order to support this claim, Witzel engages in a comprehensive set of
linguistic inquiries, which reveal a series of loan words in both the Rig Veda
and the Avesta that appear both to be non-Indo-European and to have a
plausible Central Asian source. Witzel attributes these influences to an
extinct language spoken by the BMAC Civilization, and suggests that these
influences therefore evidence the passage of Indo-Iranian groups from the
steppes through Bactria on their way into northwestern India. Witzel adds to
this the important observation that certain religious developments, which most
likely arose in or around Bactria (and that he can sometimes discern in the
archaeological record from the BMAC sites themselves), appear to have
influenced some of the religious and mythological views that are expressed in
the Rig Veda and the Avesta in important ways.
This portion of Witzel’s argument is of particular importance to us
because Witzel argues that many of these Central Asian linguistic and cultural
influences are not found in any of the other Indo-European branches—with
the one unsurprising exception of Tocharian. He believes that this third body
of evidence therefore rules out Nichols’s proposal of a Bactrian homeland for
Proto-Indo-European, and—if he were right about this—then this fact would
have obvious implications for the current theory as well.276 Here is how
Witzel puts the point: “That Bactria/Sogdia could be the locus of PIE
therefore is at least very doubtful, if not simply impossible. If the localization
were indeed correct, all IE languages should have received the same ‘BMAC’
substrate words that are typical for Old Iranian and Old Indo-Aryan.”277
There is, however, a problem with this part of Witzel’s argument: it
ignores highly plausible but competing explanations of this very same body of
evidence. Witzel is relying here on evidence from linguistic substrates in the
ancient Vedic and Avestan texts, which—as we know—were composed by
closely related groups in the Punjab and Bactria, respectively. During the
                                                                                                               
275
Id. at 166.
276
Id. at 109.
277
Id. at 109.

second millennium BC (when Witzel believes the earliest of these texts were
composed), there would have plausibly been a number of smaller groups in
neighboring places like Central Asia and the Hindu Kush who would have
therefore left some distinctive linguistic and cultural influences on these
particular Indo-Iranian groups. The Avestan texts appear to have been
composed in Bactria, moreover, and so some influences from any number of
other contemporaneous groups from near Bactria and other nearby parts of
Central Asia would have been nearly unavoidable. It should therefore be
unsurprising, on the new theory developed here, that Witzel has found
evidence of Central Asian influences in these texts.
In addition, some of these influences would have arisen after 3300 BC,
and hence after the western branches of Indo-European had begun to spread
westward into the steppes on the current view. Hence, contrary to Witzel’s
assertion, we should actually expect that a number of late (i.e., post-3300 BC)
Central Asian loan words and influences would be missing from these western
branches.
Witzel is right, on the other hand, that some earlier Central Asian
influences should have plausibly had a wider impact on a broader array of
Proto-Indo-European dialects on views like the current one. Witzel does not,
however, claim that there are no such influences. He only claims to have
identified some distinctive Central Asian influences in Old Iranian and Old
Indo-Aryan that are not found in the other branches, and this claim is
perfectly compatible with the current theory.
In addition, there are some specific classes of early Central Asian loan
words that should have predictably atrophied in the western branches once
these branches separated from Bactria. Some prime examples would include
any terms related to local (Central Asian) flora and fauna and—because of the
initially pastoralist and nomadic forms of life of the early western branches—
many terms relating to agriculture and settled village life. It is therefore
noteworthy that almost all of the distinctive Central Asian loans that Witzel
has found in the Vedic and Avestan texts fall into one of these categories.278
The remaining ones relate primarily to certain specific religious practices,
which plausibly developed sometime after 3300 BC and so sometime after the
western branches had already split from Bactria on the current view.279
Hence, this third body of evidence does not, in fact, rule out the possibility
that Proto-Indo-European dialects were spoken throughout the larger Eastern-
Iran-Bactria-Indus-Valley region for a very long time prior to 1500 BC, and is
instead perfectly compatible with the current theory.280
We can now go a step further. If the BMAC Civilization had spoken a
non-Indo-European language, then we should expect to find not only some
distinctive loan words in the early Vedic and Avestan texts but also some
surviving members of this language family in and around ancient Bactria.
                                                                                                               
278
Id. at 110-121, 124-129.
279
Id. at 121-124.
280
In construing his evidence to rule out Nichols’s Bactria homeland hypothesis, Witzel is
also relying on a single body of linguistic evidence, and is thus ignoring the extensive body of
contrary linguistic evidence that Nichols has produced in favor of her view. Questions like
these cannot, however, be decided in isolation, and—as I suggested in the last section—the
current story would provide a more harmonious and coherent explanation of a very large body
of evidence relevant to these topics.

This follows from the significant pocket criterion. Witzel himself can, in fact,
be understood as acknowledging a somewhat weaker form of the significant
pocket criterion when he argues that the BMAC Civilization must have
spoken the extinct Central Asian language that he has tried to reconstruct
from the ancient Vedic and Avestan texts. In making this argument, Witzel is
essentially acknowledging that, given the importance of the BMAC
Civilization in the archaeological record during the period from 2300 BC to
1700 BC, this civilization must have left an important linguistic trace on any
newly arriving Indo-Aryan groups during the middle of the second
millennium. He is then inferring from the fact that the extinct Central Asian
substrate that he has tried to reconstruct is the only plausible candidate for this
role that the BMAC Civilization must have spoken this extinct Central Asian
language.
Of course, the language that Witzel has reconstructed was not significant
enough to survive, and so the significant pocket criterion suggests that this
language was not plausibly the BMAC Civilization’s language. If, however,
the BMAC Civilization did not speak this proposed language, then Witzel’s
work also suggests that there is no other substrate in the early Vedic or
Avestan texts that can plausibly be attributed to the BMAC Civilization. It
would thus follow that the BMAC Civilization must have spoken dialects of
Proto-Indo-Iranian, and it is for this reason that I believe Witzel’s work can
ultimately be viewed as supportive of the present theory.
Even if one were to reject the significant pocket criterion, Witzel would,
moreover, still be right that the language of the BMAC Civilization should
have at least left a substantial trace in the early Vedic and Avestan texts. In
my view, Witzel has provided us with the very best reconstruction possible
for a candidate non-Indo-European language to associate with the BMAC
Civilization. We might therefore ask whether Witzel’s evidence is more
suggestive of influences from a single Central Asian language, which was
spoken by an important riverine civilization like the BMAC Civilization, or if
this evidence is better understood as reflecting a more diverse set of linguistic
influences from a broader range of neighboring groups.
In my view, a close look at Witzel’s evidence suggests that the latter
scenario is ultimately more plausible even if we were to reject the significant
pocket criterion. It is, for example, quite clear from Witzel’s own
descriptions that many of the early loans that he has found could have come
from a number of different linguistic groups who were located further to the
West or in other areas adjacent to Bactria.281 Of the remaining loans, it is not
at all clear that any great majority of them came from a single language
family (let alone a single language family from Central Asia), and, indeed,
many seem to have most plausibly derived from different languages.282 In
describing some of his own etymologies, Witzel also says: “Naturally, we are
still very much in the realm of speculation here, as the available data are still
very sketchy and come from a variety of quite different languages and

                                                                                                               
281
See, e.g., id. at 130 (attributing a great number of loans to a “Macro-Caucasian”
language family).
282
Witzel is able to find some words that look to most plausibly refer to items that one
would most likely find in Central Asia, but this does not establish that all of the loans he finds
came from Central Asia.

sources. It is also somewhat difficult to pin them down in time and place.”283
He similarly acknowledges that “[m]uch more research is needed . . . to turn
these proposals into something closer to certainty.”284 Evidence of this kind is
precisely what one would expect if Proto-Indo-Iranian languages were spoken
throughout the Eastern-Iran-Bactria-Indus-Valley region for a long time prior
to 1500 BC, but if the specific Indo-Iranian authors of the Vedic and
Avestan texts composed them at a time when they were in contact
with a number of smaller groups in Bactria and Central Asia who spoke
languages that are now extinct.
So let us step back for a moment to see what this all means. As noted
earlier, I believe that Witzel has, in effect, provided us with the very best
reconstruction possible for the identification of a non-Indo-European
language for the BMAC Civilization. If we accept the significant pocket
criterion, then the fact that this language does not exist in these regions any
longer cuts against its being the dominant language in and around the Oxus
River from 2300 to 1700 BC (which was the main period of BMAC
Civilization). Even if we were to drop the significant pocket criterion,
however, Witzel’s evidence would appear to be better explained by the
regular and predictable linguistic influences of a range of different groups
who lived near the authors of the Vedic and the Avestan texts at the time of
their composition. As noted above, this last fact is especially important
because Witzel’s work can also be read to suggest that there is no other non-
Indo-European language that can be plausibly attributed to the BMAC
Civilization. Hence, I believe that Witzel’s evidence can ultimately be
understood as adding some support to the current theory.
Before moving on, let me make one final point in support of this last
interpretation of Witzel’s evidence. By accepting it, we can avoid having to
make some of the more extraordinary assumptions that Witzel is sometimes
forced to make. For example, because Witzel believes that the Andronovo
groups in the steppes were Indo-Iranian speakers but the BMAC Civilization
was not, he is forced to suggest at one point that:

The incoming steppe people with Andronovo cultural traits
must have shed many of these characteristics in the Greater
BMAC area before moving on, as “not a single artifact of
Andronovo type has been identified in Iran or in northern
India,” all while keeping their IIr. [Indo-Iranian] Language—
and, somewhat differently from Mallory, also much of their
spiritual culture.285

Witzel is thus forced to propose a seemingly extraordinary process, whereby
the incoming Indo-Iranian speaking groups would have displayed a very
puzzling blend of large-scale linguistic retention (which would have resulted
in the wholesale linguistic conversion of all of the pre-existing groups in
Bactria with only minor identifiable substrate effects) along with a
                                                                                                               
283
Id. at 129-30.
284
See, e.g., id. at 130 (attributing a great number of loans to a “Macro-Caucasian”
language family).
285
Id. at 167.

simultaneous wholesale material culture loss (as these same incoming groups
completely dispensed with their own material cultures and absorbed those of
the indigenous Bactrian groups that they were converting). For all of the
reasons discussed thus far, I do not believe that assumptions like these are
very plausible, and I therefore count it as a point in favor of the current theory
that it would allow us to explain the existence of Indo-Iranian languages in
Bactria without having to make these assumptions.

d. Step Four: Indo-Aryans in Northwestern India

Let us turn, finally, to Witzel’s last claim, which is that he has found
decisive evidence to suggest that Indo-Iranian groups subsequently moved
from the BMAC region into Northwestern India around 1500/1200 BC, and
brought with them certain Indo-European languages and cultural traditions for
the very first time. Witzel admits that “[t]he exact fashion of their [purported]
arrival still is unsolved,” but suggests that this assumed process “will have
included . . . a combination of trickling in, migration into marginally used or
unused land (especially after the collapse of the Indus Civilization), and
outright invasion of lands settled by remnant Harappan populations and their
non-Harappan neighbors in the Indus area.”286 Although he cannot pinpoint
these events in the archaeological record, he believes that he has other
evidence to show that the Indo-Aryan groups first entered the northwestern
portions of the Indian subcontinent around 1500/1200 BC.
In arguing for this last set of movements, Witzel relies primarily on two
bodies of evidence. The first is linguistic evidence of certain loan words in
the earliest Vedic texts, which establish that the authors of these texts were in
contact with certain groups who spoke a language that resembles Proto-
Munda and that Witzel calls “Para-Munda”. Witzel associates this language
with the language of the Indus Valley Civilization, and he believes these
linguistic substrates provide evidence of contacts between the incoming Indo-
Iranian groups and the remnants of the Harappan populations. The second
body of evidence that Witzel relies upon consists of a range of distinctive
religious and cultural influences from Central Asia, Bactria and the Hindu
Kush that Witzel has found in the Vedic texts.287
My first response to these two bodies of evidence should be unsurprising
at this point. We need to remember that all that evidence of this kind can
ultimately establish is, first, that the authors of the Vedas—who clearly spoke
Indo-Aryan languages—were influenced by groups in nearby Central Asia,
Bactria and the Hindu Kush; and, second, that the authors of these texts had
also come into some contact with certain groups who spoke languages
resembling Proto-Munda. Because the current theory predicts influences of
this very same kind, however, this evidence cannot actually distinguish
between Witzel’s theory and the new one developed here.
So how might one choose between these competing theories? Even if
we were to limit our attention to Witzel’s linguistic evidence, one relevant
                                                                                                               
286
Id. at 172.
287
Witzel also relies at times on the high concentration of Indo-Iranian hydronyms and
toponyms in the northwestern regions of the Indian subcontinent, but this evidence favors the
current theory over his.

question to ask is whether his evidence of early Para-Munda loans is more
suggestive of influences from the remnant populations of one of the most
important riverine civilizations of its time, or whether this evidence is better
understood as reflecting
the regular and predictable effects of early contacts with Proto-Munda
speaking groups who were centered further to the east in the Gangetic basin.
The significant pocket criterion would rule out the former interpretation of
Witzel’s evidence and, for reasons already discussed, the significant pocket
criterion is supported by a wealth of empirical evidence.
Even if one were to reject the significant pocket criterion, however, the
present theory provides an alternative and highly parsimonious explanation of
Witzel’s evidence. On the current theory, the loans that Witzel has found
would have most plausibly come into the early Vedic texts from ordinary
Proto-Munda speaking groups, who would have originated closer to the
mouth of the Ganges but would have come into increasing contact with Indo-
Aryan speaking groups as they moved into the Gangetic basin. An
explanation of this kind would thus conform much better to the known
geographic locations of Munda speaking groups than Witzel’s explanation
does, and, indeed, it would be surprising if loans resembling Proto-Munda did
not begin to enter into early Vedic Sanskrit from Proto-Munda speaking
groups from a very early time. Contacts of these kinds are also sufficient to
explain Witzel’s evidence, and so I believe this is the better explanation of
it.
Witzel has, moreover, scoured the early Vedic texts for loans, and his
analyses have essentially revealed that there is no substrate other than Para-
Munda that shows up in a sufficiently robust manner to plausibly represent
the language of the Indus Valley Civilization. This is why Witzel infers from
his evidence that: “An important result therefore is, that the language of the
Indus people, at least those in the Panjab, must have been Para-Munda or a
western form of Austro-Asiatic.”288 Once again, however, this “must” is
methodologically unsound because all that evidence of this kind can really
show is that either the Indus Valley Civilization spoke Para-Munda or there is
no plausible substrate that we can associate with the language of the Indus
Valley Civilization—in which case they must have spoken dialects of Proto-
Indo-European. For all of the reasons discussed thus far, I do not believe we
can attribute Para-Munda to the Indus Valley Civilization, and so I believe,
once again, that Witzel’s work can ultimately be read as lending some
additional support to the current theory.
We have now gone through the four major bodies of evidence that
Witzel relies upon to support his particular version of the Indo-Aryan invasion
or migration theory. In each case, we have seen that this evidence either
literally underdetermines the choice between his theory and the new one that I
am proposing, or ultimately favors the current theory upon closer
examination. At the end of the day, however, we cannot reasonably choose
between these two theories by looking only at one single source of evidence.
We need to ask which theory provides the best explanation of the entire body
of evidence relevant to these topics, and this larger body of evidence includes

                                                                                                               
288
Id. at 180.

all of the new considerations developed in this article and all of the other
evidence discussed in prior sections. Once all this evidence has been
combined, I believe the choice between these two theories should now be
clear. In my view, we now have compelling reasons to think that the Indus
Valley Civilization spoke dialects of Proto-Indo-European.

Conclusion

Over the course of this article, I have been focusing on a fairly narrow
linguistic claim, but, for reasons discussed in the Introduction, I believe that
this linguistic claim has a much broader set of implications that should be of
interest to a wide audience. Some of these implications should be of special
interest to anyone who wants to understand the evolution or prehistory of any
Indo-European social, political or legal traditions that are relevant to
producing and sustaining important forms of modern social complexity.
Acceptance of this linguistic claim would, however, radically change our
understanding of the early development of some of the most widespread
and influential cultural traditions in the modern world. Because such a shift
would challenge many of our traditional understandings, this linguistic claim
is likely to encounter a great deal of resistance regardless of the strength of
the evidence.
Before closing, let me therefore remind the reader that the traditional
view has also been shaping a number of our modern research programs—
including many that have been aimed at identifying the factors that can help
produce and sustain large-scale social structures with the rule of law and
modern economic development. If the traditional view is wrong, then many
of these research programs are therefore being shaped to their detriment.
Hence, I do not believe that resolution of these issues is of merely academic
interest, and I believe that this issue deserves much more attention from
people working in a broad range of disciplines. As someone who has entered
into these topics from the outside, my sense is that many people who are
already heavily involved in the so-called “Indo-European” homeland debate
have, however, often become somewhat hardened in their positions and have
sometimes tended to focus on relevant evidence from their particular fields of
expertise without paying adequate attention to evidence from other fields that
might be relevant. This is obviously a bad thing for the pursuit of knowledge,
and it has become increasingly clear to me that there is no single field of
expertise that can claim authority over these particular issues. Their
resolution will require genuinely interdisciplinary effort.
In my view, there are, however, sufficient signs now of a coming
paradigm shift with regard to our understanding of early human prehistory to
warrant serious attention. If this paradigm shift occurs (and I believe that it
will), then we can also expect many fruitful discoveries to begin to emerge
from this new perspective. It has therefore been my pleasure to contribute to
a volume that should play a key role in some of these larger transformations.
Transformations like these will take time to gain their full momentum, but I
cannot imagine a better way for this volume to pay tribute to the extraordinary
work and accomplishments of Dr. S.R. Rao—who, after all, is responsible for

discovering some of the very first sites of the Indus Valley Civilization and
bringing them to the world’s attention.

[Figure: Map of Celtic Expansions, 1500 BC to 300 BC (Danube). Legend: Settled by 1500 BC (Core Area - La Tene); Expanded by 600 BC (Halstatt Culture); Expanded at 300 BC (Max. Celtic Expansion). Map labels: Danube River, Atlantic Sea, Black Sea, Mediterranean Sea; Indo-European Branch: Danube.]

[Figure: Map of Slavic Expansions, 300 AD to 660 AD (Dnieper). Legend: Core Area (settled by 300 AD); Expanded (by 660 AD). Map labels: Dnieper, Danube, Atlantic Sea, Black Sea, Mediterranean Sea; Indo-European Branch: Dnieper, Danube.]