The spread of Munda in prehistoric South Asia - the view from areal typology. To appear in: Volume in Celebration of the Bicentenary of Deccan College Post-Graduate and Research Institute (Deemed University).

John Peterson, Kiel University, Germany¹

Abstract

The history of the Munda branch of the Austro-Asiatic² family in South Asia and its relation to the other branches of this family have long been shrouded in mystery. While some studies place the origin of this family in South Asia, from where it spread to Southeast Asia, others see its origin in Southeast Asia, with a subsequent spread to South Asia in the west. As the original spread of the Munda languages in South Asia plays a key role in many of these hypotheses, I examine a claim on the earlier maximal spread of Munda languages made in a recent study (Rau & Sidwell, 2019) and suggest a revision of this hypothesized spread, based primarily on areal-typological grounds, which I believe more accurately reflects the true extent of the earlier spread of these ethnic groups in the subcontinent.
1 Introduction
South Asia is home to some 600 languages belonging to at least seven stocks: Indo-European
(including Indo-Aryan, Iranian and Nuristani), Dravidian, Jarawa-Onge, Andamanese,³ Tibeto-Burman/Trans-Himalayan,
Tai-Kadai, and Austro-Asiatic, as well as various isolates such as
Burushaski, Kusunda and Nihali. Figure 1 provides an overview of most of these.
The linguistic history of South Asia before the advent of Indo-Aryan speakers is still largely
unknown. While we know that speakers of Indo-Aryan first appeared in the northwest of the
subcontinent some time before 1,000 BCE and subsequently spread southwards and eastwards, we
cannot be as sure of the prehistories of Dravidian, Munda and other languages/language families.
For example, the original speakers of the languages of the Munda branch of the Austro-Asiatic
family are considered by some (e.g., Kumar et al. 2007) to have originated in South Asia, from
where they spread to Southeast Asia. Most researchers, however, now assume the opposite direction,
viewing the Austro-Asiatic speakers of South Asia as descendants of migrants from the east,
perhaps from the Irrawaddy flood plains of Myanmar or the lower Brahmaputra in Assam and
Bangladesh (e.g. Diffloth, 2005).
In a very interesting recent study, Rau & Sidwell (2019) suggest that speakers of the form of
speech which was later to become the Munda languages arrived in South Asia ca. 3,500-4,000
years ago in the Mahanadi Delta and adjacent coastal plains, from where they later spread to other
regions. Rau & Sidwell assume a maritime migration consisting primarily of males. These
immigrants cultivated rice and millet and eventually established themselves as dominant in much
of eastern India. More important to our discussion here, these authors provide a very exact
description of what they consider to have been the maximal prehistorical spread of Munda
languages in South Asia, one which explicitly does not include the Gangetic Plains.
¹ Many thanks to Paul Sidwell and Felix Rau for their insightful and critical comments on an earlier version of this
study. Although I am sure that they do not accept all of my conclusions here, their comments forced me to reconsider
and reformulate my arguments somewhat on a number of different points, for which I am grateful. Needless to say, all
remaining errors and misconceptions are my own.
² Although the spelling “Austroasiatic” without a hyphen is more common, in this and other works I consistently refer
to this family as “Austro-Asiatic” as it consists of two components, “Austro” and “Asiatic”. This spelling brings it into
line with other language families, such as “Indo-European”, “Tai-Kadai”, “Afro-Asiatic”, etc.
³ Abbi (2009) argues that the languages of the Andaman Islands belong to two genealogically unrelated groups, whose
protolanguages she refers to as “Proto Ang” and “Proto Great Andamanese”.
Figure 1: South Asian Language Families⁴
In the present study I will not attempt to evaluate Rau & Sidwell’s arguments with respect to
whether Proto-Munda speakers arrived in India via land or sea, or where exactly they first settled.
Rather, I will discuss what I consider more likely to have been the maximal spread of Munda
languages in prehistoric times than what these authors suggest. More specifically, I will argue that
the maximal prehistoric spread of Munda included the eastern half of the Gangetic Plains, although
Proto-Munda speakers were very likely not the only ethnic groups who inhabited these plains. My
arguments primarily come from linguistic typology in a very broad sense, including areal typology,
language contact, and spread zones vs. residual/accretion zones, but I will also make reference to
the presumed sedentary agricultural lifestyle of these groups as well as a few recent genetic studies.
⁴ Modified version of the map found at https://en.wikipedia.org/wiki/Languages_of_India. “South Asian Language
Families, translated from Image: Südasien Sprachfamilien.png, from Language families and branches, languages and
dialects in A Historical Atlas of South Asia, Oxford University Press. New York 1992. Nihali and Kusunda are not
shown. Author Wikipedia User Bishkek Rocks. Translated by Wikipedia User Kitkatcrazy.” Modified by Anvita
Abbi and her research team to correctly portray the genealogical relations of the languages of the Andaman Islands
and to include the Tai-Kadai languages. Reprinted here with the kind permission of Anvita Abbi.
The article is structured as follows. In Section 2, I provide a very general overview of Munda
and its place in the Austro-Asiatic family before I discuss the present spread of Munda languages
in Section 3. In Section 4 I present my arguments for assuming a larger prehistoric spread of Munda
to include the eastern half of the Gangetic Plains, divided into two main sections: linguistic
evidence (4.1) and recent genetic studies (4.2). Section 5, the conclusion, summarizes once again
my main arguments.
2 The Munda languages and Austro-Asiatic
The Munda languages form the western-most branch of Austro-Asiatic, which stretches from
Central India to Vietnam. Figure 2, from Sidwell (2015: 144), provides an overview of the extent
of this spread as well as of the major branches of this family.⁵
Figure 2: The branches of Austro-Asiatic (Sidwell, 2015: 144)
Figure 3 presents the “traditional” internal classification of Munda, from Zide (1969: 412). In
recent years, this classification has undergone considerable revision. One recent revised
classification is given in Figure 4 from Sidwell (2015: 197), with a considerably flatter internal
classification. Despite all differences, however, all classifications from the last ca. 50 years agree
on the status of the North Munda branch and its subsequent bifurcation into Korku on the one hand
and the Kherwarian languages on the other, while there is much less agreement on the internal
classification of the rest of the family, i.e., Zide’s (1969) “South Munda”.⁶
⁵ For detailed discussion of the internal classification of Austro-Asiatic, see the discussion in Sidwell (2015, especially
pp. 206-211).
⁶ An overview of many of these classifications is given in Anderson (2015).
Figure 3: The Munda languages according to Zide (1969: 412)
Figure 4: The revised Munda classification of Sidwell (2015: 197)
Munda
  North Munda
    Korku
    Santali, Mundari
  Sora-Gorum
  Juang
  Kharia
  Gutob-Remo
  GtaɁ
3 The present spread of Munda
It has long been a matter of debate whether the Austro-Asiatic-speaking populations of South Asia
immigrated from Southeast Asia, or whether the Austro-Asiatic-speaking populations of Southeast
Asia represent the descendants of an eastward movement into Southeast Asia from India. At
present, however, there appears to be more general support for a migration into South Asia from
the east, for many reasons.
Perhaps the most obvious reason to assume a Southeast Asian homeland is the geographical
spread of Munda languages in South Asia today. With the exception of Korku, spoken in western
central India in Maharashtra and Madhya Pradesh, these languages are primarily concentrated in
the eastern half of the subcontinent, stretching from Jharkhand in the north to Odisha and
northeastern Andhra Pradesh in the south, and from Chhattisgarh/Madhya Pradesh in the west to western
West Bengal in the east. No Munda languages are found further west than Korku. Figure 5, from
Rau & Sidwell (2019: 37), provides an overview of districts in India with significant Munda
populations.
Figure 5: Districts with Munda populations as listed in Ethnologue (Rau & Sidwell, 2019: 37)
While Munda-speaking groups are now found as far north as Nepal, Assam and Bhutan, as well
as in Bangladesh and several parts of West Bengal, many of these groups represent a later migration
of (often forced) laborers primarily from Jharkhand, Odisha and Chhattisgarh to work on the tea
plantations of these regions in the 19th century. In contrast, much of the movement into more
southern areas of West Bengal appears to have begun considerably earlier and was unrelated to the
tea plantations (cf. Section 4.1.1, especially note 9). In this study I focus on the traditional
homelands of these peoples and will only refer in passing to these later migrations.
Figure 6, from Rau & Sidwell (2019: 40), provides a rough overview of where the modern
Munda languages are spoken together with their proposed original Munda homeland in India.
Figure 6: Map of the different Munda branches without modern settlements in other areas
(Rau & Sidwell, 2019: 40). Abbreviations refer to the classification of Munda in Sidwell (2015), given in Figure 4
above: NM - North Munda, K - Kharia, J - Juang, SG - Sora-Gorum, GR - Gutob-Remo, G - GtaɁ
Due to the position of Korku in western central India at such a large distance from the other
North Munda languages, Rau & Sidwell (2019: 37; 39) consider this language to be an outlier of
North Munda, reflecting an expansion of North Munda from the Chotanagpur Plateau in the east.
The authors attribute the present geographic separation of these two branches of North Munda to a
later expansion of Dravidian-speaking groups such as the Kui, Gond and Kurukh, which drove a
wedge between the previously contiguous North Munda-speaking groups. However, Rau &
Sidwell (2019: 38) see “no evidence for Kherwarian speakers in the Gangetic Plains prior to
colonial and post-colonial migrations.”
4 Some thoughts on the prehistoric spread of Munda in India
From an areal-typological/contact-linguistic perspective, one would expect that Munda languages
were once spoken over a considerably larger area than the present distribution of this group
suggests. In the following sections I therefore present my arguments for assuming that the
prehistoric spread of Munda also included the eastern half of the Gangetic Plains, comprising parts
of West Bengal (and perhaps neighboring regions of Bangladesh), Bihar and the eastern half of
Uttar Pradesh. These include primarily linguistic arguments (4.1) but also data from a number of
studies from the field of genetics (4.2) which point in the same direction.
4.1 The linguistic evidence
The linguistic evidence I cite in the following comes from three different areas, which I will deal
with separately in the following sub-sections. These are spread zones vs. residual/accretion zones
(4.1.1), arguments based on linguistic terms for agriculture and domesticated animals (4.1.2) and
areal-typological considerations (4.1.3).
4.1.1 Spread zones vs. residual/accretion zones
Johanna Nichols (1992) introduces the terms “spread zones” and “residual zones”, the latter of
which she later refers to as “accretion zones” (e.g., Nichols, 1997), to refer to two different types
of areas with respect to language density.
Spread zones are areas of rapid language spread, characterized among other things by little genealogical
diversity, shallow language families and the use of a limited number of lingue franche or languages
of general communication between the different ethnic groups (e.g. Nichols, 1992: 16-17). Typically,
such areas are easily accessible to outsiders, e.g. to invaders and
large-scale immigration.
In South Asia, the Gangetic Plains as we presently see them are a textbook example of a spread
zone. As is known from the Vedas, northwestern South Asia was inhabited by various ethnic groups
when Indo-Aryan speakers first arrived, most likely some time before 1,000 BCE, and a number
of words from Dravidian and (Para-) Munda languages are claimed by some to have found their
way into these texts (e.g. Kuiper 1948; Witzel 1999), although this is disputed by others.⁷ From
there some of these speakers began migrating eastwards at an early date, and by 600 BCE large
numbers of Indo-Aryan speakers had already settled throughout much of the Gangetic Plains,
where Indo-Aryan quickly established itself as the lingua franca and later became the first language
of most of the inhabitants there.
⁷ Cf. Wikipedia, “Substratum in the Vedic language”, for further discussion.
Towards the southern and southeastern peripheries of the Gangetic Plains we find several hill
tracts such as the Chotanagpur Plateau in the southeast, and the Vindhya and Satpura ranges and
the Vindhyan Scarplands in central India to the south of the Gangetic Plains. Further to the
southeast we also find the northern Eastern Ghats running parallel to the east coast. These are
typical residual or accretion zones in Nichols’ terminology as they possess a relatively high
genealogical density compared to the rest of the sub-Himalayan subcontinent and presumably only
relatively recent lingue franche, with local bilingualism and/or multilingualism apparently having
long been the norm (cf. e.g. Nichols 1992: 21).
Typical of such residual or accretion zones is that they are generally unattractive to newcomers
from an agricultural perspective, as the soil tends to be difficult to cultivate, or with respect to trade,
as these regions are comparatively difficult to access, neither of these two features being true of
the Gangetic Plains, which are highly accessible and easy to cultivate. For reasons such as these,
languages in these residual/accretion zones tend to survive the onslaught of invaders/settlers better,
at least initially, as they offer refuge to ethnic groups who may be fleeing the newcomers and to
those who already live there.
It is precisely in regions such as these that we find the languages of the Munda family: on and
around the Chotanagpur Plateau, the Garhjat Hills, on the Baghelkhand Plateau and parts of the
Mahanadi Basin and Dandakaranya, which together comprise the Eastern Plateau, or in the Satpura
Range (Korku). There are also several Munda groups in the northern Eastern Ghats, the isolate
Nihali in the Satpura Range, Dravidian languages such as Kurukh and Malto on the Chotanagpur
Plateau, and Dravidian Gondi in the Satpura Range. Traces of other languages which were once
spoken in these regions are also occasionally found. For example, in the Indo-Aryan language
Kurmali, spoken in Jharkhand on the Chotanagpur Plateau, we find traces in the core vocabulary
of a language which is no longer spoken and which at least at present cannot be traced to Indo-Aryan,
Munda or Dravidian.⁸
These regions form two of the major physiographic divisions of India and include part of a third
division, all of which fit the description of Nichols’ residual/accretion zones quite well:
- South Central Highlands, consisting of the Satpura and Vindhya ranges and the Vindhyan
  Scarplands (e.g. Nihali, Korku, Gondi);
- Eastern Plateau, consisting of the Chotanagpur Plateau, the Baghelkhand Plateau, the
  Garhjat Hills, the Mahanadi Basin and Dandakaranya (e.g., North Munda other than Korku,
  Kharia, Juang);
- Eastern Hills, of which the northern Eastern Ghats comprise the northernmost one-third
  (e.g. Gutob-Remo, Sora-Gorum, GtaɁ).
Note also that where we find Munda districts in northern coastal Andhra Pradesh and Odisha
in Figure 5, these are located where the Eastern Hills reach almost to the sea, i.e., these locations
are not wide coastal areas. These regions are shown in Figure 7.
In other words, from an areal-typological perspective the Munda districts from Figure 5 are
highly unlikely to be the entire original spread of Munda in South Asia, regardless of whether
Proto-Munda speakers entered South Asia by land or sea. Rather, these more likely represent
residual/accretion zones in which the above-mentioned languages have managed to survive.
Concentrating here on Munda, this strongly suggests that these languages were once also spoken
in surrounding areas, in my opinion above all in the Gangetic Plains, where they disappeared when
the population there later switched to Indo-Aryan. Thus, barring recent migrations,⁹ we can say that
where Munda languages are now found these are spoken in regions which have traditionally been
of little interest to newcomers, both from an agricultural as well as an economic perspective, so
that these languages have managed to survive there.
⁸ Cf. Paudyal & Peterson (2021: 22). This list includes such common words as ‘very’, ‘last’, ‘open’, ‘eat’, ‘sleep’ and ‘see’.
Figure 7: Physiographic divisions of India¹⁰
Chaubey et al. (2017: 493) note in this respect that the Vindhya and Satpura ranges are a “fringe
area” where a combination of the more rudimentary technological level of development of the
resident populations and geographical remoteness may have facilitated the gradual admixture and
assimilation of incursive populations willing to adapt to the subsistence strategies practiced locally,
while impeding the bearers of technologically more advanced cultural assemblages.¹¹ Taken
together, these observations suggest that the present-day spread of Munda, even including the areas
between Korku and the other North Munda languages where Gondi and Indo-Aryan languages are
now spoken (cf. Section 4.2), most likely does not provide a realistic indication of the full extent
of the spread of people who are assumed to have once belonged to the dominant agricultural
societies of eastern India (cf. Section 4.1.2).
⁹ With respect to the eastern-most Santali groups of West Bengal, outside of the so-called “tea districts”, it is generally
assumed that the Santals migrated to their present homeland in eastern Jharkhand from western and central Jharkhand,
hence the eastern-most Santali-speaking regions in Figure 5 represent a later development, perhaps as early as the
mid-14th century, although this eastward movement continued at least up to the 19th century (cf. Das, 2020: 1224-1225).
I also assume a similarly late (i.e., 19th century) migration into this region by other groups who are now also found
there, such as the Turi (North Munda), but who are otherwise found in western Jharkhand, eastern Chhattisgarh and
also in Bihar.
¹⁰ This map is from the “Physical map of India with various physiographic divisions” in Wikipedia, Geography of
India: https://commons.wikimedia.org/wiki/File:Physical_Map_of_India.jpg, Creative Commons Attribution-Share
Alike 4.0 International license. Only the inset map “Physiographic divisions of India” is shown here.
¹¹ Cf. e.g. also Heggarty (2014: 620) “[...] the hunter-gatherers’ languages, if they survive at all, invariably end up
cantoned into inhospitable areas of little value to agriculturalists.”
One final note on spread zones is important here. As Epps (2020) shows in her discussion of
lowland regions in South America, a flat, easily accessible area with a river flowing through it does
not guarantee that the respective region will be a spread zone. As important as the terrain is for the
spread of languages, the cultures of the peoples who live there and their degree of social (in)equality
are equally important. For example, the lowlands of the Amazon which Epps discusses are
characterized by a high degree of social equality, where the different ethnic groups “relate to each
other as distinct parts of the same machine” (Epps, 2020: 285) and where language, among other
cultural traits, is seen as an essential part of the identity of the individual groups. Consequently, the
borrowing of words from one language to another is quite rare in these areas, although considerable
morphosyntactic convergence is found, as is expected in areas of such intense language contact.
As these societies are not characterized by top-down political or social structures, there appears to
have been no general lingua franca in at least some of these regions until the arrival of Portuguese
(Epps, 2020: 282). Thus, with respect to linguistic density, these zones more closely resemble the
above-mentioned residual/accretion zones than spread zones, despite their terrain.
With respect to the Gangetic Plains, the spread of Indo-Aryan languages throughout this region
is to be expected from a hierarchical social order in this kind of terrain. This is in line with the social,
political, military and economic predominance of the Indo-Aryan newcomers to this region. We
can only speculate with respect to the social structures which were predominant in the region before
the arrival of these speakers. However, we must be careful not to simply assume that the ethnic
groups which inhabited this region prior to the arrival of Indo-Aryan speakers also had similarly
hierarchical social structures. In fact, I would argue that these groups were in general more
egalitarian than the new arrivals, practicing small-scale agriculture, living in small villages and
perhaps also practicing a certain degree of hunting.¹²
This is significant since Rau & Sidwell (2019: 45) note that convincing linguistic evidence for
a Munda substrate in the languages of the Gangetic Plains is lacking, as “we find no pattern of
linguistic remnants in residual zones around the Gangetic Plain.” However, I do not consider this
to be convincing evidence against an earlier Munda presence there. To begin with, formerly distinct
ethnic groups are generally believed to have been incorporated into mainstream Indian society as
“low-caste” groups, generally performing menial, often “unclean” tasks. Thus, in order to verify a Munda
substrate in these languages (if indeed there is one) we would need detailed data on the lexicons of
the Indo-Aryan speech of these communities, which is not available at present.
However, I would argue that such data will probably never be found, at least not on a larger
scale. Assuming that these were indeed small-scale agricultural societies and relatively egalitarian,
it seems unreasonable to expect to find traces of one clear substrate in the entire region, as there
was no single privileged language. Rather, we would expect to find scattered traces from several
languages, many of them belonging to families about which we know nothing at present. A likely
candidate here is Kurmali, mentioned above, in which we find traces of an unknown language
(family) in the core lexicon (cf. footnote 8). But given what we now know about language spread,
we are unlikely to find a single substrate language throughout the entire region.
In other words, a lack of clear evidence for a Munda substrate in these languages is not
necessarily an argument against the presence of Munda speakers in the eastern Gangetic Plains in
earlier times; it is simply a lack of positive evidence that they were there.
¹² There appear to have been no notable larger settlements in the lower plains, with urban centers only appearing after
600 BCE, with the arrival of the Indo-Aryan-speaking settlers from the west (Kulke & Rothermund, 1991: 52-53).
4.1.2 The importance of agricultural terminology for the prehistoric spread of Munda
Zide & Zide (1976) argue that Proto-Munda speakers were agriculturalists who most likely grew
rice and different types of millet and who kept domesticated animals, and that modern Munda
groups such as the Juang and the Birhor, who until recently were predominantly hunters and
gatherers, are “examples of reversion from a more complex culture to a simpler one” (Zide & Zide,
1976: 1296). In light of my comments in the preceding sections I argue that groups such as these
“reverted” to a hunter-and-gatherer lifestyle only after moving into the Eastern Plateau, where
agriculture likely proved difficult, at least initially. For the sake of brevity, I will only cite here the
respective agricultural terms in English, without their suggested Proto-Munda forms.
Zide & Zide (1976) find evidence for Proto-Munda names for various types of fruits, such as
‘wild fig’, ‘mango’, ‘green or unripe mango’, ‘jamun or Indian blackberry’, ‘turmeric’, ‘tamarind’
and ‘(wild) date’ but more importantly also various words for ‘rice’ such as ‘uncooked, husked
rice’, ‘paddy’ and ‘cooked rice’, different types of millet and gourds, as well as words for ‘pestle’,
‘mortar’ and ‘husking hole’ (Zide & Zide, 1976: 1297-1315). While not all of these, especially
terms for ‘millet’, have known cognates in non-Munda Austro-Asiatic languages, these terms
nevertheless strongly suggest a familiarity of Proto-Munda speakers with these crops, tree fruits,
legumes and gourds and how to prepare them for cooking. Similarly, terms for domesticable
animals such as ‘dog’, ‘chicken’, ‘goat’, ‘pig’, ‘buffalo’, ‘cat’ and ‘cattle’ can be identified (Zide
& Zide, 1976: 1315-1324). The authors admit that this does not prove that these early Proto-Munda
speakers were agriculturalists, although the evidence is nevertheless quite suggestive:
“The data presented in this paper provides good evidence that the Proto-Mundas, presumably at least 3500
years [before the present, JP] (or earlier) at a conservative estimate, had a subsistence agriculture which
produced or at least knew grain - in particular rice, two or three millets, and at least three legumes. Further,
the agricultural technology included implements which presuppose the knowledge and use of such grains and
legumes as food, since the specific and consistent meanings for ‘husking pestle’ and ‘mortar’ go back, at
least in one item, to Proto-Austroasiatic.” (Zide & Zide, 1976: 1324)
“Further, the existence of certain terms for agricultural operations (e.g. ‘winnowing’, ‘transplanting’)
strongly suggests that some degree of domestication of these plants was likely, and this in turn presupposes
some degree of sedentary agriculture.” (Zide & Zide, 1976: 1327)
Assuming that the Proto-Munda speakers were cultivators, something of a dilemma arises, one
which Rau & Sidwell (2019: 43-44) also recognize: Why would agriculturalists choose to settle in
the less hospitable hills of eastern and central India? If they did first inhabit the wetlands of Odisha
in this area, in line with Rau & Sidwell’s Maritime Hypothesis, why would they not have followed
the coast to the north, eventually reaching the Ganges Delta, and then have followed the Ganges
with its expansive plains upstream to the west? Surely the Gangetic Plains provide better land for
the cultivation of rice than the rough terrain which is characteristic of so much of the eastern and
central highlands. In contrast, assuming the prehistoric spread of Munda which I have suggested
above on different grounds and which includes the eastern half of the Gangetic Plains would
provide a simple solution to this dilemma.
In fact, based on archaeological evidence, Kingwell-Banham et al. (2018: 11) suggest a very
different scenario from that in Rau & Sidwell (2019). They propose that the agriculturalists of the
Odisha wetlands may in fact have come from the Gangetic Plains, bringing the cultivation of rice
with them.¹³ This analysis, although it speaks against Rau & Sidwell’s Maritime Hypothesis, is
compatible with my assumption that the Proto-Munda speakers once inhabited not only the hill and
wetland regions of Odisha but also the eastern Gangetic Plains. Thus, whether Proto-Munda
speakers first settled in the Mahanadi Delta or in the Ganges Delta, we should expect them as
cultivators of rice to have inhabited at least some sections of the eastern Gangetic Plains.
¹³ Or from the Vindhyan Region of central India, although this suggestion appears to be based entirely on similarities
in pottery.
There are historical cases of settlement along the coast in this region. We know that these
coastal regions were settled by Indo-Aryan-speaking groups rather early. For example, the kingdom
of Kalinga once stretched along the coast in precisely this region, while the hill regions to the west
for some time remained largely the refuge of “unconquered tribes” (cf. e.g. Map 4 in Kulke &
Rothermund, 1991: 378). This makes the status of these hill regions as areas of refuge all the more
apparent, perhaps as a consequence of the war which led to the incorporation of Kalinga into the
Mauryan empire by Emperor Ashoka in 261 BCE, whose brutality he himself reports on (cf. among
others Kulke & Rothermund, 1991: 65).
4.1.3 The “Indo-Aryan East-West Divide”
The suggested spread of Munda in prehistoric times is also compatible with linguistic evidence
suggesting that the eastern half of the Gangetic Plains was once inhabited by large numbers of
non-Indo-Aryan-speaking ethnic groups, whereas the western half was Indo-Aryan-speaking, or at least
dominated by Indo-Aryan, by the 6th century BCE.
In a number of recent works by myself and my research team, we have examined the
morphological and syntactic structures of the modern languages of South Asia in an attempt to
learn more about prehistoric migration and settlement patterns. This is possible because when
different ethnic groups speaking different languages live side-by-side and trade with one another
and perhaps also have social contacts above and beyond this, many members of one or both of the
respective communities become bilingual over time. This eventually has an impact on the structures
of these languages and this type of information can reveal much about the history of these
languages and their speakers. For example, when a large number of adult speakers learn a new
language at the same time, this often results in a simplification, e.g., of the case system of the new
language, while long-term community bilingualism from childhood onward can lead to
morphological complexification.¹⁴
In Peterson (2018) I present the results of a revised and somewhat expanded data set from an
earlier study (Peterson, 2017); these are illustrated in Figure 8 in a NeighborNet visualization of
the data.¹⁵ This visualization illustrates that the western Indo-Aryan languages such as Hindi, Braj
Bhasha, Marathi and Konkani, as well as Nepali (a relatively recent arrival in eastern South Asia
from the west) cluster together structurally, and together with Dravidian languages such as Telugu
and Kannada (far left of diagram). In contrast, eastern Indo-Aryan languages such as Maithili,
Bengali, Odiya and others (center of Figure 8, dotted lines) cluster with North and South Munda
languages, as well as with eastern Dravidian languages such as Kurukh and Malto (right half of
Figure 8). Bhojpuri clusters with the languages of the east in this visualization, but as we will see
below, it is a borderline case and often clusters with the languages of the west when different
criteria are used.
14
For a detailed introduction to this area of linguistics, referred to as “sociolinguistic typology”, see Trudgill (2011).
15
NeighborNet (Bryant & Moulton, 2004) is often used in contact linguistics to portray the effects of language contact.
In these networks, the length of branches corresponds directly to the degree of divergence or “distance” between
individual languages. Instead of trying to find an optimal tree-like format to portray similarities and differences
between languages, NeighborNet suggests alternative trees, resembling networks, to portray the possible paths which
may be taken between two points when there are conflicting signals in the data, as is often the case in language contact.
Figure 8: NeighborNet visualization of the structural similarities of selected languages of South Asia (29 languages,
46 morphosyntactic features) (from Peterson, 2018)
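As a rough illustration of the kind of input a NeighborNet visualization starts from, the following minimal sketch computes pairwise structural distances between languages coded for binary morphosyntactic features. The language labels, the feature values and the choice of a simple normalized Hamming distance are all assumptions made for illustration; they are not the data or the procedure of Peterson (2017, 2018).

```python
from itertools import combinations

# Hypothetical toy data: invented language labels, each coded for six
# invented binary features (1 = present, 0 = absent). None of these
# values are taken from the studies discussed in the text.
FEATURES = {
    "lang_A": [1, 1, 0, 1, 0, 1],
    "lang_B": [1, 1, 0, 1, 1, 1],
    "lang_C": [0, 0, 1, 0, 1, 0],
    "lang_D": [0, 1, 1, 0, 1, 0],
}


def hamming_distance(a, b):
    """Share of features on which two languages differ (0.0 to 1.0)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(features):
    """Return {(lang1, lang2): distance} for every unordered pair."""
    return {
        (l1, l2): hamming_distance(features[l1], features[l2])
        for l1, l2 in combinations(sorted(features), 2)
    }


if __name__ == "__main__":
    for (l1, l2), d in distance_matrix(FEATURES).items():
        print(f"{l1} - {l2}: {d:.2f}")
```

In this toy example, lang_A and lang_B differ on one feature out of six, as do lang_C and lang_D, while lang_A and lang_C differ on every feature. A network method would therefore place the first two and the last two on opposite sides of the diagram, with conflicting signals (such as the feature lang_B shares with lang_C and lang_D) showing up as box-like reticulations rather than a clean tree.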
A later study by my research team and me reached similar results. Figure 9 shows the
results of a statistical analysis of 16 Indo-Aryan languages with respect to 217 morphosyntactic
features and their respective structural distance from the Munda languages. The names of the
languages in green are those Indo-Aryan languages which are structurally closest to Munda, while
the languages given in red are those which are maximally different from Munda. I refer to this
structural schism within Indo-Aryan, which runs through central Uttar Pradesh from north to south
(see following text), as the “Indo-Aryan East-West Divide”.
Figure 9: The Indo-Aryan East-West Divide (Ivani et al., 2021: 19)
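The idea of ranking Indo-Aryan languages by their structural distance from Munda can be sketched as follows. This is a deliberately simplified stand-in, assuming binary features and a plain mean of per-language distances; it is not the statistical analysis used in Ivani et al. (2021), and all labels and values below are hypothetical.

```python
from statistics import mean

# Hypothetical toy data. REFERENCE_GROUP stands in for the Munda
# languages, TARGET_GROUP for the Indo-Aryan languages being compared.
# All labels and feature values are invented for illustration.
REFERENCE_GROUP = {
    "ref_1": [1, 0, 1, 1, 0],
    "ref_2": [1, 0, 1, 0, 0],
}
TARGET_GROUP = {
    "target_east": [1, 0, 1, 1, 1],
    "target_mid": [1, 1, 0, 1, 0],
    "target_west": [0, 1, 0, 0, 1],
}


def share_differing(a, b):
    """Share of features on which two feature vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_distance_to_group(vector, group):
    """Average distance from one language to every language in a group."""
    return mean(share_differing(vector, ref) for ref in group.values())


def rank_by_distance(targets, group):
    """Sort target languages from structurally closest to most distant."""
    scored = [
        (name, mean_distance_to_group(vec, group))
        for name, vec in targets.items()
    ]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    for name, score in rank_by_distance(TARGET_GROUP, REFERENCE_GROUP):
        print(f"{name}: {score:.2f}")
```

With these invented values the ranking runs from target_east (closest to the reference group) to target_west (most distant), which mirrors the green-to-red ordering described for Figure 9 without reproducing any of its actual results.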
As there is no natural barrier separating eastern and western Indo-Aryan from one another in
the Gangetic Plains, I attribute this schism to the fact that, at the time of the eastward
expansion of Indo-Aryan, eastern India was populated by a large number of ethnic groups, many of them
Munda-speaking but certainly including others as well. As a result, many of the first speakers of Indo-Aryan in eastern
India will have been adult second-language learners, which led to numerous morphosyntactic
simplifications in these languages (cf. Peterson, in press). To the west of this border, the number
of Munda and other non-Indo-Aryan-speaking groups was considerably lower, so that the western
languages do not show such strong simplifications.
Note that Bhojpuri in Figure 9 clusters with the western languages, whereas it clusters with
the eastern languages in Figure 8.
16
This suggests that the border between eastern and western Indo-
Aryan is diffuse or “fuzzy” and lies somewhere in central Uttar Pradesh. As some dialects of
Awadhi show various features which are typical of eastern Indo-Aryan languages while other
dialects of the same language show more western features (Peterson, 2017: 244; Stroński &
Verbeke, 2020), I tentatively assume that Munda speakers were present in the eastern half of Uttar
Pradesh and that the diffuse East-West border was in central-to-eastern Uttar Pradesh, probably
between what are now the Awadhi and Bhojpuri-speaking areas.
This would mean that Munda languages were spoken in the eastern half of the Gangetic Plains,
but probably not much further west than central Uttar Pradesh: As enticing as it may be, I see no
evidence that Munda languages were spoken to the west of this region, and certainly not in the
Indus Valley Civilization.
17
4.2 Genetic studies
Studies in genetics, while still comparatively scarce for this region, are beginning to play an
increasingly important role in uncovering the prehistory of eastern and central India. For example,
Chaubey et al. (2017) investigate the genetic affiliation of the Dravidian-speaking Gond tribe,
located, among other places, in the Satpura Range between Korku and the Kherwarian languages, and
conclude that despite speaking a Dravidian language closely related to Telugu, “all the Gond
groups shared extensive portions of their genomes within the group as well as with North and South
Munda groups” (Chaubey et al., 2017: 497). This leads them to assume large-scale language
shifting from Munda to Dravidian in that region.
There are also a small number of genetic studies suggesting that people of Austro-Asiatic
descent may be found throughout much of the Gangetic Plains. One of these is Chaubey et al.
(2008), which analyzes the genetic make-up of the Indo-Aryan-speaking Mushar.
18
The name of
this group derives from Indo-Aryan and means ‘mouse-eater’, as their traditional occupation was
to flush rats, which the Mushar also ate, out of their holes in fields. As the map in Chaubey et al.
(2008: 43) shows, the Mushar are now primarily concentrated in northern and western Bihar and
in northern and eastern Uttar Pradesh.
16
Nepali is also a “flip-flop” language, being western in Figure 8 and eastern in Figure 9. This is probably due to the
fact that it is originally a western language that spread to the east in the past few centuries so that it, like Awadhi (see
the following main text), has eastern and western dialects showing very different morphosyntactic features.
17
Genetic studies such as Narasimhan et al. (2019) and Shinde et al. (2019) suggest this as well.
18
Also spelled Musahar, Mushera and Mushahar.
Based on samples collected from 168 Mushar, 135 “Austro-Asiatic” speakers
19
and 151 Indo-
European-speaking individuals, Chaubey et al. (2008) conclude: “Indeed, this analysis shows
unambiguously that the Mushar population clusters with the Austro-Asiatic populations both in the
mtDNA and Y chromosomal PCA slots.” (Chaubey et al., 2008: 44) They attribute this to language
shift from an Austro-Asiatic language to the Indo-European languages of the region. They also
note that some members of this group apparently still speak a Munda language, although they do not mention its
name (cf. Chaubey et al., 2008: 42). This would be important information, as I am not aware of
any Munda language spoken in Uttar Pradesh and Bihar up to the border with Nepal, where the
highest density of Mushar live, according to the map in Chaubey et al. (2008: 43).
20
Studies by David Reich have uncovered similar examples. Reich and his colleagues identify
two main genetic groups in South Asia, which they term “ANI” for “Ancestral North Indians” and
“ASI” for “Ancestral South Indians”. The ANI group is ultimately related to western Eurasians and
derives from the migration of this group into the subcontinent from the Eurasian steppe, which
presumably also brought Indo-European languages to South Asia. The second group, ASI, is
hypothesized to be deeply related to the Andaman Islanders and to have been in South Asia for
several millennia.
21
Reich and his colleagues speak of the “Indian Cline” to refer to the different
proportions of ANI and ASI ancestry in the genetic make-up of individuals and/or ethnic groups.
In Reich et al. (2009: 492), a number of ethnic groups of South Asia are positioned with respect to
this cline, with Kashmiri Pandits showing the most affinity with Europeans and Dravidian-speaking
Kurumbas among those showing the least.
Not surprisingly, the Munda-speaking Santals and Kharia are not positioned along this cline
but at some distance from it, due to their Southeast Asian ancestry. More interestingly, this also
holds true of the Sahariya, a “low-caste” Indo-Aryan-speaking group. The four Sahariya samples
in this study are from individuals living near Allahabad in the eastern part of Uttar Pradesh.
This suggests that this ethnic group originally spoke a Munda
language which they gave up at some point in favor of an Indo-European one.
Members of the Tharu ethnic group were also represented in this study. The Tharu
are Indo-Aryan-speaking tribal groups found primarily in the Nepalese lowlands, although the nine
samples in this study were from Uttarakhand, near Nainital. One of these samples clustered
closely with the off-cline Sahariya mentioned above, while the remaining Tharu samples
in the diagram are either on or much closer to the cline.
The significance of this information is that we find tribal and “low-caste” groups and individuals
both to the south and to the north of the Ganges River who show genetic similarities with speakers of
Munda languages. These facts, combined with the data above from Chaubey et al. (2008) on
the Mushar, suggest that what are now Bihar and the eastern half of Uttar Pradesh,
possibly up to the Tarai lowlands of southern Nepal, were once settled by Munda-speaking ethnic
groups as well as by other groups, just as the areas between Korku and the remaining North Munda
languages further to the east, which are now Gondi- (and Indo-Aryan-) speaking, were once also likely
Munda-speaking (Chaubey et al., 2017).
19
Note that “Austro-Asiatic” here refers to well established Munda groups but also includes the Mawasi group of Sidhi
District of eastern Madhya Pradesh. Unfortunately, I could find no literature confirming that this group is in fact
“Austro-Asiatic”, at least with respect to its traditional language.
20
The Ethnologue (Eberhard et al., 2021) lists two forms of Musahar, one a dialect of Maithili (Indo-Aryan), the other
an alternative name of the Indo-Aryan language Musasa of Nepal, and Musahari, a dialect of Bhojpuri (Indo-Aryan);
the Glottolog (Hammarström et al., 2021) contains an entry for Musahari, but only as a dialect of Bhojpuri. No mention
is made in either source of a Munda language with this name.
21
This discussion is in fact considerably more complex than described here. Cf. e.g. Narasimhan et al. (2019) and
Shinde et al. (2019) with respect to the Indus-Periphery and its role in the formation of ANI and ASI.
In sum: The data portrayed in Figures 8 and 9 above clearly indicate a linguistic division cutting
right through Uttar Pradesh which undoubtedly has to do with ethnic groups speaking different
languages meeting at approximately this position in the distant past, as there are no topographic
barriers here to separate these groups and cause such a clear typological division. I argue that the
ethnicities to the east of this divide will have been Munda-speaking and other groups, living in
largely egalitarian societies with no great differences between them with respect to status and with
no single predominant language. To the west of this divide, Indo-Aryan had become predominant
by ca. 600 BCE. The fact that an increasing number of genetic studies are producing evidence that
members of different “low-caste” or tribal groups in this area show genetic features typical of
Austro-Asiatic speakers lends further support to this conclusion.
5 Conclusion
In the present paper I suggest that the original spread of Munda languages in India was considerably
larger than the present-day spread of this family suggests and originally included the eastern
Gangetic Plains. Although the Munda-speaking groups until recently only inhabited the South
Central Highlands, the Eastern Plateau and the northernmost Eastern Hills, there is good reason to
believe that this was not the maximal prehistoric spread of these groups.
From a linguistic point of view, an original spread of Munda restricted to these often rugged
hill regions is problematic, as it would mean that the spread of Munda languages until quite recently
was essentially restricted to residual/accretion zones, in the terminology of Nichols (1992; 1997),
i.e., relatively isolated regions of little value with respect to agriculture or trade. As Proto-Munda
speakers very likely cultivated rice and other crops (e.g. Zide & Zide, 1976), it seems highly
counter-intuitive that they would have chosen to live in remote, difficult-to-reach areas with
suboptimal conditions for this type of agriculture when they must surely have been aware of the
fertile Gangetic Plains just to the north of their habitat.
A small but growing number of genetic studies (e.g. Chaubey et al., 2008; Reich et al., 2009)
also present evidence that tribal and “low-caste” ethnic groups
in Uttar Pradesh and Bihar who now speak Indo-Aryan languages show genetic similarities with speakers
of Austro-Asiatic languages, again raising the possibility that Munda speakers were once found
throughout the eastern Gangetic Plains before they later switched to Indo-Aryan.
Finally, the so-called “Indo-Aryan East-West Divide”, suggested in Peterson (2017) and Ivani
et al. (2021), shows a clear typological division between eastern and western Indo-Aryan languages
whose diffuse border lies in Uttar Pradesh. This is significant since the present linguistic situation
in the Gangetic Plains is a textbook example of a spread zone, with no major geological barrier
such as a mountain range which would have separated the eastern and western groups from one
another physically.
I suggest that different ethnic groups speaking different languages were found on either side of
this Indo-Aryan divide at an earlier date: Indo-Aryan languages were predominant on the western
side of this divide by 600 BCE, whereas Munda-speaking and other groups predominated in the
east, where they had presumably already settled before the first Indo-Aryan speakers arrived in
South Asia. Due to the military, technological and economic advantages of the Indo-Aryan
speakers, their languages quickly spread after this time to the eastern Gangetic Plains as well,
replacing the Munda (and other) languages there. As most of these early learners will have been
adult learners, this resulted in considerable morphosyntactic simplifications in eastern Indo-Aryan
languages which did not take place in western Indo-Aryan languages (cf. Peterson, in press),
resulting in the Indo-Aryan East-West Divide that we find today.
With the advance of Indo-Aryan speakers in ever larger numbers into the Gangetic Plains, it is
likely that many Munda speakers sought refuge in the hill regions to the south, as there were
undoubtedly at least occasional skirmishes with the newcomers, and along the east coast fierce
battles over the kingdom of Kalinga are known to have taken place in the 3rd century BCE.
These hill regions will certainly not have been entirely empty, with speakers of residual
languages from pre-Munda times already living there and perhaps already a few Munda groups.
Here, in their new, relatively secluded homelands, many of these Munda languages have managed
to survive up to the present, often at the expense of their new neighbors’ languages, although the
isolate Nihali and several smaller Dravidian languages have survived right up to the present. In
contrast, those Munda-speaking groups who remained in the plains will probably have switched
entirely to Indo-Aryan within a few generations. It is this smaller distribution of Munda languages,
restricted to the central and eastern hill regions, which has remained essentially unchanged until
the present, not the earlier maximal spread.
The views presented in this study are those of a language typologist who specializes in language
contact, so that the data I present here is necessarily interpreted from this perspective. Nevertheless,
I believe that the maximal prehistoric spread of Munda suggested here, with Munda speakers
previously inhabiting large swaths of the eastern Gangetic Plains, is compatible not only with
general principles of language contact and areal typology, but also with the presumed agricultural
status of Proto-Munda speakers, with findings from archaeology, as well as with an increasing
number of genetic studies of ethnic groups of the Gangetic Plains.
Acknowledgments
I wish to thank the German Research Council (DFG) for a generous grant which allowed me to
conduct this research as well as the research cited here as Peterson and Baraik (in press), Ivani et
al. (2021), Paudyal and Peterson (2021) and Peterson (in press) within the project Towards a
Linguistic Prehistory of Eastern Central South Asia (and Beyond), as well as for the Cluster of
Excellence ROOTS - Social, Environmental and Cultural Connectivities of Past Societies, to
which I belong, which provided a stimulating research environment for this work.
6 Literature
Abbi, Anvita. 2009. Is Great Andamanese genealogically and typologically distinct from Onge and Jarawa? Language
Sciences 31(6). 791–812. https://doi.org/10.1016/j.langsci.2008.02.002
Anderson, Gregory D.S. 2015. Overview of the Munda languages. In Jenny & Sidwell (eds.), 364-414.
Bryant, David & Vincent Moulton. 2004. Neighbor-Net: An agglomerative method for the construction of
phylogenetic networks. Molecular Biology and Evolution 21/2: 255–265.
Chaubey, Gyaneshwer, Mait Metspalu, Monika Karmin, Kumarasamy Thangaraj, Siiri Rootsi,
Juri Parik, Anu Solnik, Deepa Selvi Rani, Vijay Kumar Singh, B. Prathap Naidu, Alla G. Reddy, Ene Metspalu, Lalji
Singh, Toomas Kivisild & Richard Villems. 2008. Language Shift by Indigenous Population: A model genetic
study in South Asia. International Journal of Human Genetics 8/1-2: 41-50.
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.526.9968&rep=rep1&type=pdf
Chaubey, Gyaneshwer, Rakesh Tamang, Erwan Pennarun, Pavan Dubey, Niraj Rai, Rakesh Kumar Upadhyay,
Rajendra Prasad Meena, Jayanti R. Patel, George van Driem, Kumarasamy Thangaraj, Mait Metspalu & Richard
Villems. 2017. Reconstructing the population history of the largest tribe of India: the Dravidian speaking Gond.
European Journal of Human Genetics 25, 493–498. DOI: 10.1038/ejhg.2016.198
Das, Nayan Jyoti. 2020. History of origin of the Santals of India. Journal of Xi’an University of Architecture &
Technology, 12:5. 1222-1226. https://doi.org/10.37896/JXAT12.05/1520
Diffloth, Gérard. 2005. The contribution of linguistic palaeontology to the homeland of Austro-Asiatic. In Laurent Sagart,
Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics
and genetics. London: RoutledgeCurzon. 77-80.
Eberhard, David M., Gary F. Simons, and Charles D. Fennig (eds.). 2021. Ethnologue: Languages of the World.
Twenty-fourth edition. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com (Accessed
21 August, 2021).
Epps, Patience. 2020. Amazonian linguistic diversity and its sociocultural correlates. In: Mily Crevels & Pieter
Muysken (eds.), Language dispersal, diversification and contact: A global perspective. Oxford: Oxford University
Press. 275-290.
Hammarström, Harald, Robert Forkel, Martin Haspelmath & Sebastian Bank. 2021.
Glottolog 4.4. Leipzig: Max Planck Institute for Evolutionary Anthropology.
http://glottolog.org (Accessed on 2021-08-21).
Heggarty, Paul. 2014. Prehistory through language and archaeology. In Claire Bowern & Bethwyn Evans (eds.),
Routledge handbook of historical linguistics, 598–626. London: Taylor and Francis.
Ivani, Jessica, Netra Prasad Paudyal, John Peterson. 2021. A house divided? Evidence for the East-West Indo-Aryan
divide and its significance for the study of northern South Asia. In J. Ivani & J. Peterson (eds.). Special issue of
Journal of South Asian Languages and Linguistics.
Jenny, Mathias & Paul Sidwell (eds.). 2015. The handbook of Austroasiatic languages [Grammars and Language Sketches
of the World’s Languages, Mainland and Insular Southeast Asia]. Leiden & Boston: Brill.
Jenny, Mathias, Tobias Weber & Rachel Weymuth. 2015. The Austroasiatic Languages: A Typological Overview. In
Jenny & Sidwell (eds.), 13–143.
Kingwell-Banham, Eleanor, Emma Karoune nee Harvey, Rabindra Kumar Mohanty & Dorian Q. Fuller. 2018.
Archaeobotanical investigations into Golbai Sasan and Gopalpur, two Neolithic-Chalcolithic settlements of
Odisha. Ancient Asia, 9/5: 1-14. DOI: https://doi.org/10.5334/aa.164
Kuiper, Franciscus Bernardus Jacobus. 1948. Proto-Munda words in Sanskrit. [Verhandeling der Koninklijke
Nederlandse Akademie van Wetenschappen, Afd. Letterkunde, Nieuwe Reeks, Deel LI, No. 3.] N.V. Noord-
Hollandsche Uitgevers Maatschappij.
Kulke, Hermann & Dietmar Rothermund. 1991. A History of India. Calcutta/Allahabad/Bombay/Delhi: Rupa.
Kumar, Vikrant, Arimanda N.S. Reddy, Jagedeesh P. Babu, Tipirisetti N. Rao, Banrida T. Langstieh, Kumarasamy
Thangaraj, Alla G. Reddy, Lalji Singh & Battini M. Reddy. 2007. Y-chromosome evidence suggests a common
paternal heritage of Austro-Asiatic populations. BMC Evolutionary Biology, 7/47. DOI: 10.1186/1471-2148-7-47.
Metspalu, Mait, Mayukh Mondal & Gyaneshwer Chaubey. 2018. The genetic makings of South Asia. Current Opinion
in Genetics & Development, 53: 128-133.
https://doi.org/10.1016/j.gde.2018.09.003
Narasimhan, Vagheesh M., Nick Patterson, Priya Moorjani, et al. 2019. The Formation of human populations in South
and Central Asia. Science, Vol. 365, Issue 6457, eaat7487: 1-15. DOI: 10.1126/science.aat7487
Nichols, Johanna. 1992. Linguistic Diversity in Space and Time. Chicago: Chicago University Press.
Nichols, Johanna. 1997. Modeling Ancient Population Structures and Movement in Linguistics. Annual Review of
Anthropology 26. 359–384.
Paudyal, Netra Prasad and John Peterson. 2021. How one language became four: The impact of different contact-
scenarios between “Sadani” and the tribal languages of Jharkhand. In J. Ivani & J. Peterson (eds.). Special issue
of Journal of South Asian Languages and Linguistics. https://doi.org/10.1515/jsall-2021-2028
Peterson, John. 2017. “Fitting the pieces together. Towards a linguistic prehistory of eastern-central South Asia (and
beyond).” Journal of South Asian Languages and Linguistics, 4/2: 211-257.
https://doi.org/10.1515/jsall-2017-0008
Peterson, John. 2018. “Towards a linguistic prehistory of eastern-central South Asia (and beyond).” Keynote speech
at the 39th Conference of the Linguistic Society of Nepal. Tribhuvan University, Kathmandu, Nepal. November
29, 2018.
Peterson, John. in press. A sociolinguistic-typological approach to the linguistic prehistory of South Asia - Two case
studies. Language Dynamics and Change.
Peterson, John. forthcoming. Mountains, plains, rivers and social (in)equality - the linguistic perspective.
Rau, Felix and Paul Sidwell. 2019. The Munda maritime hypothesis. Journal of the Southeast Asian Linguistics Society
JSEALS 12:2: 35-57. http://hdl.handle.net/10524/52454
Reich, David, Kumarasamy Thangaraj, Nick Patterson, Alkes L. Price & Lalji Singh. 2009. Reconstructing Indian
population history. Nature 461. 489–495.
Risley, H.H. The tribes and castes of Bengal. Two volumes. Calcutta: Bengal Secretariat Press. [Reprint: Calcutta: P.
Mukerjee].
Shinde, Vasant, Vagheesh M. Narasimhan, Nadin Rohland, Swapan Mallick, Matthew Mah, Mark Lipson, Nathan
Nakatsuka, Nicole Adamski, Nasreen Broomandkhoshbacht, Matthew Ferry, et al. 2019. An ancient Harappan
genome lacks ancestry from steppe pastoralists or Iranian farmers. Cell 179(3). 729–735.
Sidwell, Paul. 2015. Austroasiatic Classification. In Jenny & Sidwell (eds.), 144-220.
Sidwell, Paul & Felix Rau. 2015. Austroasiatic comparative-historical reconstruction: An overview. In Jenny &
Sidwell (eds.), 221-363.
Stroński, Krzysztof & Saartje Verbeke. 2020. Shaping modern Indo-Aryan isoglosses. Poznań Studies in
Contemporary Linguistics 56/3: 529-552.
Trudgill, Peter. 2011. Sociolinguistic typology. Social determinants of linguistic complexity. Oxford: Oxford
University Press.
Wikipedia. Geography of India. https://en.wikipedia.org/wiki/Geography_of_India
Wikipedia. Substratum in the Vedic language:
https://en.wikipedia.org/wiki/Substratum_in_Vedic_Sanskrit#cite_note-35
Witzel, Michael. 1999. Substrate languages in Old Indo-Aryan (Ṛgvedic, Middle and Late Vedic). Electronic Journal
of Vedic Studies. 5(1). 1–67.
Zide, Arlene R.K. & Norman H. Zide. 1976. Proto-Munda cultural vocabulary: evidence for early agriculture.
Austroasiatic Studies 2, edited by Philip N. Jenner, Laurence C. Thompson and Stanley Starosta. [Oceanic
Linguistics Special Publications, 13]. Honolulu: University Press of Hawaii. 1295-1334.
Zide, Norman H. 1969. Munda and non-Munda Austroasiatic languages. In Thomas Sebeok (ed.), Current Trends in
Linguistics 5, 411–430. The Hague: Mouton.