Wednesday, April 17, 2019

Ejective and Implosive Theories of Proto-Indo-European Consonants

I’ve finally gotten around to reading Allan Bomhard’s 2016 paper on the ejective theory of PIE obstruents, which offers a renewed defence of the idea that the stops usually reconstructed as plain and voiced, like *d, actually go back to ejective stops, *t'. Since the subtitle of his paper is 'reigniting the dialog', I figured writing out my thoughts wouldn't hurt. Looking at the state of the discussion these days, especially in light of the many variants of ejective and glottalic theories that have piled up over the years, it seems to me that there are a few points that are worth restating (or at the very least writing through for my own benefit), especially on the phonetics and phonologies involved. None of what I have to say is really new, and there are important works by Barrack, Haider, Salmons, Weiss, and others that have definitely gone into this post (not to mention a number of grammars of particular languages with ejective and implosive consonants that I've looked at over the course of thinking about this problem), even though I didn't find a convenient point to mention them specifically. Maybe if I have the time and energy in the future I'll expand this and give it better references. I've divided this post into seven points, roughly leading into one another, but dealing with different aspects of phonology and reconstruction.

1) One of the more popular variants of ‘glottalic’ theory is the Leiden school’s approach (Kloekhorst 2016, pp. 232-235 gives a recent summary with extensive references). This is usually stated in phonetic, not phonological terms, as involving a ‘Proto-Indo-Anatolian’ series of ‘preglottalized stops’, phonetically voiceless and short: *[ˀt] (using the dentals for illustration). This should probably be classed phonologically as a type of ejective, /tʼ/. Assuming a bifurcated family tree, with Anatolian on one side and the rest of IE on the other, this approach posits a change to phonetically voiced preglottalized stops, *[ˀd]. Phonologically, such a stop would seem to belong to a category of consonants including implosives and creaky/laryngealized stops. Phonetically there is a continuum of realizations here, but phonologically there is, to my knowledge, never a contrast within this spectrum (see, among others, Lombardi 1991; Clements and Osu 2002; Ladefoged and Maddieson 1996, p. 53; Kehrein and Golston 2004, p. 330; for a classic phonetician’s discussion, see Ladefoged 1968, pp. 16-17). For PIE, I will refer to this group of sounds all as implosives as a cover term of convenience, and notate them as /ɗ/, without implying anything in particular in terms of either phonetic realization or featural structure (though 'voice' plus 'constricted' or 'glottalized' would be standard phonological characterizations). I think it’s important to think about this from a phonological perspective, because the putative evidence for preglottalization (all of which, it should be said, is disputed) comes largely from syllable codas, a position in which a preglottal allophone of an implosive might be expected.
The main point here is that even if we accept the Leiden position, we probably should not make a habit of talking about ‘preglottalized’ stops in general, as if this were their phonemic specification or universal realization, since that may well involve an inappropriate generalization of one, possibly marked allophone to all positions.

2) Is there much serious defence today of ejectives specifically for the immediate ancestor of the majority of the IE languages, the way the original proposals of the 'glottalic' model envisioned (see, e.g., Gamkrelidze & Ivanov 1995, p. 45ff.)? Details vary, but most discussions now posit a change of earlier ejectives to something else, something voiced, as a common innovation to the precursor to many, most, or all IE languages before they fully diverged as separate branches. This is the argument of Bomhard (who sees this in 'Disintegrating PIE') and the Leiden school (who see it as a major innovation of non-Anatolian IE). Kümmel’s important 2012 paper proposes that ‘early PIE’ had implosive (phonetically, non-obstruent) stops. The popularity of this position is easy to understand. Studies like Fallon's work on ejectives have, even while ostensibly arguing for the diachronic possibility of voicing ejectives, demonstrated rather effectively that this change is not nearly common enough that we should comfortably assume that a change of */t’/ to *[d] took place separately in six of the ten known major branches of Indo-European. Hence the desire among those envisioning original ejectives to suppose a shift at least to some intermediate stage like */t’/ to */ɗ/ (however this was realized phonetically), if not already to */d/, as a shared innovation. (Since shared innovations can spread across diversifying dialects, the relevance of this idea to subgrouping and the phylogenetic structure of the IE family is limited.) While the details differ (and not just in superficial ways), some sort of (partial?) shared shift would seem to be a part of most ‘glottalic’ approaches today, and should probably now be assumed as a standard part of any ‘glottalic theory’ unless specified otherwise.

3) On a methodological note, rather than a phonological one, this shared innovation is important, since one of the typological arguments for ejectives, the rarity of ‘*b’, only works for ejectives proper (insofar as it is compelling; this argument is perhaps not quite as strong typologically as it’s sometimes made out to be), not for implosives (implosives favour labial articulations). Among other things, we should probably recognize that the points often cited together as evidence for a single ‘glottalic’ theory are in fact often pointing to different layers in a proposed sequence of development. Many of the Leiden school’s arguments for non-Anatolian PIE *[ˀd] work on a different level from the typological arguments for PIA *[ˀt] (which draw on cross-linguistic comparisons with ejectives, implying phonological */t’/). This is not an objection to these sorts of approaches, but methodologically I think it’s very important to recognize the chronological mismatch between these lines of argumentation.

4) The reconstruction of */ɗ/ widely in IE also has a further consequence for reconstruction: the reconstruction of */t’/ at any stage, rather than just /ɗ/ all the way back, now depends very crucially on one’s view of the importance of typological arguments specifically in explaining the rarity/lack of ‘*b’. As has been noted many, many times before, this gap could have other explanations, such as a late pre-PIE change altering */b/ or */ɓ/ to something else, like *w and/or *m: one could suggest that in pre-PIE */ɓ/ was very common (in keeping with typological norms), and its absence is simply a quirk of sound change. If I’ve read him correctly, Kümmel in fact does not suggest an earlier */t’/ stage (though I don’t think his arguments exclude such a thing either). This is part of why some have, quite rightly, suggested abandoning the term 'glottalic' theory entirely, and instead always being explicit: do we mean an ejective theory, an implosive theory, an ejective-to-implosive theory, or something else entirely? This kind of precision might be cumbersome, but the term 'glottalic' probably now evokes too many possible options to be all that useful.

5) The position of Germanic and Armenian as ‘relic areas’ should probably just be abandoned at this point. Under no reasonable view of the patterns of innovation within IE are these outlying dialects, and it hardly seems plausible to argue that they’d escape any major change to an old ejective series. It’s having our cake and eating it too, if we propose that there was a shared innovation in order to get around the problem of general voicing in 'Inner' PIE (an implosive or voicing theory), while also invoking a supposedly simpler version of the Germanic and Armenian consonant shifts as evidence for an ejective theory of PIE. In any case, in terms of explanatory power, an ejective model really offers little compared to traditional accounts, which can explain these consonant shifts just fine.

6) Back to phonological typology, the most significant typological issue with the traditional reconstruction is and remains not the 'plain voiced' series, but the murmured ('breathy voice', 'voiced aspirated') stops like *dʰ, which I'm writing /d̤/ (/dʱ/ is an equivalent notational convention, as far as phonology is concerned; actually the traditional notation /dʰ/ is not really problematic, despite its absence from the IPA, since the relevant phonological feature, now often described as 'spread', is the same as in voiceless aspirates, /tʰ/). It still looks like there’s little consensus on how to handle these between different ‘glottalic’ models. Kümmel relates the development of murmur specifically with the shift of implosives to plain voiced stops, in a kind of push-shift. Bomhard also sees murmur as arising in ‘disintegrating’ IE, and ancestral at least to Greek, Armenian, Italic, and Indo-Iranian. No matter what we do, we’re going to run into the same essential conflict that’s been at the heart of this debate for half a century now: do we avoid reconstructing */d̤/ (at the cost of having to posit some unusual sound changes in many particular branches), or do we reconstruct */d̤/, and just accept that some stage of (post-)PIE had a typologically unusual system? Postponing the question to ‘late PIE’ doesn’t really make this tension go away, except insofar as a brief transitional system might perhaps be more tolerable than an age-old one. I'm not sure about other people, but for me it's the idea of abandoning wholly the murmured series that's the real sticking point with things like the Leiden model, rather than the 'glottalic' part of the model as such.

7) The whole 'problem' with the murmured stops arose because of the ‘removal’ of voiceless aspirates from the PIE phonemic system. Actually it's occasionally been suggested that the Neogrammarians were right, and voiceless aspirates like */tʰ/ were part of the PIE phoneme inventory. This might not end the debate about ejective or implosive consonants entirely, but it would put the whole question on a very different typological footing: in particular, it would make the assumption of the murmured series unproblematic. But most voiceless aspirates are so obviously secondary that there’s a natural reluctance to see them as phonemic. Maybe (this is a point I first heard made by Mark Hale) the more important question is: does this matter? The significance of the ‘phoneme’ depends a lot on precisely what theoretical phonological paradigm we’re using, and it’s a potentially defensible position to maintain that significant surface allophones are what we should really care about here. If we have sequences like */th₂e/ -> *[tʰa] as a real part of the phonological (not phonemic) structure of the language, then we might not face nearly as compelling a typological objection to 'voiced aspirates'. How to formalize this will vary. One possibility is that the assimilation process relies on adding the feature +SPREAD to the voiceless stop, increasing the phonological salience of this feature, allowing its use in the underlying specification of voiced segments. The surface space would certainly be fleshed out with *[t, d, tʰ, dʱ=d̤]. Such sequences might not have been frequent, but they probably did not need to be: 'voiced aspirates' are much more common than voiceless ones in Sanskrit, for instance. Not every phonologist will like that kind of unbalanced feature specification, but it’s a potential criticism of phonological theory in general that perhaps too much attention has been paid to economy and symmetry at the underlying level, and not enough to surface or mid-level structures.
Of course, even if one does accept this idea, its exact bearing on the ‘glottalic’ question is not straightforward. There's also a question of chronology. The clearest evidence for *[tʰ] allophones comes from Greek and Indo-Iranian, which are also two of the branches where there’s the greatest need for murmur, and which are often thought to be in some sort of inner-IE dialect area (not subgroup). Perhaps the rise of voiceless aspirates in an inner area could be a part of a dialectal rise of ‘voiced aspirates’.
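
For what it's worth, the typological logic of this last point can be written out as a toy check. This is only a sketch: the two binary features, the function name, and the rule itself (a voiced [spread] stop is only typologically comfortable alongside a voiceless [spread] stop) are my simplifying assumptions for illustration, not anyone's formal analysis.

```python
# Toy model: each stop is described by two binary laryngeal features.
# This two-feature table is an illustrative simplification, not a full analysis.
FEATURES = {
    "t":  {"voice": False, "spread": False},
    "d":  {"voice": True,  "spread": False},
    "tʰ": {"voice": False, "spread": True},
    "dʱ": {"voice": True,  "spread": True},
}


def has_unsupported_murmur(inventory):
    """Return True if the inventory has a voiced [spread] stop
    but no voiceless [spread] stop to keep it company."""
    voiced_spread = any(
        FEATURES[s]["voice"] and FEATURES[s]["spread"] for s in inventory
    )
    voiceless_spread = any(
        not FEATURES[s]["voice"] and FEATURES[s]["spread"] for s in inventory
    )
    return voiced_spread and not voiceless_spread


# The traditional three-series phonemic inventory (dentals for illustration).
traditional_phonemes = {"t", "d", "dʱ"}

# The same inventory with the allophone [tʰ] (from */th₂/) counted in.
surface_stops = traditional_phonemes | {"tʰ"}

print(has_unsupported_murmur(traditional_phonemes))
print(has_unsupported_murmur(surface_stops))
```

The check flags the purely phonemic inventory but not the surface one, which is all the argument above needs: whether the objection bites depends on which level of representation we feed into it.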

As I said, none of these points are really new, and, this being a blog post, and other work staring at me accusingly as I take too long writing this already, I haven’t given all the references I should. There are also interesting suggestions, such as the possibility that murmur was not a segmental property, but a suprasegmental feature of roots, that I haven't touched on. These are just the main points that struck me as most in need of restating, given where literature on the subject seems to stand now.

Tuesday, March 26, 2019

The History of English in a Hundred Words

Well, I never did keep up with this blog the way I meant to. I mostly blame getting into the last year of my doctorate and having my thesis eat up all of my energies, and by the time that was over blogging wasn't really on my mind. It didn't help that I was mostly inspired by a specific congruence of publications about the Indo-European 'homeland' problem, and so didn't really go into it with any long-term goals or plans.

Now, a few years on, I'd like to return to writing some more semi-formal things, focusing on a series of posts on the history of the English language. Each post will take one word as a hook, and use it to talk about some topic related to the development of English, going in roughly chronological order from the origins of language up to the present. (This isn't an entirely original idea, though I actually wasn't consciously aware of David Crystal's The Story of English in 100 Words when I outlined this project; having looked at Crystal's book, I think I'm still in no danger of redundancy, since our approaches are very different.)

One of the things I'd like to do with this series is shake up the usual idea that the history of English should focus especially on the development of 'English' as a separate language, starting with Old English and only including earlier periods as a kind of preface or background. If all goes as planned, I'll get to Old English in due course, but not for a very long time: roughly speaking, Old English might be said to begin some 1500 years ago, and there's a good 3500 years or more before that of linguistic prehistory that we can talk about in some substantive way.

In any case, this will certainly be a longer project, and I'm hoping to post on a semi-regular basis. I've created a new platform so the series can have a dedicated home, uninterrupted by any unrelated content I may choose to post (not that I have any concrete plans for other posts at the moment, but I don't want to nail down the coffin lid here) -- but I'll also copy the posts here.

Friday, June 26, 2015

Indo-European Origins

2015 isn't even half over, but we've already seen a flood of high-profile papers and books weighing in on the question of the 'Indo-European homeland'. It will probably take a while for everything to sink in and get properly digested, but here are my preliminary reactions to what I've read so far.

For those who don't spend their spare time thinking about Avestan verbal conjugations, the basic issue is about where a particular ancient language called Proto-Indo-European (usually abbreviated to PIE) was spoken. We have no direct records of this language, but reconstruct it on the basis of a number of ancient and modern languages that seem to have developed from it. This is sort of like how French, Spanish, Italian, and the other Romance languages all developed out of Latin, only in this case it's as if Latin were never written down so that we have to figure out what it was like by comparing the later languages. Some of the more familiar early Indo-European languages include Greek, Latin, and Sanskrit, but the family is very large and includes languages as diverse as English, Lithuanian, Kurdish, and Albanian. People have been working on reconstructing PIE for a couple of centuries now, so that the method of reconstruction is pretty advanced and we really know quite a lot about the language (though there are some very important points of debate too).

(Note that PIE isn't some sort of primordial language. Humans have been speaking for tens of thousands of years, and PIE is very far removed from the first human language(s). It's also not the only reconstructed proto-language in Eurasia. We also have Proto-Afro-Asiatic (including Proto-Semitic, the ancestor of Babylonian, Hebrew, Arabic, and others), Proto-Uralic (whose most famous descendants are Finnish and Hungarian), Proto-Kartvelian (from which a number of languages in the Caucasus come), and others -- and of course there are more yet around the rest of the world. But PIE is a particularly interesting proto-language, since its descendants include a large number of interesting ancient and modern languages, and its study can shed light on the prehistory of Europe and other regions not available from any other source.)

Naturally, people have often been curious about when and where PIE was spoken, and a lot of possibilities have been brought up over the years. There isn't really an obvious answer, since even the oldest Indo-European languages are found across an enormous area: from Greek in Greece to Sanskrit in India to Tocharian in Western China (!), to the Baltic, Germanic, and Celtic languages across northern Europe. This breadth not only makes it hard to figure out where the original language came from, but raises the question of just how these languages got spread so far and wide. Nowadays only two 'homelands' really receive much attention, the 'Anatolian' and 'Steppe' hypotheses. They differ on not only where PIE was spoken, but also when, and how it spread. Here they are in brief:

1) The Anatolian origin places PIE in Anatolia (Turkey, basically) at a very early date, around 7000-6000 BC or so. Around this time, farming was spreading from this area into Europe (which had only been inhabited by hunter-gatherers until this point), and the idea is that as early farmers settled in slow waves from Anatolia, they spread their language as well as their way of life into these areas. This idea was first laid out by the archaeologist Colin Renfrew in his 1987 book Archaeology and Language: The Puzzle of the Indo-Europeans, which remains an interesting read.

2) The Steppe hypothesis looks to the Eurasian steppe, especially in the region that's now Ukraine, around 4000-3000 BC. This is over 1000 miles away and on the other side of the Black Sea from Anatolia, as well as several millennia later. The Steppe hypothesis sees the spread of the IE languages as a somewhat more complicated process, but points especially to the invention of wheeled vehicles in this area at this time. The notion is that PIE culture was mobile and competitive, and for various cultural and economic reasons was well-placed to spread rapidly -- partly through the actual migration of people, but also partly by assimilating other people into their lifeways (including language). The modern form of this hypothesis goes back to Marija Gimbutas in the 1950s, but it's been developed by a whole host of people since.

Both hypotheses have been argued at length for a number of years now, but something new has been added to most of the main points during the past few months.

Statistical Dating of Proto-Languages

One of the first big items of the year was the publication by a team from UC Berkeley of the rather technically titled article Ancestry Constrained Phylogenetic Analysis Supports the Indo-European Steppe Hypothesis. The topic of this paper is an approach that goes back to 2003, when a couple of New Zealand biologists tried to apply statistical methods from evolutionary biology to date when PIE began to split up into its various descendant languages. The New Zealanders had found that PIE was very old (7800-5800 BC), a date that was more in keeping with the Anatolian hypothesis than the Steppe. There was a major follow-up in 2012, which claimed to find the same thing. Now the Berkeley team has responded on the other side of the debate, claiming that if you use a better methodology, the dates actually end up being much more recent, in keeping with the Steppe rather than the Anatolian hypothesis.

The basic method behind all of these studies is to look at the replacement of basic vocabulary words. The idea is that words for things get replaced from time to time -- the Old English word wamb*, for instance, has been replaced in modern English by belly, though other basic words like hand and night have stuck around (though sometimes they've undergone changes of pronunciation). We know that the rate at which this sort of replacement happens can be extremely variable, but this approach rests on the (questionable) notion that a sufficiently advanced statistical model can still get at least some idea of how long ago related languages split, based on how much their basic vocabulary has diverged.

*[Edit: I should probably specify that wamb is still around in English, in the specialized meaning 'womb'. But it's been replaced as a part of the basic vocabulary by belly, which is all that this sort of lexical statistical model cares about.]
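
To make the basic logic concrete, here is a toy sketch of the arithmetic behind dating a split from vocabulary replacement. To be clear about what this is: it's the old constant-rate glottochronology formula, not the Bayesian phylogenetic models used by the New Zealand or Berkeley teams, which are built precisely to allow the rate to vary. The function name is mine, and the default retention rate of 0.86 per millennium (the figure traditionally quoted for a 100-word list) is an assumption for illustration only.

```python
import math


def divergence_time_millennia(shared_fraction, retention_rate=0.86):
    """Estimate how many millennia ago two related languages split.

    shared_fraction: proportion of basic-vocabulary items still cognate
        in both languages (between 0 and 1, excluding 0).
    retention_rate: assumed share of the list a single language keeps
        per millennium (an illustrative constant, not a fact of nature).
    """
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in the interval (0, 1]")
    if not 0 < retention_rate < 1:
        raise ValueError("retention_rate must be strictly between 0 and 1")
    # Both languages lose words independently, hence the factor of 2.
    return math.log(shared_fraction) / (2 * math.log(retention_rate))


# Example: two languages sharing half of their basic word list.
estimate = divergence_time_millennia(0.5)
print(f"Estimated split: {estimate:.1f} millennia ago")
```

The result is only as good as the assumed rate, which is exactly the problem: nudge the retention rate a little and the date moves by centuries. The newer statistical models are attempts to get around this, and how far they succeed is the real question.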

The New Zealanders have been arguing that the early datings by these statistical models provide substantial support for the Anatolian theory, but a lot of people have criticized all aspects of their study. This new paper by the Berkeley team is the first major attempt to actually use their own methods to obtain a different result. The key difference is in the 'ancestry constrained' part of the title. Basically, the New Zealanders had just fed data from a bunch of Indo-European languages in, and let the computer figure out how they were related -- it didn't say anything, for instance, about Latin being an ancestor of Italian, and in fact their model doesn't have Latin as ancestral to the Romance languages. It puts it more as an aunt or uncle. The Berkeley folks figured they'd try putting in these 'ancestry constraints', and tell the computer that the Romance languages come from Latin (and that Modern Irish comes from Old Irish, etc.). When they did this, the age of every part of the family, including PIE itself, came out as more recent.

This is probably because the New Zealand model was having to produce more prehistoric language stages. The Berkeley model has the Romance languages coming from Latin, which in turn comes from Proto-Italic, and that from PIE. The New Zealand approach reconstructs a Proto-Romance, which is significantly different from Latin, and so needs a sort of Proto-Latino-Romance stage that's older than Latin, which means the age of Proto-Italic also gets pushed older, etc. When this happens all over the family tree, the average age of PIE can ultimately be pushed back quite significantly. By eliminating this effect (which they call 'jogging'), and making a few other technical changes, the Berkeley team got dates ranging from 5100-2800 BC. This encompasses all of the Steppe dates, but is too late for the Anatolian origin -- the farmers had already left Turkey.

Like most linguists, I'm pretty sceptical about all this, whatever the conclusions. Lexical data is among the least reliable in language change, for a variety of reasons, and it's hard to see how the best statistical model in the world could get useful results from nearly useless data. The Berkeley team claims that their study shows that statistical models can be useful, and that theirs actively supports the Steppe hypothesis. My own feelings are a bit different: I'd suggest that this study basically makes these statistical models irrelevant. Proponents of either origin can now point to or disregard statistical studies as they wish. Since most linguists have been happily disregarding them already, they'll probably just keep on doing so. The actual evidence comes from other sources.

Traditional Arguments Restated

My favourite paper on all this to appear this year is a well-written piece by archaeologist David Anthony and linguist Don Ringe, titled The Indo-European Homeland from Linguistic and Archaeological Evidence. This paper tries to present the strongest possible arguments in favour of the steppe hypothesis, and it's hard to think of any two authors better qualified to do so.

David Anthony is not new to this area of study. His 2007 book The Horse, the Wheel, and Language is basically an extended archaeological argument for the Steppe origin, and is probably the most up-to-date thing of its kind. Building off of earlier work like Jim Mallory's In Search of the Indo-Europeans and ultimately Marija Gimbutas's idea of the 'Kurgan hypothesis', Anthony has tried hard to put together a coherent picture of what the Steppe origin might have looked like in detail.

This paper says little that's fundamentally new, but it tries to state the traditional arguments in the most rigorous way possible. This is helped by the presence of Don Ringe, an Indo-European linguist who has the expertise to make the linguistic case in a coherent way.

The biggest emphasis in this paper is an approach that's sometimes called Wörter und Sachen (German for 'words and things'), or else linguistic palaeontology. The idea is that if we can securely reconstruct a particular word with a particular meaning for a proto-language, the implication is that the speakers of that proto-language had or knew that thing.

Older versions of the homeland debate have often focused on flora and fauna, since a word for a rarer animal or plant might help pinpoint where the PIE speakers were. This approach hasn't worked very well, mainly because most of the reconstructible words of this sort are for fairly widespread things, like beavers and wolves. This makes sense: any really specific word would have been lost or changed meaning in most branches of Indo-European, once the speakers left the area that plant or animal lived in -- this would make it really hard to recover the original meaning.

In the current debate, the focus has been not on the natural world, but on technology, particularly (but not exclusively) wheels. This paper points out that while a word for 'wheel' is not reconstructible for PIE as such, it can be reconstructed for the next best thing. In Indo-European linguistics, the various descendant languages are grouped into ten sub-families or branches, and most people assume that the first of these families to split off and go its own way was the Anatolian family (not to be confused with the Anatolian hypothesis!). Ringe uses the term NPIE, 'nuclear Proto-Indo-European' for all the other IE languages, which continued to develop after the departure of Anatolian, and shows that words for 'wheel' (and various related words for wheeled vehicles) can be reconstructed for NPIE.

Wheels are interesting because they are relatively late, archaeologically speaking, only showing up around 4000 BC at the earliest. This seems to show pretty conclusively that NPIE was still around at this date, long after the first farmers had dispersed across Europe.

There are quite a few people who doubt that the wheel vocabulary is significant. People have argued that these words could have been borrowed around later, or been invented independently, or been old words for different things that shifted meaning. This paper does a particularly good job of addressing these alternatives, and spelling out the highly unlikely assumptions required in each case.

Anthony and Ringe also look at other parts of the (N)PIE vocabulary, focusing on words for feasting, the celebration of glory and martial prowess, leaders and followers, guest-host relationships, and the like. They paint a picture of (N)PIE society as interestingly fluid, based around competing chiefs who accumulated followers and maintained long-distance networks. The society was fairly mobile, based around stock-breeding and warfare, and tied together by people visiting and staying with each other over long distances, feasting with each other, and sharing a common culture of religious ritual and poems praising successful chiefs and heroes. These things encouraged spread of the cultural system, the recruitment of new potential followers and allies, and the use of the prestigious language of these networks (if your success is partly dependent on maintaining your reputation in a particular poetic tradition, that at least encourages the further use of the language used for that poetry).

It's hard to know just how precise we can get with these sorts of cultural reconstructions, but their conclusions are plausible and really pretty restrained. They also provide a good model for how the spread of the IE languages might have worked.

They also touch on various other topics, such as contact between PIE and the Uralic languages, and how certain archaeological events might be related to language history. If you want a single, pretty concise and well-written overview of the traditional Steppe arguments, you really hardly need look further than this one piece.

The Indo-European Controversy

It was neat to see another major book on the subject appear in April, Asya Pereltsvaig and Martin Lewis's The Indo-European Controversy. It's expensive, but looks very interesting. I haven't had a chance to read it yet, but judging from the description, and from online contributions by the authors, it's pretty clear that it focuses at least in large part on dismantling the New Zealand team's computational models, and on the proper methodology of developing a theory that links linguistic reconstruction with archaeological fact.

Genetic Studies in Nature

The most recent major event in the debate comes not from archaeology, linguistics, or computational methods, but from genetics. Two pieces have just appeared in Nature arguing that the genetic population of Europe was significantly influenced by genes from the steppes (i.e. that there was a significant migration from the steppes into parts of northern Europe) sometime in the period before 3000 BC.

At first glance, this looks like pretty straightforward support for the Steppe hypothesis. People are moving in precisely the right places at the right time, from the steppe into Europe around the 4th millennium BC. Nonetheless, it's harder to prove that the language these people brought with them was PIE (or even that a single language was involved, or really what sort of patterns of linguistic shift we might expect all around). It certainly fits a lot better with the Steppe origin than the Anatolian one, though.

This is assuming that the genetic side of the studies is rigorous and reliable, which is something I don't have enough of a background in to comment on.

Final Thoughts

All in all, this has not been a good year for the Anatolian hypothesis. It's never really received much support from linguists, and has chiefly been promoted by some archaeologists (Renfrew) and evolutionary biologists (the New Zealand team of Atkinson, Gray, and others). Personally, I feel it mainly rests not so much on any real evidence, but on a feeling that for the IE languages to have spread so far and wide, we need to look for some single Big Event. The development of agrarian farming and its spread across Europe certainly provides such an event, at least for the Western IE languages, but otherwise has little to recommend it.

In fact, it really has quite a lot going against it. There's the issue of wheels, of course, and other pieces of technology associated with the 'secondary products revolution' -- the use of milk, wool, and other animal products, which only came about millennia after the initial spread of purely agrarian (crop-based) farming. Beyond this, there's also the problem that the Indo-European languages seem to be relative latecomers in places the Anatolian hypothesis predicts they should have arrived early. 

In Europe, for instance, a number of western Indo-European languages seem to have borrowed words for things like beans and peas (Guus Kroonen has done excellent recent work in gathering the evidence for this 'agricultural substrate' language or languages). The same sound correspondences that prove that the wheel words can't be later borrowings prove that these agricultural words were borrowed independently from some non-IE language into Latin, Germanic, Celtic, Baltic, and Slavic, in the process adjusting in different ways to fit into each of these already separated languages. The implication is that there were already agrarians living in Europe and speaking their own languages when the Indo-European tongues arrived. This is hardly consistent with the hypothesis that Indo-European was brought into Europe by the first wave of agriculturalists.

Beyond this, we know there were quite a few non-IE languages scattered across parts of Europe, even well into the Iron Age, which would be odd if Indo-European had so thoroughly carpeted Europe in the Stone Age. Basque is the only living remnant of these non-IE languages, but we have ample records from across the Mediterranean of languages like Etruscan (the language of an important agrarian civilization), Tartessian, and whatever's written in the still-undeciphered Linear A script (which is highly unlikely to be anything IE).* Europe seems to have been pretty linguistically diverse for a very long time, nowhere near as Indo-Europeanized as the Anatolian hypothesis might predict (given that the whole point of the theory is to explain how Europe could have gotten so thoroughly Indo-Europeanized).

*[Edit: Don Ringe has a nice discussion of all of this in a guest post for Language Log.]

I think there's enough evidence from Wörter und Sachen, borrowings, non-IE languages of ancient Europe, and the distribution of the Indo-European languages, along with enough doubt about the underpinnings of the theory (in its need for a single flashy linguistic vector and earlier lexicostatistical studies) to say confidently that the Anatolian hypothesis is not viable -- not just unproven, but highly unlikely to be correct.

Whether this means the Steppe origin is right is another matter. One of the problems with a polarizing debate is that evidence against one theory can be taken as positive support for another. I personally like the Steppe hypothesis a lot. It seems to put PIE in the right time period, and the place is plausible. There are archaeological connections between the steppes and several important places that the IE languages ended up, and the method of linguistic transmission strikes me as well thought out.

Still, I'm not quite sure we ought to call the Steppe hypothesis 'proven'. It's certainly plausible, and the way that it ties together a wide range of evidence quite neatly gives it some real support. But it's really hard to link an unrecorded language with material culture, and there's got to be some room for doubt. I'm not quite sure that, barring the invention of time machines, we should ever let ourselves make the last leap from 'PIE was probably spoken on the steppes' to 'PIE was spoken there'. But however it goes, it's been an interesting half-year so far, and I'm very much looking forwards to seeing what else is waiting down the road.