Sunday, January 23, 2022

When did the Western Steppe Herder genetic profile form?

A few days ago a rather exciting pre-print of an upcoming article by Manjusha Chintalapati, Nick Patterson, and Priya Moorjani was published online. Note that the pre-print has not gone through peer review yet, and the final version may differ.

Reconstructing the spatiotemporal patterns of admixture during the European Holocene using a novel genomic dating method

Manjusha Chintalapati, Nick Patterson, Priya Moorjani



Recent studies have shown that gene flow or admixture has been pervasive throughout human history. While several methods exist for dating admixture in contemporary populations, they are not suitable for sparse, low coverage data available from ancient specimens. To overcome this limitation, we developed DATES that leverages ancestry covariance patterns across the genome of a single individual to infer the timing of admixture. By performing simulations, we show that DATES provides reliable results under a range of demographic scenarios and outperforms available methods for ancient DNA applications. We apply DATES to ~1,100 ancient genomes to reconstruct gene flow events during the European Holocene. Present-day Europeans derive ancestry from three distinct groups, local Mesolithic hunter-gatherers, Anatolian farmers, and Yamnaya Steppe pastoralists. These ancestral groups were themselves admixed. By studying the formation of Anatolian farmers, we infer that the gene flow related to Iranian Neolithic farmers occurred no later than 9,600 BCE, predating agriculture in Anatolia. We estimate the early Steppe pastoralist groups genetically formed more than a millennium before the start of steppe pastoralism, providing new insights about the history of proto-Yamnaya cultures and the origin of Indo-European languages. Using ancient genomes across sixteen regions in Europe, we provide a detailed chronology of the Neolithization across Europe that occurred from ~6,400–4,300 BCE. This movement was followed by a rapid spread of steppe ancestry from ~3,200–2,500 BCE. Our analyses highlight the power of genomic dating methods to elucidate the legacy of human migrations, providing insights complementary to archaeological and linguistic evidence.

Sounds fascinating, right? If you haven't read the article yet, I'd suggest you do that first.

The focus will be on this part in particular, as it has rather major consequences for the debate of Indo-European languages:

To understand the timing of the formation of the early Steppe pastoralist-related groups, we applied DATES using pooled EHG and pooled Iranian Neolithic farmers. Focusing on the groups with the largest sample sizes, Yamnaya Samara (n=10) and Afanasievo (n=19), we inferred the admixture occurred between 40–45 generations before the individuals lived, translating to an admixture timing of ~4,100 BCE (Table S6.1). We obtained qualitatively similar dates across four Yamnaya and one Afanasievo groups, consistent with the findings that these groups descend from a recent common ancestor (for Ozera samples from Ukraine, the dates were not significant). This is also further supported by the insight that the genetic differentiation across early Steppe pastoralist groups is very low (FST ~ 0.000-0.006) (Table S6.2). Thus, we combined all early Steppe pastoralist individuals in one group to obtain a more precise estimate for the genetic formation of proto-Yamnaya of ~4,400 to 4,000 BCE (Figure 2). These dates are noteworthy as they pre-date the archaeological evidence by more than a millennium (37) and have important implications for understanding the origin of proto-Pontic Caspian cultures and their spread to Europe and South Asia.

These results were rather surprising to me, because a good friend of mine looked into this topic a while back and his results did not match the ones presented here. Furthermore, Yamnaya forming as a two-way mixture of EHG and Iran_N is inconsistent with the findings of Wang 2019, which concluded that Yamnaya samples had a small amount of Early European Farmer (EEF) ancestry, and this EEF ancestry could affect the dates [1]. The use of only Iran_N as a reference is rather peculiar, as Gallego-Llorente 2016 has shown that steppe ancestry shares more affinity with CHG than with Iran_N [2]; however, given the few CHG samples available, it is understandable.

Anyhow, I'm going to give the stage to my mate Altvred, who some of you may know from the Anthrogenica forum:

The “Steppe” genetic profile, an approximately equal mix of Eastern European Hunter-Gatherers (EHGs) and Caucasus Hunter-Gatherers (CHGs), existed long before the 4400-4000 BC date mentioned in the preprint and before the Yamnaya themselves.

The earliest samples with such an autosomal profile are dated to 4994-4802 BC (PG2001) and are from Stavropol Krai, a region of Russia in the Northern Caucasus. 

  • PG2001 Progress-2 BZNK-113/4, kurgan 1, grave 37 tooth (molar) 2019 WangNatureCommunications2019 Direct: IntCal20 6850 46 4994-4802 calBCE (6012±28 BP, MAMS-110564) .. Russia_Steppe_Eneolithic Piedmont, Progress 2 Russia 43.822691 43.350278

  • PG2004 Progress-2 BZNK-062/3, kurgan 4, grave 9 petrous 2019 WangNatureCommunications2019 Direct: IntCal20 6088 60 4240-4047 calBCE (5304±25 BP, MAMS-11210) .. Russia_Steppe_Eneolithic Piedmont, Progress 2 Russia 43.822691 43.350278 .. H2

  • VJ1001 Vonjucka-1 BZNK-311/2, kurgan 1, grave 8 petrous 2019 WangNatureCommunications2019 Direct: IntCal20 6230 41 4337-4177 calBCE (5409±24 BP, MAMS-29823) .. Russia_Steppe_Eneolithic Piedmont, Vonjucka 1 Russia 44.019962 43.155538

“Steppe Eneolithic” can be modeled as a two-way mix of EHG and CHG, not too dissimilar from later Bronze Age Steppe pastoralists.


          scale 1.414 1.414 

            CHG 1.414 0.000 

            EHG 0.000 1.414 


best coefficients: 0.581 0.419 

DATES can be used to estimate a date for the admixture event between EHG and CHG that created Steppe Eneolithic. 

In this run, I’m using a combined set of Neolithic Iranian samples (Ganj_Dareh_N) and the two CHG individuals from Georgia (Kotias and Satsurblia) to represent the CHG-related source.

The result is approximately 3000 years before the time of the Steppe Eneolithic individuals.

jmean: 0.000 std. err: 97.857 jmean (years): 0.000 std. err(years): 2837.853

dates_expfit version: 200

step (Morgans) :: 0.001000

fitting 1 exponentials + affine

after initialization: 0.000001 0.907 

gslsetup called

gslans: 19 0.000001


error sd: 0.001026

halflife: 7.113 

mean (generations): 97.446 

    0.007478396 0.000112790 

##end of run
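For readers who want to check the arithmetic behind that "approximately 3000 years", here is a minimal Python sketch. It assumes 29 years per generation, which is what the output above implies (a standard error of 97.857 generations is reported as 2837.853 years). Averaging the midpoints of the three calibrated ranges listed earlier is my own simplification for illustration, not something DATES does, and the function names are hypothetical.

```python
# Assumption: 29 years per generation, inferred from the output above
# (std. err 97.857 generations is reported as 2837.853 years).
YEARS_PER_GENERATION = 29


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert an admixture age in generations to years before the samples lived."""
    return generations * years_per_generation


def admixture_date_bce(sample_date_bce, generations,
                       years_per_generation=YEARS_PER_GENERATION):
    """Calendar date (BCE) of admixture for samples that lived at sample_date_bce."""
    return sample_date_bce + generations_to_years(generations, years_per_generation)


# Midpoints of the calibrated ranges listed for PG2001, PG2004 and VJ1001.
# Averaging them is an illustrative simplification.
sample_midpoints_bce = [(4994 + 4802) / 2, (4240 + 4047) / 2, (4337 + 4177) / 2]
mean_sample_bce = sum(sample_midpoints_bce) / len(sample_midpoints_bce)

years_before = generations_to_years(97.446)
print(f"{years_before:.0f} years before the samples lived")
print(f"roughly {admixture_date_bce(mean_sample_bce, 97.446):.0f} BCE")
```

With the reported mean of 97.446 generations this gives about 2,826 years before the samples lived. The standard error in the output is large, so treat the resulting calendar date as indicative only.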

One major difference between the later Yamnaya and these Steppe Eneolithic samples is that the former carry some ancestry related to Neolithic European Farmers. 

If I had to hazard a guess, the 4400-4000 BC date mentioned in the preprint most likely represents the entry of EEF (Early European Farmer) ancestry into the Steppe gene pool rather than the original EHG/CHG admixture event.

Multiple admixture events can confound results given by DATES. I chose to use Steppe Eneolithic rather than the Yamnaya because the latter harbor Early European Farmer ancestry which may influence the results.

Confounding of admixture timing due to multiple admixture events was mentioned in the supplementary materials of “Formation of Human Populations in South and Central Asia” [3]:

We note that we currently do not model multiple admixture events in DATES and this in principle leads to confounding of the admixture timing. To mitigate the effects of the confounding, we focus on the most recent admixture event only, for which we have the highest power and minimal confounding. 

When talking about the formation of the Yamnaya genetic profile, the preprint's authors almost certainly mean the admixture of the two main components, EHG and CHG. They cannot, however, directly prove that the 5th-millennium date given by their analysis with DATES is the EHG/CHG admixture event, as a similar date can be reproduced when using EHG alongside Neolithic Barcin as the admixing populations.

Regardless, we know that people with "Steppe ancestry" existed at the very least 500 years before 4400 BC since the oldest Steppe_Eneolithic sample, an individual from Progress, is carbon-dated to around 4900 BC.

DATES results using EHG and Barcin (Anatolia_N) for Yamnaya:

qpAdm model of the Yamnaya as a mix of EHG, CHG, and Ukrainian Trypillia: 

left pops:






scale     1.732     1.732     1.732  

EHG     1.732     0.000     0.000 

CHG     0.000     1.732     0.000 

Ukraine_Eneolithic_Trypillia     0.000     0.000     1.732 


 best coefficients:     0.458     0.434     0.109 

 std. errors:     0.012     0.017     0.015

left pops: 







 scale     1.732     1.732     1.732 

 EHG     1.732     0.000     0.000 

 CHG     0.000     1.732     0.000 

 Ukraine_Eneolithic_Trypillia     0.000     0.000     1.732 


best coefficients:     0.422     0.407     0.171 

 std. errors:     0.020     0.029     0.027

An interesting thing to note is that the Yamnaya from Ukraine carry more EEF ancestry than those from Samara, an indication that Western Yamnaya had more interactions with neighboring farming cultures than Yamnaya groups living farther east.
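As a rough way to gauge how much weight that difference can bear, the two Trypillia-related coefficients can be compared using their standard errors. The sketch below is a plain z-score under the assumption that the two estimates have independent, roughly normal errors. It is not part of qpAdm. The target labels were lost from the output above, so treating the first run as Samara and the second as Ukraine is an assumption based on the sentence above.

```python
import math


def z_difference(est_a, se_a, est_b, se_b):
    """z-score for the difference of two estimates, assuming independent errors."""
    return (est_b - est_a) / math.sqrt(se_a ** 2 + se_b ** 2)


# Trypillia-related coefficient and standard error from the two qpAdm runs above.
# Assumption: first run = Yamnaya Samara, second run = Yamnaya Ukraine.
samara_eef, samara_se = 0.109, 0.015
ukraine_eef, ukraine_se = 0.171, 0.027

z = z_difference(samara_eef, samara_se, ukraine_eef, ukraine_se)
print(f"difference = {ukraine_eef - samara_eef:.3f}, z = {z:.2f}")
```

Under these assumptions the difference of about six percentage points comes out at roughly two standard errors, so it is suggestive rather than conclusive.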

Another reason why the 5th millennium BC doesn’t make sense for the formation of the Yamnaya genetic profile is that by that point there was no longer any “pure” CHG population left that could have plausibly mixed with Eastern Hunter-Gatherers and left the Yamnaya with as much as 45%+ CHG ancestry.

Individuals from the Chalcolithic Caucasian Darkveti-Meshoko culture, dated to 4500 BC, harbour around 40% non-CHG ancestry, mainly related to Anatolian farmers. There is a further decline in CHG ancestry in the later Bronze Age Maykop culture.

qpAdm models of Darkveti-Meshoko and Maykop

Using DATES, we can approximate when Anatolian ancestry arrived in the North Caucasus. The Maykop samples showed an average date of 6000 BC when using CHG and Barcin as the admixing populations.

Maykop Novosvobodnaya

dates_expfit version: 200

step (Morgans) ::     0.001000

fitting 1 exponentials + affine

after initialization:     0.000001     0.910 

gslsetup called

gslans:   19     0.000001


error sd:     0.001025

halflife:     7.393 

mean (generations):    93.761 

jmean:     0.000 std. err:     88.525 jmean (years):      0.000 std. err(years):  2567.225

Late Maykop

dates_expfit version: 200

step (Morgans) ::     0.001000

fitting 1 exponentials + affine

after initialization:     0.000002     0.870 

gslsetup called

gslans:   19     0.000002


error sd:     0.001309

halflife:     4.981 

mean (generations):   139.169 

 jmean:      0.000 std. err:    115.811 jmean (years):      0.000 std. err(years):   3358.519
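A quick sanity check on how to read these logs: the "halflife" line appears to be tied directly to the "mean (generations)" line. If ancestry covariance decays as exp(-t·d), with t in generations and d in Morgans, then the distance at which it halves is ln(2)/t, and dividing by the 0.001 Morgan step gives the number of bins. The Python sketch below checks that reading against the three runs shown in this post. It is my interpretation of the output, not something taken from the DATES documentation.

```python
import math

STEP_MORGANS = 0.001  # "step (Morgans)" line in the output


def halflife_in_bins(mean_generations, step_morgans=STEP_MORGANS):
    """Bins of step_morgans after which exp(-t * d) has dropped to one half."""
    return math.log(2) / (mean_generations * step_morgans)


# (mean generations, reported halflife) copied from the three runs in this post.
runs = {
    "Steppe Eneolithic": (97.446, 7.113),
    "Maykop Novosvobodnaya": (93.761, 7.393),
    "Late Maykop": (139.169, 4.981),
}

for name, (generations, reported) in runs.items():
    computed = halflife_in_bins(generations)
    print(f"{name}: computed {computed:.3f}, reported {reported}")
```

All three computed values agree with the reported ones to three decimals, which supports reading a shorter halflife as an older admixture event.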


As always, Altvred provides amazing information. This guy is a wizard, folks. He's been bombarding my inbox with qpAdm analyses and ALDER/DATES runs over the last few months. What is great is that for many of the populations he tested we have historical or archaeological knowledge of when their mixing began, and the results produced are pretty much always consistent. Thus I will absolutely vouch for any date he manages to produce.

Altvred also covered pretty much all of the points from ancient genetics as to why I cannot agree with this conclusion, so there is no need for me to repeat them.

The only point I will add is that Mesolithic Eastern European hunter-gatherers have turned up with Y-DNA haplogroup J. One of these is I0221 from the Yuzhnyy Oleni Ostrov site in Karelia, which dates to about 5500 BC [4].

Another Mesolithic sample is Popovo2, from the Mesolithic Popovo site. This sample dates to 7500-5000 BC, but these dates may have been affected by reservoir effects [5].

I'm going to add some spice to it by tying it into some points from the field of archaeology. I guess some quotes from David W. Anthony to set the stage will do:

Many archaeologists have wondered if domesticated cattle and sheep might have entered the steppes through the Eneolithic farmers of the Caucasus as well as from Old Europe. Farming cultures had spread from the Near East into the southern Caucasus Mountains (Shulaveri, Arukhlo, and Shengavit) by 5800-5600 BCE. But these earliest farming communities in the Caucasus were not widespread; they remained concentrated in a few river-bottom locations in the upper Kura and Araxes River valleys. No bridging sites linked them to the distant European steppes, more than 500 km to the north and west. The permanently glaciated North Caucasus Mountains, the highest and most impassable mountain range in Europe, stood between them and the steppes. The bread wheats (Triticum aestivum) preferred in the Caucasus were less tolerant of drought conditions than the hulled wheats (emmer, einkorn) preferred by Criş, Linear Pottery, and Bug-Dniester cultivators. The botanist Zoya Yanushevich observed that the cultivated cereals that appeared in Bug-Dniester sites and later in the Pontic-Caspian steppe river valleys were a Balkan/Danubian crop suite, not a Caucasian crop suite. Nor is there an obvious stylistic connection between the pottery or artifacts of the earliest Caucasian farmers at Shulaveri and those of the earliest herders in the steppes off to the north.

In the western part of the North Caucasian piedmont, overlooking the steppes, the few documented Eneolithic communities had stone tools and pottery somewhat like those of their northern steppe neighbours; these communities were southern participants in the steppe world, not northern extensions of Shulaveri-type Caucasian farmers [6].

As has been laid out here, a movement of CHG/Iran_N-related people from the Caucasus into the steppes during the Eneolithic is rather unlikely for several reasons. These Eneolithic communities would have been the same sort of people as the Progress_En and Vonyuchka_En samples discussed earlier. 

Here is a map of the copper age network often referred to as the Carpatho-Balkan metallurgical province [7]:

Schematic map of the Carpatho-Balkan metallurgical province area. A-Central bloc of settled farming cultures and communities. A-1-Butmir; A-2-Vinča C/D; A-3-Karanovo V-Maritsa; A-4-Karanovo VI-Gumelnița; A-5-Varna; A-6-Lengyel; A-7-Tiszapolgar; A-8-Bodrogkresztur. B-Cultural block Cucuteni-Tripol'ye. C-block of the steppe stock-breeding cultures. C-1-Dnepro-Donets or Mariupol'; C-2-Sredni Stog; C-3-Khvalynsk.

This was more or less the main route for the dispersal of agricultural know-how across the steppes. It was also the main route for the spread of metal products and the knowledge of metallurgy. While there are some signs of contact and trade between the steppes and the Caucasus during the Eneolithic, they are rather limited. It is during the Bronze Age, with the rise of the Yamnaya and Catacomb cultures and the onset of the Circumpontic Metallurgical Province, that trade and interactions with the Caucasus began to increase [7].

Here is another segment highlighting the importance of Southeastern Europe for the developments on the steppes during the Eneolithic [8]:

The transition from the Neolithic to Eneolithic in the Eastern European steppe was connected with the intensive contacts of people of the Azov-Dnieper, Low Don, Pricaspiy, Samara, Orlovka and Sredniy Stog cultures with the Balkan population and first with the Hamangia culture. The results of these contacts were some imports: adornments from copper, cornelian, marine shells and pots in the steppe sites and plates from the bone and nacre, pendants from teeth of red deer in the Hamangia graves. The Hamangia influence in the burial rites of the steppe population was very important and caused to use stone in graves and above them, pits with alcove, new adornments of burial clothes. The strongest impact we have fixed for the population in northern area of the Sea of Azov, where the radical changes in the burial rite and the formation of a new Sredniy Stog culture took place. It was connected with the adoption of new religious elements connected with the formation of the centre of steppe metal working.

Given that 4400-4000 BC is an unlikely date for the admixture between EHG and CHG-related populations, and may reflect, or have been influenced by, the admixture dates between steppe populations and Early European farmers, it might be interesting to have a little look at the onset of the pottery Neolithic in these regions.

In particular, I think the early Neolithic sites of the Dzhangar and Kairshak type might be relevant to the topic. 

In Figure 1 we present the sites where the Kairshak pottery type dated to c. 7th millennium BC was found in the semi-desert northern coast of the Caspian Sea (Fig. 1). This type of pottery is archaic in style. The flat-bottomed vessels were made of organic-rich silt and have geometric ornamentation. The stone industry is closely analogous to the local Mesolithic stone industry, which is characterised by artefacts such as geometric microliths in the form of segments and parallelograms. These features of material culture are evidence of the local origin of this Neolithic culture (Kozin 2002.1–16).

In the north-western part of the Caspian Sea coast, the earliest sites of the earliest stage of Jangar type (Tubuzgukhuduk site) (Fig. 7) date to the first part of the 7th millennium BC according to P. M. Koltsov (Koltsov 2005). According to the features of the flint tools and some pottery characteristics, the Neolithisation of this territory began from the Caucasus; for example, the arrowheads and trapezes of the north-western Caspian Sea and the Caucasus are similar (Koltsov 2005). At the same time, some innovations were linked to local populations. The main innovation was the appearance of pottery making traditions. In the middle of the 7th millennium BC, the populations which produced the Kairshak type migrated from the northern Caspian Sea region towards the steppe region of the Volga River basin and the north-western coast of the Caspian. This process was probably triggered by paleoclimatic changes. The bearers of the Kairshak and Jangar cultures influenced the formation of the Orlovskaya culture in the lower part of the Volga basin (Varfolomeevskaya site) (Fig. 9). [9]

Microliths of the Neolithic Seroglazovka culture [10]

However, the links to the Caucasus are not limited to the pottery Neolithic period. A dissertation on the Mesolithic of the Northwest Caspian region by P. Koltsov points out that the Mesolithic inhabitants here had a strong connection to the Caucasus and the adjacent steppe regions [11]. This quote summarises it well. The original is in Russian; this translation came by way of Google Translate.

The Mesolithic of the North Caspian region is migratory in origin. The presence in the manufacturing techniques of geometric microliths of Geluan retouching, the presence in the inventory of inserts such as parallelograms and rectangles, trapeziums with underworking of the upper base, arrowheads with a pronounced petiole, speak of a single source of cultural influence. Such a source is the Caucasus region, possibly with the participation of the Crimea. The differences in the development trends of the Stone Age cultures on the right bank and the left bank of the Volga fix the local independence of the flint industry, which is the same in origin.

As noted, in the Mesolithic period this region already had a strong connection to the material traditions seen in the Mesolithic Caucasus sites. The people of these sites would likely have had a strong genetic connection to the populations of the Caucasus at the time as well, which would have been the Caucasus hunter-gatherers.

Microliths of the Mesolithic Zhekolgan group [10]

David W. Anthony had previously envisioned the appearance of these Caspian materials on the Volga as coming by way of "pure" CHG foragers who migrated from the south Caspian to the Volga estuary in the 7th millennium BC [12].

The hunter-fisher camps that first appeared on the lower Volga around 6200 BC could represent the migration northward of un-admixed CHG hunter-fishers from the steppe parts of the southeastern Caucasus, a speculation that awaits confirmation from aDNA. After 5000 BC domesticated animals appeared in these same sites in the lower Volga, and in new ones, and in grave sacrifices at Khvalynsk and Ekaterinovka. CHG genes and domesticated animals flowed north up the Volga, and EHG genes flowed south into the North Caucasus steppes, and the two components became admixed.

With Altvred's findings in mind, as well as the rumour mill of upcoming samples from this period, it is more likely that the populations which spread across the Volga during the pottery Neolithic were similar to the Steppe_Eneolithic samples uncovered in the North Caucasus piedmont. This makes it likely that the Caucasus Hunter-Gatherer ancestry was already present in the region during the Mesolithic, rather than appearing and spreading with the onset of the pottery Neolithic. 

It should be noted that the Neolithisation of the Volga region happened without domesticated animals aside from the dog; it was entirely a forager-driven process [13].

It may be that this spread along the Volga is ultimately what led to the EHG/CHG-rich profile being present around the Don river, which according to those same rumour mills will also be a rather important region for the ethnogenesis of Proto-Indo-Europeans. But there were Mesolithic connections between the Don region and the Northwest Caspian as well, which may also have been coupled with gene flow.

However, since samples from the southern Russian steppes during the Mesolithic and Neolithic are more or less completely lacking, it is hard to figure out exactly where these ancestries would have converged, and how widespread these EHG/CHG populations would have been during the Mesolithic and pottery Neolithic. It really is a bit of a blind spot in terms of archaeology and ancient genetics. I wouldn't automatically assume that the spread of EHG/CHG foragers in the Volga region is directly tied to the EHG/CHG ancestry carried by the Yamnaya, Afanasievo and Corded Ware cultures. However, it certainly is a viable candidate. All I can hope for is more data in the future which will help answer the remaining questions we all have.

One thing seems to be absolutely certain though: The presence of CHG-related ancestry in the Eastern European regions north of the Caucasus dates back to the Mesolithic, as did the admixture events between EHG and CHG-related populations. The spread of this ancestry has no relation to the spread of agricultural techniques in the steppes, but rather to foragers who may have had to disperse after climatic changes.

Hopefully, these findings will reach the authors of the article, as well as various co-workers in the field. It would be highly interesting to see some of these points brought up in the eventual peer-review file when this article gets accepted and properly published.

A special thanks to Altvred, for providing this data and letting me share it with the masses. If you shared this post with others, then a big thank you to you as well!


  1. Wang, CC., Reinhold, S., Kalmykov, A. et al. Ancient human genome-wide data from a 3000-year interval in the Caucasus corresponds with eco-geographic regions. Nat Commun 10, 590 (2019).

  2. Gallego-Llorente, M., Connell, S., Jones, E. R., Merrett, D. C., Jeon, Y., Eriksson, A., Siska, V., Gamba, C., Meiklejohn, C., Beyer, R., Jeon, S., Cho, Y. S., Hofreiter, M., Bhak, J., Manica, A., & Pinhasi, R. (2016). The genetics of an early Neolithic pastoralist from the Zagros, Iran. Scientific reports, 6, 31326.

  3. Narasimhan, V. M., Patterson, N., Moorjani, P., Rohland, N., Bernardos, R., Mallick, S., Lazaridis, I., Nakatsuka, N., Olalde, I., Lipson, M., Kim, A. M., Olivieri, L. M., Coppa, A., Vidale, M., Mallory, J., Moiseyev, V., Kitov, E., Monge, J., Adamski, N., Alex, N., … Reich, D. (2019). The formation of human populations in South and Central Asia. Science (New York, N.Y.), 365(6457), eaat7487.

  4. Mathieson, I., Lazaridis, I., Rohland, N. et al. Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, 499–503 (2015).

  5. Mittnik, A., Wang, CC., Pfrengle, S. et al. The genetic prehistory of the Baltic Sea region. Nat Commun 9, 442 (2018).

  6. Anthony, David W. - The Horse, the wheel and Language.

  7. Chernykh, Evgeny. (2008). The “Steppe Belt” of stockbreeding cultures in Eurasia during the Early Metal Age. Trabajos de Prehistoria. 65. 10.3989/tp.2008.08004. 

  8. Kotova, Nadezhda. (2018). The contacts of the Eastern European steppe people with the Balkan population during the transition period from Neolithic to Eneolithic

  9. Vybornov, Alexander & Kulkova, M. & Andreev, Konstantin & Nesterov, Eugeny. (2018). Radiocarbon chronology of the Neolithic in the Povolzhye (Russian Eastern Europe). Documenta Praehistorica. 44. 224. 10.4312/dp.44.14.

  10. Vybornov, Alexander & Koltsov, Piotr & Kulkova, M. (2020). Геометрические микролиты в мезолите и неолите Северного Прикаспия и степного Поволжья (Mesolithic and Neolithic Northern Cis-Caspian and Volga Steppe: Geometric Microliths). Oriental Studies. 13. 106-121. 10.22162/2619-0990-2020-47-1-106-121

  11. Koltsov, Petr Mikhailovich (2005) - Мезолит и неолит Северо-Западного Прикаспия (Mesolithic and Neolithic of the North-Western Caspian)

  12. Anthony, David W. (2019). Archaeology, Genetics, and Language in the Steppes: A Comment on Bomhard. Journal of Indo-European Studies.

  13. Vybornov, A., Kosintsev, P., & Kulkova, M. (2015). The origin of farming in the Lower Volga Region. Documenta Praehistorica, 42, 67.