While the question “When did the Western Steppe Herder genetic profile form?” was already covered here at Musaeum Scythia, we steppe aficionados just received a major clue on the question of “Where did the Western Steppe Herder genetic profile form?” from a massive upcoming paper authored by Morten Allentoft, Martin Sikora, and Eske Willerslev, to name a few of the many contributing authors.
Population Genomics of Stone Age Eurasia
DOI: https://doi.org/10.1101/2022.05.04.490594
Abstract
The transitions from foraging to farming and later to pastoralism in Stone Age Eurasia (c. 11-3 thousand years before present, BP) represent some of the most dramatic lifestyle changes in human evolution. We sequenced 317 genomes of primarily Mesolithic and Neolithic individuals from across Eurasia combined with radiocarbon dates, stable isotope data, and pollen records. Genome imputation and co-analysis with previously published shotgun sequencing data resulted in >1600 complete ancient genome sequences offering fine-grained resolution into the Stone Age populations. We observe that: 1) Hunter-gatherer groups were more genetically diverse than previously known, and deeply divergent between western and eastern Eurasia. 2) We identify hitherto genetically undescribed hunter-gatherers from the Middle Don region that contributed ancestry to the later Yamnaya steppe pastoralists; 3) The genetic impact of the Neolithic transition was highly distinct, east and west of a boundary zone extending from the Black Sea to the Baltic. Large-scale shifts in genetic ancestry occurred to the west of this "Great Divide", including an almost complete replacement of hunter-gatherers in Denmark, while no substantial ancestry shifts took place during the same period to the east. This difference is also reflected in genetic relatedness within the populations, decreasing substantially in the west but not in the east where it remained high until c. 4,000 BP; 4) The second major genetic transformation around 5,000 BP happened at a much faster pace with Steppe-related ancestry reaching most parts of Europe within 1,000-years. Local Neolithic farmers admixed with incoming pastoralists in eastern, western, and southern Europe whereas Scandinavia experienced another near-complete population replacement. 
Similar dramatic turnover-patterns are evident in western Siberia; 5) Extensive regional differences in the ancestry components involved in these early events remain visible to this day, even within countries. Neolithic farmer ancestry is highest in southern and eastern England while Steppe-related ancestry is highest in the Celtic populations of Scotland, Wales, and Cornwall (this research has been conducted using the UK Biobank resource); 6) Shifts in diet, lifestyle and environment introduced new selection pressures involving at least 21 genomic regions. Most such variants were not universally selected across populations but were only advantageous in particular ancestral backgrounds. Contrary to previous claims, we find that selection on the FADS regions, associated with fatty acid metabolism, began before the Neolithisation of Europe. Similarly, the lactase persistence allele started increasing in frequency before the expansion of Steppe-related groups into Europe and has continued to increase up to the present. Along the genetic cline separating Mesolithic hunter-gatherers from Neolithic farmers, we find significant correlations with trait associations related to skin disorders, diet and lifestyle and mental health status, suggesting marked phenotypic differences between these groups with very different lifestyles. This work provides new insights into major transformations in recent human evolution, elucidating the complex interplay between selection and admixture that shaped patterns of genetic variation in modern populations.
I think I need to plan some days off for when this article comes out, because there will be tons of data to sift through. This preprint is already a feast. But considering this is a steppe blog, I will focus on what I think will be the new buzzword in the digital ancient genetics community of 2022: the Don river foragers.
Interestingly, two herein reported ~7,300-year-old imputed genomes from the Middle Don River region in the Pontic-Caspian steppe (Golubaya Krinitsa, NEO113 & NEO212) derive ~20-30% of their ancestry from a source cluster of hunter-gatherers from the Caucasus (Caucasus_13000BP_10000BP) (Fig. 3). Additional lower coverage (nonimputed) genomes from the same site project in the same PCA space (Fig. 1D), shifted away from the European hunter-gatherer cline towards Iran and the Caucasus. Our results thus document genetic contact between populations from the Caucasus and the Steppe region as early as 7,300 years ago, providing documentation of continuous admixture prior to the advent of later nomadic Steppe cultures, in contrast to recent hypotheses, and also further to the west than previously reported.
A stiff jab to the Indo-European migration theory favoured by many of the geneticists in this field, such as David Reich and his team at Harvard and Johannes Krause and his team at MPI. That theory wasn’t going to work out anyway, as most of the community had realized years ago. It is good that geneticists are catching up as well, or at least are putting it to paper. It most surely calls into question the scenario for WSH genetic formation as proposed by N. Patterson in “Reconstructing the spatiotemporal patterns of admixture during the European Holocene using a novel genomic dating method”, which I replied to in my blog entry “When did the Western Steppe Herder genetic profile form?”. It will be interesting, to say the least, to see how the other geneticists in the field respond to these findings.
Fig 2. Genetic structure of European hunter-gatherers (A) Ancestry proportions in 113 imputed ancient genomes representing European hunter-gatherer contexts (right) estimated from supervised non-negative least squares analysis using deep Eurasian source groups (left). Individuals from target groups are grouped by genetic clusters.
From approximately 5,000 BP, an ancestry component appears on the eastern European plains in Early Bronze Age Steppe pastoralists associated with the Yamnaya culture and it rapidly spreads across Europe through the expansion of the Corded Ware complex (CWC) and related cultures [20,21]. We demonstrate that this “steppe” ancestry (Steppe_5000BP_4300BP) can be modelled as a mixture of ~65% ancestry related to herein reported hunter-gatherer genomes from the Middle Don River region (MiddleDon_7500BP) and ~35% ancestry related to hunter-gatherers from Caucasus (Caucasus_13000BP_10000BP) (Extended Data Fig. 4). Thus, Middle Don hunter-gatherers, who already carry ancestry related to Caucasus hunter-gatherers (Fig. 2), serve as a hitherto unknown proximal source for the majority ancestry contribution into Yamnaya genomes. The individuals in question derive from the burial ground Golubaya Krinitsa (Supplementary Note 3). Material culture and burial practices at this site are similar to the Mariupol-type graves, which are widely found in neighbouring regions of Ukraine, for instance along the Dnepr River. They belong to the group of complex pottery-using hunter-gatherers mentioned above, but the genetic composition at Golubaya Krinitsa is different from the remaining Ukrainian sites (Fig 2A, Extended Data Fig. 4).
Here is the information on the Golubaya Krinitsa site from the supplementary files:
Golubaya Krinitsa, Middle Don, Russia. Cemetery. A.M. Skorobogatov
The site was discovered in 2011 by Valery Berezutsky [132]. The burial ground is located on the right bank of the Black Kalitva River (a tributary of the Don River), near its mouth. Excavations were carried out in 2015-2016 under the leadership of Andrey Skorobogatov. A total of 18 burials were studied (single, paired and collective). The burials were in rectangular pits, characterised by orientation to the south - southeast and southeast. The position of the buried is stretched out on the back, with arms located along the body. The bones are sprinkled with red ochre. The burials were accompanied by inventory: fossil sea shells, Unio shells and products from their wings, bone decorations (wild boar fangs, beaver teeth and groundhogs), bone tools, a copper product, flint tips, flint knives, and ceramics. The complex finds analogies in the Mariupol-type burial grounds widespread in the territory of modern Ukraine (Mariupol, Nikolsky, Lysogorsky, Yasinovatsky burial grounds), and can date back to the 6th millennium BC. Six samples were analysed, with datings ranging ca 6400-6700 uncal BP, corresponding to c. 5000-5400 cal BC:
NEO113, kurgan 10, burial 10
NEO204, burial 4
NEO207, burial 7 skeleton 2
NEO209, burial 7 skeleton 4
NEO210, burial 8
NEO212, burial 10
Literature: Berezutsky et al. 2011 [132].
Now these findings are certainly big news, but not surprising to the regulars over at the Eurogenes blog community, because this falls perfectly in line with what Davidski has been saying for years at this point. Big Dave Davidski deserves most of the bragging rights, but I will toot my own horn for a bit with this cheeky blast from the past of mine:
https://eurogenes.blogspot.com/2021/03/against-conventional-wisdom.html
Over a year ago, I thought it would make a lot of sense for the Western Steppe Herder profile to be the result of two populations with varying amounts of CHG-related ancestry (one higher, one lower), as well as minor European farmer input. A bit more than a year later and here we are!
It would have been interesting if they had used Progress/Vonyuchka (Steppe_Eneolithic) as a reference for the excess CHG, because it is unlikely that pure “CHG” populations were still around at that time. If Steppe_Eneolithic populations are ~50% CHG, and these Don foragers have 30%, the CHG-related ancestry of 1-to-1 mixed offspring would be around 40%. If a 60/40 EHG/CHG population then gets 10% EEF ancestry, that 40% would drop down to 36%, which is more or less the exact percentage Global25 gives for the amount of CHG ancestry in Steppe_EMBA.
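As a quick sanity check on that arithmetic, here is a minimal Python sketch. It assumes ancestry shares combine as simple weighted averages and that the incoming EEF carries no CHG-related ancestry at all; the function name `mixed_share` is made up for illustration.

```python
def mixed_share(sources):
    """Share of one ancestry in a mix, given (weight, share) pairs."""
    total_weight = sum(weight for weight, _ in sources)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weight * share for weight, share in sources)

# 1-to-1 mix of Steppe_Eneolithic (~50% CHG) and Don foragers (~30% CHG)
chg_after_first_mix = mixed_share([(0.5, 0.50), (0.5, 0.30)])

# That population then takes 10% EEF, assumed here to carry 0% CHG
chg_after_eef = mixed_share([(0.9, chg_after_first_mix), (0.1, 0.0)])

print(round(chg_after_first_mix, 3), round(chg_after_eef, 3))  # 0.4 0.36
```

Swapping in 20% instead of 30% for the Don foragers shows how sensitive the final figure is to that one input.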
What is strange, however, is that the authors state that Yamnaya can be modeled as a mix of 65% Don foragers and 35% CHG-related ancestry. If you think about it, this is mathematically infeasible. If the Don foragers ranged between 20 and 30% CHG, then the CHG ancestry mediated by those Don foragers would be between 13% and 19.5%. With an extra 35% on top you would have CHG ancestry ranging between 48% and 54.5%. Furthermore, as was shown in Wang 2019, and argued by Davidski for years, Steppe_EMBA carries Early European Farmer ancestry, and a 65% Don forager / 35% CHG model cannot account for that stream of ancestry.
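The same back-of-the-envelope logic applied to the 65/35 model, as a rough sketch. It assumes the paper's Caucasus source counts as 100% CHG and that shares simply add up, which may not match how the authors' source clusters are actually defined; `total_chg` is a hypothetical helper.

```python
def total_chg(don_weight, don_chg_share, caucasus_weight):
    """CHG share of a Don forager + Caucasus mix, treating Caucasus as 100% CHG."""
    if abs(don_weight + caucasus_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return don_weight * don_chg_share + caucasus_weight

# Don foragers at the low (20%) and high (30%) end of their reported CHG range
low = total_chg(0.65, 0.20, 0.35)
high = total_chg(0.65, 0.30, 0.35)

print(f"{low:.1%} to {high:.1%}")  # 48.0% to 54.5%
```

Both ends of that range sit well above the roughly 36% that Global25 gives for Steppe_EMBA, which is the mismatch described above.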
Because these samples are not out yet, I decided to play around on Genoplot and I made some simulated coordinates for personal use. I will share some of the results here, as well as the coordinates, but I cannot guarantee the resulting coordinates will be exactly like these upcoming Don foragers. Only time will tell, thus take all of this with a grain of salt or two in the meantime!
The first step was to remove the farmer ancestry from the Yamnaya Samara average, for which I used the G25 average of the Ukrainian Trypillian culture samples, coming out as 8.6%. Then I took that simulated coordinate and did a 50% subtraction using Progress_En and Vonyuchka_En. The results ended up looking like this:
The simulated coordinate using Progress as the subtraction seems to be a great match. Some small frequencies of Trypillia ancestry showed up in the simulated coordinate, but I doubt that is real. It just seems to be a side effect of the subtraction method. Without those references I got this:
Target: SIM_Don_Forager_PROG
Distance: 8.8643% / 0.08864262
61.2 RUS_Karelia_HG
32.0 GEO_CHG
6.8 UKR_N
Target: DonForager_Sim_VON
Distance: 9.9576% / 0.09957630
73.4 RUS_Karelia_HG
26.0 GEO_CHG
0.6 UKR_N
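For anyone curious how the subtraction step behind these simulated coordinates works in principle, here is a minimal sketch. It assumes the subtraction is a plain linear un-mixing of averaged coordinates, which may not be exactly what Genoplot does; the three-dimensional coordinates are invented toy values, not real Global25 data, and both function names are hypothetical.

```python
def mix_coords(base, source, fraction):
    """Coordinate of a population that is `fraction` source and the rest base."""
    return [(1 - fraction) * b + fraction * s for b, s in zip(base, source)]

def subtract_ancestry(target, source, fraction):
    """Undo a linear mix: remove `fraction` of `source` from `target`."""
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    if len(target) != len(source):
        raise ValueError("coordinates must have the same number of dimensions")
    return [(t - fraction * s) / (1 - fraction) for t, s in zip(target, source)]

# Toy coordinates (made up): an unknown population and a farmer reference
hidden = [0.110, 0.020, -0.030]
farmer = [0.120, 0.150, 0.010]

# Build an observed average carrying 8.6% farmer ancestry, then remove it again
observed = mix_coords(hidden, farmer, 0.086)
recovered = subtract_ancestry(observed, farmer, 0.086)

print([round(x, 6) for x in recovered])
```

The round trip recovers the hidden toy coordinate, but only because the mix really was linear and the fraction was known exactly. With real samples both are estimates, which is why the small Trypillia percentages mentioned above could easily be an artifact.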
Using the Global25 West Eurasia PCA, here are the positions of simulated coordinates compared to a bunch of relevant ancient genomes:
Now the interesting part is to gauge how much the amount of “Don forager” ancestry varied across various Western Steppe Herder genomes. I chose two Yamnaya clusters, the Afanasievo samples from the Russian Altai, and several early Corded Ware samples that seem maximized in WSH ancestry.
Yamnaya from Samara:
Yamnaya from Kalmykia:
High steppe_EMBA early Corded Ware:
Afanasievo from the Altai region:
The range in the ratio of “Steppe_Eneolithic” to “Don forager” ancestry is what immediately stood out to me, and I found it a bit surprising actually. What should be said though is that a small difference in actual CHG ancestry would make a large difference in the ancestry assigned to Progress_En and the simulated coordinate, as these are the only two references containing CHG ancestry, one having less than the samples above and the other more.
The interesting thing is that while Corded Ware samples seem to carry the highest amount of this simulated “Don forager” component, some Afanasievo and Yamnaya samples are fully within the same range, but other individuals in those groups are shifted towards Progress_En, which brings the average down. This also seems consistent with early Corded Ware samples having a bit less CHG ancestry than the Yamnaya samples, on average.
If this degree of variation is replicable with different sources, we have a rather interesting scenario on our hands. One explanation could be a genetic cline between the “Don forager” cluster and the “Steppe_En” cluster, with the early Yamnaya, Corded Ware and Afanasievo samples originating from various points on this cline, thus explaining the variety among the samples. That being said, some of these Steppe_En samples predated the aforementioned material cultures by only a few centuries, and you can imagine that when the Yamnaya and Afanasievo had their eastwards expansions they could have come across people with a Steppe_En profile and intermixed with them. It might be a combination of both those factors that led to the variation seen above.
That said, I’m not reading too much into the actual percentages for now, but if something similar can be demonstrated through software such as qpAdm when the data of this article is finally released, that would be interesting of course.
In the meantime if anyone wants to help me out, calculating the amount of EHG/WHG/CHG/EEF ancestry in these samples would be helpful for me to figure out if the variation seen here is genuine, or if small discrepancies in CHG ancestry are creating a bit of a mirage:
I0429
I0357
OBR003
PNL001
VLI007
I6711
I5270
I11112
For what it is worth, a well-connected friend of mine got his hands on an unpublished Sredny Stog sample from the eastern banks of the Dnieper river dating to 4340-4178 calBCE. Unfortunately this sample does not have Global25 coordinates yet, but my mate converted the raw files into K13 results, which were then converted into Global25 coordinates, I assume through Genoplot. So I wonder what the degree of accuracy is here, but what the hell, who cares right?
Target: SrednyGirl_I2108-K13-sim_scaled
Distance: 6.3291% / 0.06329125
52.2 RUS_Karelia_HG
37.8 GEO_CHG
10.0 UKR_Trypillia
0.0 UKR_N
Target: SrednyGirl_I2108-K13-sim_scaled
Distance: 1.5542% / 0.01554206
51.4 RUS_Progress_En
38.4 SIM_Don_Forager_PROG
10.2 UKR_Trypillia
0.0 UKR_N
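For readers wondering what the percentages and the distance in fits like these represent, here is a toy sketch. It assumes the tool looks for non-negative source weights summing to 100% that minimise the Euclidean distance to the target coordinate. The brute-force grid search and the three-dimensional coordinates are invented purely for illustration; this is not how the actual calculator is implemented, and none of it is real Global25 data.

```python
import math
from itertools import product

def fit_mixture(target, sources, steps=100):
    """Grid-search weights (multiples of 1/steps, summing to 1) closest to target."""
    names = list(sources)
    best_distance, best_weights = None, None
    # Choose integer shares for all but the last source; the last takes the rest
    for combo in product(range(steps + 1), repeat=len(names) - 1):
        if sum(combo) > steps:
            continue
        shares = list(combo) + [steps - sum(combo)]
        weights = [share / steps for share in shares]
        model = [
            sum(w * sources[name][i] for w, name in zip(weights, names))
            for i in range(len(target))
        ]
        distance = math.dist(target, model)
        if best_distance is None or distance < best_distance:
            best_distance, best_weights = distance, dict(zip(names, weights))
    return best_distance, best_weights

# Toy references (made up) and a target built as 60% A, 30% B, 10% C
toy_sources = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [0.0, 0.0, 1.0]}
toy_target = [0.6, 0.3, 0.1]

distance, weights = fit_mixture(toy_target, toy_sources)
print(round(distance, 6), weights)
```

A lower distance means the weighted average of the references lands closer to the target, which is the sense in which the second fit above (1.55%) is a much better match than the first (6.33%).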
So this 5th millennium BC lass, from the early period of the Proto-Indo-European language, is virtually identical to the Yamnaya, Corded Ware and Afanasievo samples from a thousand years later, and her ancestry in terms of the “Simulated Don forager” and “Steppe_Eneolithic” references is also similar and within range.
Coordinates:
SrednyGirl_I2108-K13 sim_scaled,0.1233,0.0904,0.0368,0.1115,-0.0274,0.0393,0.0048,-0.0006,-0.0572,-0.0687,0.0033,0.0016,-0.0052,-0.018,0.0364,0.0101,-0.0085,-0.0009,-0.0031,0.0098,-0.0026,0.0012,0.0109,0.0256,-0.0044
This article will be an absolute banger when it comes out, and aside from this segment that is relevant to the origins of the Proto-Indo-European language there is a lot more to look forward to. I mean, over 300 samples and an extensive use of IBD clustering with these samples is nothing to scoff at.
The findings from the Scandinavian Late Neolithic and Bronze Age also stand out, with the high IBD sharing between the Nordic Bronze Age samples carrying I1 lineages and Iron Age Germanic samples. However, given that Germanic itself is an Iron Age expansion rather than a Bronze Age one, and the expansion was carried by people most certainly not limited to I1, this finding in itself does not “solve” the question of Germanic origins, but it definitely puts us closer.
The Neolithic and Bronze Age samples from the Altai and the Iron Age sample from the Volga are also on my “can't wait until they are published” list. The latter will, in my opinion, have some implications for the spread and genetic formation of Uralic speakers and their genetic profiles during the Late Bronze and Early Iron Age. But I will hopefully cover that in due time; I might know of a discussion topic that could cover both of those locations in one swoop.
Is there any chance Y-DNA K2b/PQR are not originally from an Eastern Eurasian population? Like, I remember Ust-Ishim being in between East and West Eurasians, although closer to the latter. Could these lineages be from that type of population or even a Zlaty Kun type population?
I don't really like speculating on events that go back into the Upper Paleolithic because we have so little information, be it genetics or archaeology. I find it a bit futile, and whenever someone in the anthroforum sphere is making grand claims about deep ancestry I tune out and look at archaeological objects.
The most relevant point is that Y-chromosome haplogroup P has a formation date of 44300 ybp and a TMRCA date of 41500 ybp according to YFull. The P clade that led to Q and R also shares some extra SNPs with those carried by a historical Andamanese, which could push the total separation date a bit more than a thousand years later. At this point you already would've had a distinct genetic separation between East and West Eurasians.
Another point is that Yana and ANE have Denisovan ancestry which correlates with their east Eurasian ancestry according to the Salkhit paper. Bacho Kiro and Ust Ishim samples did not have this ancestry, Tianyuan man did.
But as I said earlier, we lack so much information about this time period that we can't really say for sure how widespread Denisovan-admixed East Eurasians were between 40-30k BC. Or if the P lineages came from Denisovan-admixed East Eurasians to begin with; in such a scenario you would've had multiple "east eurasian" introgressions, with the latter ones being more relevant for the autosomal ancestry (incl. Denisovan) and the former to the Y-chromosome haplogroup P. Given how Central Asia is a total blind spot and Paleolithic western Siberia is not much better, it is very hard to say what happened.
But to answer your question; the way I see it, the only way for it not to have come from an East Eurasian population is if an "inbetweener" population contributed ancestry to both West Eurasian populations and Andamanese around/after 40K BC which I find a bit unlikely.
So the SE Asian theory is true after all then? Disappointing. Guess this affected the language of ANE and ultimately EHG.
I know Davidski argues against it being ENA and coming from SE Asia. What is his logic?
I didn't mention Southeast Asia anywhere. Given the distribution of Oase/Bacho-Kiro, Ust-Ishim (without Altai Denisovan) and Tianyuan man/Salkhit with Altai Denisovan, you clearly had a presence of East Eurasian related peoples above the 40th parallel north very early.
I don't think you should extrapolate chalcolithic linguistic influences from deep paleolithic ancestries. (First wave) Native American languages are divided into several linguistic families that cannot be shown to have a direct relation to one another despite coming from the same tiny bottlenecked population.
Now imagine looking for linguistic influences (or direct descent) from an unknown Upper Paleolithic population of which we know nothing, on the Ancient North Eurasians, a population we know very little of and which only got identified through population genetics, to EHGs, which we also know nothing of in terms of linguistics. That's more than 30000 years of question marks left unanswered.
As for what Davidski argues for, you should ask the big man himself I think.
True, but I thought Oase/Bacho Kiro/Ust-Ishim were in between Kostenki-14 and Tianyuan and not really East Eurasians like the latter. Has the thinking on that changed?
Bacho Kiro and Tianyuan form a clade with each other with respect to post-40kya paleo Europeans like K14, Sunghir, Muierii etc.
D(Kostenki14, Tianyuan; Bacho Kiro F6-620, Mbuti) z = -4.17
D(Vestonice16, Tianyuan; Bacho Kiro F6-620, Mbuti) z = -3.29
D(SunghirIII, Tianyuan; Bacho Kiro F6-620, Mbuti) z = -3.33
D(GoyetQ116_1, Tianyuan; Bacho Kiro F6-620, Mbuti) z = -1.85
D(Yana_RHS, Tianyuan; Bacho Kiro F6-620, Mbuti) z = -1.58
D(GoyetQ116_1, Kostenki14; Bacho Kiro F6-620, Mbuti) z = +3.24
D(Yana_RHS, Kostenki14; Bacho Kiro F6-620, Mbuti) z = +2.72
He shows a clear preference for Tianyuan over early West Eurasian populations, except for GoyetQ116_1 and Yana_RHS, where |Z| < 2 (non-significant), meaning he shares drift about equally with Tianyuan and with Goyet/Yana. There is Bacho Kiro and Tianyuan related gene flow into the 35kya GoyetQ116_1 from Belgium, which has been quite well known since the original Fu et al. 2016. It shows that people related to BK and mildly related to Tianyuan survived in far western regions but were completely replaced in other regions. People descended from Goyet, like El Miron [20kya] from Spain, also share drift with ENA populations.
D(SouthAfrica_2000BP.SG, Taiwan_Hanben_IA; Kostenki14, ElMiron) z = +3.74
D(SouthAfrica_2000BP.SG, Andaman_100BP; Kostenki14, ElMiron) z = +4.13
On Ust-Ishim, he is quite unrelated to any extant modern population and he did not leave any genetic influence on subsequent periods. There is a very mild link between Ust-Ishim and BK but not between Tianyuan and Ust-Ishim.
D(Ust-Ishim, Kostenki14; Bacho Kiro F6-620, Mbuti) z = +1.94 [not so much and non-significant]
D(Ust-Ishim, Kostenki14; Tianyuan , Chimp) z = 0
D(Tianyuan , Kostenki14; Ust-Ishim, Chimp) z = 0
D(Ust-Ishim, Tianyuan; Kostenki14, Chimp) z = 0
All three lines, which are Tianyuan, K14 and UI, are equidistant from each other and almost represent a trifurcation.
On Oase 1 and 2,
"Finally, it is worth mentioning that while the low coverage, high contamination and high Neanderthal ancestry of Oase1 prevented the direct assessment of its closer relationship to either Western or Eastern Eurasians, an individual from the same site and with similar age (Oase2) showed a clearly higher affinity for East Asian and Native American populations than with Western Eurasians. The closest sample to Oase2 in outgroup f3 analyses, after Oase1, was reported to be Tianyuan (the individuals from Bacho Kiro cave were not available at the time of those analyses), supporting our claim of its placement in the “genetically East Asian” branch"
It looks like Oase1 and BK form a clade with respect to West Eurasians.
"It looks like Oase1 and BK form a clade with respect to West Eurasians."
Here are the supporting stats for this,
D(Oase1, Kostenki14; Bacho Kiro F6-620, Mbuti) z = +4.26
D(Oase1, SunghirIII; Bacho Kiro F6-620, Mbuti) z = +3.88
D(Oase1, Vestonice16; Bacho Kiro F6-620, Mbuti) z = +3.37
D(Oase1, BachoKiro_BK_1653; Bacho Kiro F6-620, Mbuti) z = +3.33
D(Oase1, Villabruna; Bacho Kiro F6-620, Mbuti) z = +3.94
D(Oase1, Bichon; Bacho Kiro F6-620, Mbuti) z = +4.97
D(Oase1, ElMiron; Bacho Kiro F6-620, Mbuti) z = +3.01
D(Oase1, Kolyma_River; Bacho Kiro F6-620, Mbuti) z = +2.78
D(Oase1, MA1; Bacho Kiro F6-620, Mbuti) z = +2.39
D(Oase1, Yana_RHS; Bacho Kiro F6-620, Mbuti) z = +2.38
D(Oase1, Saqqaq; Bacho Kiro F6-620, Mbuti) z = +2.28
D(Oase1, GoyetQ116_1; Bacho Kiro F6-620, Mbuti) z = +2.20
D(Oase1, Tianyuan; Bacho Kiro F6-620, Mbuti) z = +0.94
The more positive the Z-score, the more drift is shared with Oase1 relative to the other population.
@Ganesh
Thanks. So it is definite that a Tianyuan type people were the source of K2b/P carriers, as opposed to something more western or Ust-Ishim like?
I am not sure where K2b/P originated, but they are quite old for Tianyuan/ANS/ANE. The MRCA of K2b* and P* is around 48 kya. YFull's age for P is an underestimate for various reasons. They are likely very old and were present in the early colonization of Eurasia, which contributed quite a lot of ancestry to people living in the far East, so these lineages piled up in SEA. You have K2a in Ust-Ishim quite early on, which suggests the same for K2b/P, i.e. these lineages emerged when Eurasians started to separate. The second wave of people, represented by K14 and Sunghir, could have replaced the older people in the West. (The second wave is more West Eurasian like, while the first wave is either ZK/UI or BachoKiro/Tianyuan/Papuan. The third wave of people is the "Basal Eurasian" one we find in the ancient Near East.)
I see. Thanks for the insight Ganesh. Can I ask what is the reason YFull underestimates so much? And what is the relationship between Zlaty Kun and UI?
Also how was South Asia settled? What was the first wave? Was it uniform all over?