Musaeum Scythia: Neolithic Don river foragers were a key component of Western Steppe Herder ancestry

Saturday, May 14, 2022

Neolithic Don river foragers were a key component of Western Steppe Herder ancestry

While the question “When did the Western Steppe Herder genetic profile form?” was already covered here at Musaeum Scythia, we steppe aficionados just received a major clue on the question “Where did the Western Steppe Herder genetic profile form?” from a massive upcoming paper authored by Morten Allentoft, Martin Sikora, and Eske Willerslev, to name a few of the many contributing authors.

Population Genomics of Stone Age Eurasia
DOI: https://doi.org/10.1101/2022.05.04.490594

Abstract
The transitions from foraging to farming and later to pastoralism in Stone Age Eurasia (c. 11-3 thousand years before present, BP) represent some of the most dramatic lifestyle changes in human evolution. We sequenced 317 genomes of primarily Mesolithic and Neolithic individuals from across Eurasia combined with radiocarbon dates, stable isotope data, and pollen records. Genome imputation and co-analysis with previously published shotgun sequencing data resulted in >1600 complete ancient genome sequences offering fine-grained resolution into the Stone Age populations. We observe that: 1) Hunter-gatherer groups were more genetically diverse than previously known, and deeply divergent between western and eastern Eurasia. 2) We identify hitherto genetically undescribed hunter-gatherers from the Middle Don region that contributed ancestry to the later Yamnaya steppe pastoralists; 3) The genetic impact of the Neolithic transition was highly distinct, east and west of a boundary zone extending from the Black Sea to the Baltic. Large-scale shifts in genetic ancestry occurred to the west of this "Great Divide", including an almost complete replacement of hunter-gatherers in Denmark, while no substantial ancestry shifts took place during the same period to the east. This difference is also reflected in genetic relatedness within the populations, decreasing substantially in the west but not in the east where it remained high until c. 4,000 BP; 4) The second major genetic transformation around 5,000 BP happened at a much faster pace with Steppe-related ancestry reaching most parts of Europe within 1,000-years. Local Neolithic farmers admixed with incoming pastoralists in eastern, western, and southern Europe whereas Scandinavia experienced another near-complete population replacement. 
Similar dramatic turnover-patterns are evident in western Siberia; 5) Extensive regional differences in the ancestry components involved in these early events remain visible to this day, even within countries. Neolithic farmer ancestry is highest in southern and eastern England while Steppe-related ancestry is highest in the Celtic populations of Scotland, Wales, and Cornwall (this research has been conducted using the UK Biobank resource); 6) Shifts in diet, lifestyle and environment introduced new selection pressures involving at least 21 genomic regions. Most such variants were not universally selected across populations but were only advantageous in particular ancestral backgrounds. Contrary to previous claims, we find that selection on the FADS regions, associated with fatty acid metabolism, began before the Neolithisation of Europe. Similarly, the lactase persistence allele started increasing in frequency before the expansion of Steppe-related groups into Europe and has continued to increase up to the present. Along the genetic cline separating Mesolithic hunter-gatherers from Neolithic farmers, we find significant correlations with trait associations related to skin disorders, diet and lifestyle and mental health status, suggesting marked phenotypic differences between these groups with very different lifestyles. This work provides new insights into major transformations in recent human evolution, elucidating the complex interplay between selection and admixture that shaped patterns of genetic variation in modern populations.

I think I need to plan some days off for when this article comes out, because there will be tons of data to sift through; this preprint is already a feast. But considering this is a steppe blog, I will focus on what I think will be the new buzzword in the digital ancient genetics community of 2022: the Don river foragers.
Interestingly, two herein reported ~7,300-year-old imputed genomes from the Middle Don River region in the Pontic-Caspian steppe (Golubaya Krinitsa, NEO113 & NEO212) derive ~20-30% of their ancestry from a source cluster of hunter-gatherers from the Caucasus (Caucasus_13000BP_10000BP) (Fig. 3). Additional lower coverage (nonimputed) genomes from the same site project in the same PCA space (Fig. 1D), shifted away from the European hunter-gatherer cline towards Iran and the Caucasus. Our results thus document genetic contact between populations from the Caucasus and the Steppe region as early as 7,300 years ago, providing documentation of continuous admixture prior to the advent of later nomadic Steppe cultures, in contrast to recent hypotheses, and also further to the west than previously reported.

A stiff jab to the Indo-European migration theory favoured by many of the geneticists in this field, such as David Reich and his team at Harvard and Johannes Krause and his team at MPI. That theory wasn’t going to work out anyway, as most of the community had realized years ago. It is good that geneticists are catching up as well, or at least are putting it to paper. It most surely calls into question the scenario for WSH genetic formation as proposed by N. Patterson in “Reconstructing the spatiotemporal patterns of admixture during the European Holocene using a novel genomic dating method”, which I replied to in my blog entry “When did the Western Steppe Herder genetic profile form?”. It will be interesting, to say the least, to see how the other geneticists in the field respond to these findings.
Fig 2. Genetic structure of European hunter-gatherers (A) Ancestry proportions in 113 imputed ancient genomes representing European hunter-gatherer contexts (right) estimated from supervised non-negative least squares analysis using deep Eurasian source groups (left). Individuals from target groups are grouped by genetic clusters.

From approximately 5,000 BP, an ancestry component appears on the eastern European plains in Early Bronze Age Steppe pastoralists associated with the Yamnaya culture and it rapidly spreads across Europe through the expansion of the Corded Ware complex (CWC) and related cultures [20,21]. We demonstrate that this “steppe” ancestry (Steppe_5000BP_4300BP) can be modelled as a mixture of ~65% ancestry related to herein reported hunter-gatherer genomes from the Middle Don River region (MiddleDon_7500BP) and ~35% ancestry related to hunter-gatherers from Caucasus (Caucasus_13000BP_10000BP) (Extended Data Fig. 4). Thus, Middle Don hunter-gatherers, who already carry ancestry related to Caucasus hunter-gatherers (Fig. 2), serve as a hitherto unknown proximal source for the majority ancestry contribution into Yamnaya genomes. The individuals in question derive from the burial ground Golubaya Krinitsa (Supplementary Note 3). Material culture and burial practices at this site are similar to the Mariupol-type graves, which are widely found in neighbouring regions of Ukraine, for instance along the Dnepr River. They belong to the group of complex pottery-using hunter-gatherers mentioned above, but the genetic composition at Golubaya Krinitsa is different from the remaining Ukrainian sites (Fig 2A, Extended Data Fig. 4).

Here is the information on the Golubaya Krinitsa site from the supplementary files:

Golubaya Krinitsa, Middle Don, Russia. Cemetery. A.M. Skorobogatov

The site was discovered in 2011 by Valery Berezutsky [132]. The burial ground is located on the right bank of the Black Kalitva River (a tributary of the Don River), near its mouth. Excavations were carried out in 2015-2016 under the leadership of Andrey Skorobogatov. A total of 18 burials were studied (single, paired and collective). The burials were in rectangular pits, characterised by orientation to the south - southeast and southeast. The position of the buried is stretched out on the back, with arms located along the body. The bones are sprinkled with red ochre. The burials were accompanied by inventory: fossil sea shells, Unio shells and products from their wings, bone decorations (wild boar fangs, beaver teeth and groundhogs), bone tools, a copper product, flint tips, flint knives, and ceramics. The complex finds analogies in the Mariupol-type burial grounds widespread in the territory of modern Ukraine (Mariupol, Nikolsky, Lysogorsky, Yasinovatsky burial grounds), and can date back to the 6th millennium BC. Six samples were analysed, with datings ranging ca 6400-6700 uncal BP, corresponding to c. 5000-5400 cal BC:

NEO113, kurgan 10, burial 10
NEO204, burial 4
NEO207, burial 7 skeleton 2
NEO209, burial 7 skeleton 4
NEO210, burial 8
NEO212, burial 10

Literature: Berezutsky et al. 2011 [132].

Sample ID	Y-DNA	mtDNA	Age
NEO113	R1a	U2e1a	5348 BC
NEO212	I2a1b1a2	U4a2a	5443 BC

Now these findings are certainly big news, but not surprising to regulars of the Eurogenes blog community, because they fall perfectly in line with what Davidski has been saying for years. Big Dave Davidski deserves most of the bragging rights, but I will toot my own horn for a bit with this cheeky blast from the past of mine:

https://eurogenes.blogspot.com/2021/03/against-conventional-wisdom.html

Over a year ago I thought that it would make a lot of sense for the Western Steppe Herder profile to be the result of two populations with varying amounts of CHG-related ancestry (one higher, one lower), as well as minor European farmer input. A bit more than a year later, here we are!

It would have been interesting if they had used Progress/Vonyuchka (Steppe_Eneolithic) as a reference for the excess CHG, because it is unlikely that pure “CHG” populations were still around at that time. If Steppe_Eneolithic populations are ~50% CHG and these Don foragers are ~30%, the CHG-related ancestry of 1-to-1 mixed offspring would be around 40%. If that 60/40 EHG/CHG population then gets 10% EEF ancestry, the 40% drops to 36%, which is more or less the exact percentage Global25 gives for the amount of CHG ancestry in Steppe_EMBA.
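The back-of-the-envelope arithmetic above can be written out as a short sketch. The 50% and 30% CHG figures and the 10% farmer input are the rough values from the paragraph, not measured estimates, and the helper name `mix` is purely illustrative. The sketch also assumes the farmer source carries no CHG-related ancestry at all, which is a simplification.

```python
def mix(components):
    """Weighted average of an ancestry fraction over (weight, fraction) pairs."""
    total = sum(weight for weight, _ in components)
    return sum(weight * fraction for weight, fraction in components) / total

# 1-to-1 mix of Steppe_Eneolithic (~50% CHG) and Don foragers (~30% CHG)
chg_after_mix = mix([(0.5, 0.50), (0.5, 0.30)])

# That population then receives 10% EEF ancestry (assumed to carry 0% CHG here)
chg_after_eef = mix([(0.9, chg_after_mix), (0.1, 0.0)])

print(round(chg_after_mix * 100, 1))  # 40.0
print(round(chg_after_eef * 100, 1))  # 36.0
```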

What is strange, however, is that the authors state that Yamnaya can be modeled as a mix of 65% Don foragers and 35% CHG-related ancestry. If you think about it, the numbers do not add up. If the Don foragers ranged between 20 and 30% CHG, then the CHG ancestry mediated by those Don foragers would be between 13% and 19.5% (65% of 20-30%). With an extra 35% on top, you would have total CHG ancestry ranging between 48% and 54.5%, well above the ~36% mentioned above. Furthermore, as was shown in Wang 2019, and argued by Davidski for years, Steppe_EMBA carries Early European Farmer ancestry, and a 65% Don forager / 35% CHG model cannot account for that stream of ancestry.
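To make the objection concrete, here is a minimal sketch of the implied total. It takes the preprint's 65/35 split and the 20-30% CHG range for the Don foragers at face value; the function name is hypothetical, and the calculation treats the "Caucasus" source as 100% CHG-related.

```python
def implied_chg(don_share, don_chg, extra_chg_share):
    """Total CHG-related ancestry when the Don forager source already carries some CHG."""
    return don_share * don_chg + extra_chg_share

low = implied_chg(0.65, 0.20, 0.35)   # Don foragers at 20% CHG
high = implied_chg(0.65, 0.30, 0.35)  # Don foragers at 30% CHG

print(f"{low:.1%} to {high:.1%}")  # 48.0% to 54.5%
```

Whether this is a real contradiction depends on how the two methods define their CHG-related components, so treat it as a consistency check rather than a refutation.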

Because these samples are not out yet, I decided to play around on Genoplot and I made some simulated coordinates for personal use. I will share some of the results here, as well as the coordinates, but I cannot guarantee the resulting coordinates will be exactly like these upcoming Don foragers. Only time will tell, thus take all of this with a grain of salt or two in the meantime!

The first step was to remove the farmer ancestry from the Yamnaya Samara average, for which I used the G25 average of the Ukrainian Trypillian culture samples, which came out at 8.6%. Then I took that simulated coordinate and did a 50% subtraction using Progress_En and, separately, Vonyuchka_En. The results ended up looking like this:
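For anyone curious what such a subtraction amounts to, here is a minimal sketch. It assumes ancestry mixes linearly in PCA coordinates, so that target = p × source + (1 − p) × residual, and solves for the residual. This is only an assumption about what a tool like Genoplot does under the hood. The three-dimensional vectors are made up for illustration; real Global25 rows have 25 values.

```python
def subtract_ancestry(target, source, proportion):
    """Solve target = p*source + (1-p)*residual for residual, one coordinate at a time."""
    if not 0 <= proportion < 1:
        raise ValueError("proportion must be in [0, 1)")
    if len(target) != len(source):
        raise ValueError("coordinate vectors must have the same length")
    return [(t - proportion * s) / (1 - proportion) for t, s in zip(target, source)]

# Toy vectors (hypothetical values, NOT real G25 coordinates)
yamnaya = [0.120, 0.090, 0.040]
trypillia = [0.110, 0.150, 0.010]
progress = [0.115, 0.080, 0.020]

# Step 1: remove 8.6% farmer ancestry; step 2: remove 50% Progress_En
minus_farmer = subtract_ancestry(yamnaya, trypillia, 0.086)
sim_don_forager = subtract_ancestry(minus_farmer, progress, 0.50)
print(sim_don_forager)
```

Because the residual is divided by (1 − p), any noise in the target or source is amplified, and more so for larger subtractions. That is one reason simulated coordinates like these tend to fit with higher distances than real samples.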

Target	Distance		RUS_Karelia_HG	GEO_CHG	UKR_Trypillia	UKR_N
Yamnaya_RUS_Samara	0.05780318	•	54.8	36.6	8.6	0.0
SIM_Yamnaya_minus_Trypilia	0.06328255	•	60.0	40.0	0.0	0.0
SIM_Don_Forager_VON	0.09929497	•	72.8	24.6	2.6	0.0
SIM_Don_Forager_PROG	0.08850837	•	63.4	30.6	2.2	3.8
Average	0.07722227	•	62.8	33.0	3.3	0.9

The simulated coordinate using Progress as the subtracted source seems to be a great match. A small amount of Trypillia ancestry showed up in the simulated coordinates, but I doubt that is real. It just seems to be a side effect of the subtraction method. Without the Trypillia reference I got this:

Target: SIM_Don_Forager_PROG

Distance: 8.8643% / 0.08864262

61.2 RUS_Karelia_HG

32.0 GEO_CHG

6.8 UKR_N

Target: DonForager_Sim_VON

Distance: 9.9576% / 0.09957630

73.4 RUS_Karelia_HG

26.0 GEO_CHG

0.6 UKR_N

Using the Global25 West Eurasia PCA, here are the positions of simulated coordinates compared to a bunch of relevant ancient genomes:

Coordinates (scaled):

SIM_Yamnaya_minus_Trypilia,0.12572772,0.08173409,0.04288392,0.13042565,-0.03831749,0.0522349,0.00369911,-0.00278374,-0.06393434,-0.08486282,0.00268118,0.0003042,-0.00193857,-0.02596296,0.04054442,0.01584923,-0.00501566,-0.00306472,-0.005586,0.01567796,-0.00318016,-0.00008071,0.01168853,0.02206198,-0.00400613

SIM_Don_Forager_PROG,0.13706294,0.08781118,0.06955134,0.1450548,-0.02231748,0.0548273,0.00669322,0.00274002,-0.05434218,-0.08589664,-0.00178264,-0.0010401,-0.00030914,-0.02687892,0.04973734,0.02897996,0.00235518,-0.00277194,-0.0089725,0.02460292,0.00025268,-0.00312892,0.01357906,0.03797896,-0.00513826

SIM_Don_Forager_VON,0.13877044,0.08222618,0.08275084,0.1629823,-0.01723898,0.0584528,0.00175822,0.00527852,-0.04830868,-0.08425664,0.00081536,-0.0025386,-0.00432314,-0.03472292,0.04308684,0.01923546,0.00418068,-0.00283544,-0.010544,0.02597792,-0.01459532,0.00095158,0.01425706,0.05195596,0.00156774
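If you want to work with rows like these yourself, the sketch below parses a comma-separated coordinate line and computes the Euclidean distance between two samples. It assumes the "Distance" values reported in the tables here are Euclidean distances in this same space, between a target and its fitted mixture. The sample names and the shortened three-value rows are placeholders; paste in the full 25-value rows to compare real samples.

```python
import math

def parse_g25(line):
    """Split 'label,v1,v2,...' into (label, list of floats)."""
    label, *values = line.strip().split(",")
    return label, [float(v) for v in values]

def distance(a, b):
    """Euclidean distance between two equal-length coordinate vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Placeholder rows with only three dimensions (hypothetical values)
name_a, vec_a = parse_g25("Sample_A,0.13,0.08,0.06")
name_b, vec_b = parse_g25("Sample_B,0.10,0.04,0.06")

print(name_a, name_b, round(distance(vec_a, vec_b), 4))
```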

Now the interesting part is to gauge how much the amount of “Don forager” ancestry varied across Western Steppe Herder genomes. I chose two Yamnaya clusters, the Afanasievo samples from the Russian Altai and several early Corded Ware samples that seem maximized in WSH ancestry.

Yamnaya from Samara:

Target	Distance		SIM_Don_Forager_PROG	RUS_Progress_En	UKR_Trypillia	UKR_N
Yamnaya_RUS_Samara:I0429	0.02350929	•	57.8	36.8	5.4	0.0
Yamnaya_RUS_Samara:I0439	0.02245629	•	51.2	35.8	13.0	0.0
Yamnaya_RUS_Samara:I0370	0.02138359	•	49.4	41.8	8.8	0.0
Yamnaya_RUS_Samara:I0444	0.02890991	•	45.8	43.6	7.4	3.2
Yamnaya_RUS_Samara:I0438	0.02161781	•	44.0	51.0	5.0	0.0
Yamnaya_RUS_Samara:I7489	0.02621860	•	38.2	54.4	7.4	0.0
Yamnaya_RUS_Samara:I0443	0.02665377	•	37.0	53.4	8.4	1.2
Yamnaya_RUS_Samara:I0231	0.01831591	•	36.2	51.2	7.4	5.2
Yamnaya_RUS_Samara:I0357	0.01835266	•	34.2	50.4	13.0	2.4
Average	0.02304643	•	43.8	46.5	8.4	1.3

Yamnaya from Kalmykia:

Target	Distance		SIM_Don_Forager_PROG	RUS_Progress_En	UKR_Trypillia	UKR_N
Yamnaya_RUS_Kalmykia:RISE546	0.05267381	•	48.8	45.6	5.6	0.0
Yamnaya_RUS_Kalmykia:RISE240	0.03285146	•	46.8	39.8	13.4	0.0
Yamnaya_RUS_Kalmykia:RISE550	0.02225610	•	45.0	41.4	13.0	0.6
Yamnaya_RUS_Kalmykia:RISE552	0.02609195	•	34.2	58.6	7.2	0.0
Yamnaya_RUS_Kalmykia:RISE547	0.02992127	•	33.2	51.6	10.2	5.0
Average	0.03275892	•	41.6	47.4	9.9	1.1

High steppe_EMBA early Corded Ware:

Target	Distance		SIM_Don_Forager_PROG	RUS_Progress_En	UKR_Trypillia	UKR_N
Corded_Ware_CZE_early:OBR003	0.01997238	•	68.2	14.4	17.4	0.0
Corded_Ware_Baltic_early:Plinkaigalis242	0.02452369	•	58.4	20.6	16.0	5.0
Corded_Ware_CZE_early:PNL001.merged	0.02816527	•	57.0	31.8	10.2	1.0
Corded_Ware_Baltic_early:I4629	0.04521042	•	53.2	32.2	8.4	6.2
Corded_Ware_CZE_early:VLI076	0.02573414	•	48.4	35.2	12.6	3.8
Corded_Ware_CZE_early:VLI090.A0101	0.03456742	•	47.2	29.6	19.6	3.6
Corded_Ware_Baltic_early:Gyvakarai1_10bp	0.02425551	•	45.4	29.8	19.0	5.8
Corded_Ware_CZE_early:VLI007.merged	0.05436050	•	41.6	35.0	15.6	7.8
Average	0.03209867	•	52.4	28.6	14.8	4.1

Afanasievo from the Altai region:

Target	Distance		SIM_Don_Forager_PROG	RUS_Progress_En	UKR_Trypillia	UKR_N
RUS_Afanasievo:I6711	0.04005041	•	67.4	30.6	2.0	0.0
RUS_Afanasievo:I3387	0.03865066	•	56.2	39.8	4.0	0.0
RUS_Afanasievo:I10565	0.02253946	•	51.0	41.8	7.2	0.0
RUS_Afanasievo:I2069	0.02543367	•	50.0	44.8	5.2	0.0
RUS_Afanasievo:I5278	0.03744332	•	47.4	48.4	4.2	0.0
RUS_Afanasievo:I5271	0.03669026	•	45.4	51.4	3.2	0.0
RUS_Afanasievo:I2071	0.02809724	•	45.2	45.2	9.6	0.0
RUS_Afanasievo:I1829	0.02726509	•	43.2	46.0	8.2	2.6
RUS_Afanasievo:I5270	0.03632696	•	40.4	46.4	9.8	3.4
RUS_Afanasievo:I5273	0.02599122	•	40.0	48.2	11.8	0.0
RUS_Afanasievo:I3952	0.02501622	•	38.2	53.4	8.4	0.0
RUS_Afanasievo:I11752	0.02823817	•	37.0	55.4	6.0	1.6
RUS_Afanasievo:I5269	0.02266377	•	36.6	52.6	10.8	0.0
RUS_Afanasievo:I3388	0.01976090	•	35.8	52.6	11.6	0.0
RUS_Afanasievo:I6713	0.02689139	•	35.6	55.2	9.2	0.0
RUS_Afanasievo:I5277	0.03757972	•	35.4	58.8	4.6	1.2
RUS_Afanasievo:I5272	0.03390484	•	33.2	56.8	10.0	0.0
RUS_Afanasievo:I3954	0.02841254	•	29.6	65.4	4.0	1.0
RUS_Afanasievo:I6715	0.02684281	•	28.2	64.4	6.4	1.0
RUS_Afanasievo:I3950	0.03422087	•	28.0	63.6	6.0	2.4
RUS_Afanasievo:I5279	0.04567883	•	26.4	67.0	3.0	3.6
RUS_Afanasievo:I10564	0.03384626	•	25.2	66.6	8.2	0.0
RUS_Afanasievo:I11112	0.03851630	•	21.4	66.8	10.8	1.0
Average	0.03130700	•	39.0	53.1	7.1	0.8

The range in the ratio of “Steppe_Eneolithic” to “Don forager” ancestry is what immediately stood out to me, and I actually found it a bit surprising. It should be said, though, that a small difference in actual CHG ancestry would make a large difference in the ancestry assigned to Progress_En and the simulated coordinate, as these are the only two references containing CHG ancestry, one having less than the samples above and the other more.
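The sensitivity is easy to see with a toy two-source calculation. The sketch assumes the low-CHG reference sits at roughly 30% CHG and the high-CHG one at roughly 50%, the round numbers used earlier in this post, and it ignores the farmer and UKR_N components entirely. The function name is hypothetical.

```python
def share_of_high_source(target_chg, low_chg, high_chg):
    """Fraction of the high-CHG source needed to hit target_chg in a two-way mix."""
    if not low_chg < high_chg:
        raise ValueError("low_chg must be below high_chg")
    return (target_chg - low_chg) / (high_chg - low_chg)

for chg in (0.38, 0.40, 0.42):
    share = share_of_high_source(chg, low_chg=0.30, high_chg=0.50)
    print(f"target CHG {chg:.0%} -> {share:.0%} high-CHG source")
```

Under these assumptions a two-point change in a sample's CHG-related ancestry moves the split between the two references by ten points, so a 20-point spread between individuals could correspond to a difference of only about four points of actual CHG.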

The interesting thing is that while Corded Ware samples seem to carry the highest amount of this simulated “Don forager” component, some Afanasievo and Yamnaya samples are fully within the same range; other individuals in those groups are shifted towards Progress_En, which brings the average down. This also seems consistent with early Corded Ware samples having a bit less CHG ancestry than the Yamnaya samples, on average.

If this degree of variation is replicable with different sources, we have a somewhat interesting scenario on our hands. One explanation could be that there was a genetic cline between the “Don forager” cluster and the “Steppe_En” cluster, with the early Yamnaya, Corded Ware and Afanasievo samples originating from various points on this cline, which would explain the variety among the samples. That being said, some of these Steppe_En samples predate the aforementioned material cultures by only a few centuries, and you can imagine that when the Yamnaya and Afanasievo had their eastward expansions they could have come across people with a Steppe_En profile and intermixed with them. It might be a combination of both factors that led to the variation seen above.

That said, I’m not reading too much into the actual percentages for now, but if something similar can be demonstrated with software such as qpAdm when the data from this article is finally released, that would of course be interesting.

In the meantime if anyone wants to help me out, calculating the amount of EHG/WHG/CHG/EEF ancestry in these samples would be helpful for me to figure out if the variation seen here is genuine, or if small discrepancies in CHG ancestry are creating a bit of a mirage:

I0429
I0357
OBR003
PNL001
VLI007
I6711
I5270
I11112

For what it is worth, a well-connected friend of mine got his hands on an unpublished Sredny Stog sample from the eastern banks of the Dnieper river dating to 4340-4178 calBCE. Unfortunately this sample does not have Global25 coordinates yet, but my mate converted the raw files into K13 results, which were then converted into Global25 coordinates, I assume through Genoplot. So I wonder what the degree of accuracy is here, but what the hell, who cares, right?

Target: SrednyGirl_I2108-K13-sim_scaled

Distance: 6.3291% / 0.06329125

52.2 RUS_Karelia_HG

37.8 GEO_CHG

10.0 UKR_Trypillia

0.0 UKR_N

Target: SrednyGirl_I2108-K13-sim_scaled

Distance: 1.5542% / 0.01554206

51.4 RUS_Progress_En

38.4 SIM_Don_Forager_PROG

10.2 UKR_Trypillia

0.0 UKR_N

So this 5th millennium BC lass, from the early period of the Proto-Indo-European language, is more or less identical to the Yamnaya, Corded Ware and Afanasievo samples from a thousand years later, and her split between the “Simulated Don forager” and “Steppe_Eneolithic” references is also within their range.

Coordinates:

SrednyGirl_I2108-K13 sim_scaled,0.1233,0.0904,0.0368,0.1115,-0.0274,0.0393,0.0048,-0.0006,-0.0572,-0.0687,0.0033,0.0016,-0.0052,-0.018,0.0364,0.0101,-0.0085,-0.0009,-0.0031,0.0098,-0.0026,0.0012,0.0109,0.0256,-0.0044

This article will be an absolute banger when it comes out, and aside from this segment, which is relevant to the origins of the Proto-Indo-European language, there is a lot more to look forward to. I mean, over 300 samples and extensive use of IBD clustering with these samples is nothing to scoff at.

One example is the findings for the Scandinavian Late Neolithic and Bronze Age, with the high IBD sharing between the Nordic Bronze Age samples carrying I1 lineages and Iron Age Germanic samples. However, given that Germanic itself is an Iron Age expansion rather than a Bronze Age one, and that the expansion was carried by people most certainly not limited to I1, this finding in itself does not “solve” the question of Germanic origins, but it definitely puts us closer.

The Neolithic and Bronze Age samples from the Altai and the Iron Age sample from the Volga are also on my “can't wait until they are published” list; the latter will, in my opinion, have some implications for the spread and genetic formation of Uralic speakers and their genetic profiles during the Late Bronze and Early Iron Age. But I will hopefully cover that in due time; I might know of a discussion topic that could cover both of those locations in one swoop.

12 comments:

Anonymous, May 16, 2022 at 10:01 AM
Is there any chance ydna K2b/PQR are not originally from an Eastern Eurasian population. Like I remember UstIshim being in between East and West Eurasians although closer to the latter. Could these lineages be from that type of population or even a Zlaty Kun type population?
Anonymous, June 10, 2022 at 8:34 AM

@Ganesh

Thanks. So it is definite that Tianyuan-type people were the source of K2b/P carriers, as opposed to something more western or Ust-Ishim-like?