Genetic composition of Proto-Uralic speakers?

Jaska

There are many people claiming that the Nganasans are the most similar to the Proto-Uralic speakers, but that is simply not true. It is based on serious misunderstandings. Here I show the mistakes of such a claim.

1. We cannot just arbitrarily decide that the Proto-Uralic speakers were 100 % of the Yakutia ancestry, that would be absurd. In the Uralic populations, the Siberian ancestries diminish toward the west, and the European ancestries diminish toward the east. The scale for the genetic composition for the Uralic populations is from 100 % of European ancestry to 100 % of Siberian ancestry. Depending on where the linguistic results locate the Proto-Uralic homeland, the genetic composition of its speakers changes accordingly (assuming that the genetic composition even could be deduced from the genetic compositions of modern Uralic speakers, which is uncertain).

2. We do not even know the genetic composition of the Proto-Uralic speakers, so we have nothing to compare to. The only scientific way is to accept the linguistic results, and the best-argumented view locates Late Proto-Uralic in the Central Ural Region. https://journal.fi/fuf/article/view/120910 At the moment we have no ancient DNA from the right region at the right time. Therefore, we cannot yet know the genetic composition of the Proto-Uralic speakers.

3. The Nganasan language is quite a recent newcomer from the south, because Proto-Samoyedic was spoken in the Sayan Region some 2000 years ago. The Nganasans also differ clearly from other Samoyedic populations, which shows that they have assimilated the original non-Uralic population of the Taimyr Peninsula. This is confirmed by the fact that they are genetically rather close to the Dolgans and the Tundra Yukaghirs (see Zeng et al. 2023: https://www.biorxiv.org/content/10.1101/2023.10.01.560332v2 ).

4. The history of the Nganasan language consists of two long-distance movements: first Ancient Uralic from the Urals to the Sayan Region, and second Proto-Samoyedic from the Sayan Region to the Taimyr Peninsula. In both stages the genetic composition of the language carriers has changed considerably, as we can see by comparing the Uralic populations: Samoyedic populations are different from the more western populations, and the Nganasans are different from other Samoyedic populations. Therefore, it is utterly impossible that the Nganasans could have preserved the genetic composition of the Proto-Uralic speakers.

To conclude: it is irrational and unscientific to claim that the Nganasans are the most similar population to the Proto-Uralic speakers, when they are among the most dissimilar Uralic populations. Such a grave error is based only on an unscientific belief that one can ignore the linguistic results and see language directly from DNA. Therefore, it is no wonder that the same people can also claim that Proto-Uralic was spoken in North-Central or even in Northeastern Siberia.
 
Note that Nganasans have also been identified as the most representative of the “Uralic” component in the DNA of Uralic people, which is also a strong distinguishing factor. This may be due to the fact that they are simply the least admixed people of the Uralic groups, having been pushed northward by other groups instead of staying put and mixing.

There is a good summary of the evolution of theories on Wikipedia: Proto-Uralic homeland.

In general, there are two factors we need to consider:

  • The age of the Proto-Uralic language, estimated at 4k-9k ybp;
  • Our knowledge of migration/admixture times and directions.

Basically, we have a range of locations west, east and south of the Ural mountains.

So, as to the genetic composition, one just needs to backtrack the admixture events in those areas since about 9k ybp :)

 
Here is an image from https://pmc.ncbi.nlm.nih.gov/articles/PMC5204334 (2017)

IMG_1720.jpeg


Geographical dispersion model. The approximate model of gene flows reflecting divergence of primary clades in Europe, East Asia, and Siberia with approximate divergence times based on Chromosome Y, MSMC, TreeMix, and D-statistics analyses. Percentages represent proportions of ancestry contributed to each lineage from other lineages. The common ancestors of Mansi, Khanty, and Nenets had derived 57% of their ancestry from an ANE population, related to MA-1, AG-2, and Native Americans and 43% from the admixture with a population related to Evens and Evenks. Native Americans trace 42% of their ancestry to ANE and 58% to a common ancestor of Eastern Siberians and East Asians. In the figure, 45 kya represents the split between Y-Chromosome clades QR and NO; 33 kya represents the split between Y-Chromosome Q and R clades and mtDNA N2 and W clades; 17 kya represents Mansi-Andean separation time; 44 kya represents the split of Ust’-Ishim's haplogroup from NO clade; 19 kya represents Han-Andean separation time; 10 kya represents Han-Eastern Siberian separation time; and 5.3–9.9 kya represents Evenk-Mansi MSMC separation time (6.8–9.9 kya) and expansion of N1c1 Y-haplogroup (5.3–7.1 kya). In this figure, Europeans represent a collective term for the following populations from this study: Komi, Veps, Karelians, Russians, and Belarussians.
 
Traveller:
Note that Nganasans have also been identified as the most representative of the “Uralic” component in the DNA of Uralic people, which is also a strong distinguishing factor. This may be due to the fact that they are simply the least admixed people of the Uralic groups, having been pushed northward by other groups instead of staying put and mixing.

Yes, there is a weird blind spot: why should the original population even be unmixed? Of course it could already have been an admixed population. Apparently, this fixation on the Siberian ancestry component is partly because this ancestry is what mainly separates the westernmost Uralic populations from their non-Uralic neighbors.

But that does not automatically mean that the same component separates the easternmost Uralic speakers from their neighbors. According to qpAdm results by Zeng et al. 2023, east of the Uralic speakers the China ancestry and the Tyumen ancestry appear as the distinguishing components, as well as a (nearly) total lack of EHG ancestry. So, indeed a weird blind spot for some geneticists to fixate only on the Nganasan/Yakutia-like ancestry...

And as there is a Uralic continuum between close to 0 % Siberian ancestry in the west and close to 100 % in the east, why should one jump directly to the other end, when the original composition can be anywhere between the two ends? Only linguistic results can help to narrow down the possibilities.

Traveller:
Basically, we have a range of locations west, east and south of the Ural mountains.
So, as to the genetic composition, one just needs to backtrack the admixture events in those areas since about 9k ybp :)

Is it not quite difficult to back-track the development in the Central Ural Region, when we do not yet have ancient samples from there from the third and second millennia BC, when the admixture apparently occurred? Modern samples cannot give us such precision concerning the time of admixture and the proportions of ancestry components.
 
The Samoyedic languages (including Nganasan) are also considered to be a very old branch off proto-Uralic.

See Wikipedia citing the phylogenetic modelling of Honkola et al. 2013.

Considering that these people currently inhabit the Arctic coast, it’s hard to see them somehow interacting a lot with different cultures, as opposed to most other Uralic people that have mixed with other ethnicities. It is much more plausible that they have migrated north due to pressure from other groups and largely retained their historic genetic composition.

The fact that the Nganasan don’t display ancestry from, say, EHG, also speaks for this. And, as Zeng demonstrated, there is huge overlap with the Yakutia_LNBA from 4.5 kybp that they consider the ancestral population to the Uralic people, meaning that there are few “other” ancestors.
 
Yes, there is a weird blind spot: why should the original population even be unmixed? Of course it could already have been an admixed population. Apparently, this fixation on the Siberian ancestry component is partly because this ancestry is what mainly separates the westernmost Uralic populations from their non-Uralic neighbors.

Of course each population is a mixture of its ancestral populations. But you have to keep in mind that you can only add these, not remove some genes from the gene pool (though old minority ancestries do get diluted).

The Siberian ancestry is carried by all the Uralic people, even those that haven’t had much contact with other Siberian people for thousands of years. So it has gotten diluted by other contacts post-Siberia and other migrations into the areas the Uralic people inhabit.
 
Interestingly, the PCA for Yakutia_LNBA overlaps with the Khatystyr Cave sample from 10 kybp. That location is much further east in Siberia.
 
The Samoyedic languages (including Nganasan) are also considered to be a very old branch off proto-Uralic.

See Wikipedia citing the phylogenetic modelling of Honkola et al. 2013.

Considering that these people currently inhabit the Arctic coast, it’s hard to see them somehow interacting a lot with different cultures, as opposed to most other Uralic people that have mixed with other ethnicities. It is much more plausible that they have migrated north due to pressure from other groups and largely retained their historic genetic composition.

The fact that the Nganasan don’t display ancestry from, say, EHG, also speaks for this. And, as Zeng demonstrated, there is huge overlap with the Yakutia_LNBA from 4.5 kybp that they consider the ancestral population to the Uralic people, meaning that there are few “other” ancestors.

1. Computational phylogenetics/phylolinguistics (as they now like to say) has many problems in its methods, so its datings cannot challenge the traditional dating methods of historical linguistics. Proto-Uralic disintegration began ca. 2500 BCE, and Samoyedic also remained without branch-specific sound changes well into the 2nd millennium BCE, starting to spread to the east only ca. 1500 BCE.

2. As I wrote, the Nganasans are different from even other Samoyeds, so the language carriers have clearly assimilated local population in the Northern Siberia. Even the Proto-Samoyedic population in the Sayan Region was not 100 % of Yakutia ancestry, if this is what you meant - and even much less the Proto-Uralic speakers in the Central Ural Region.

3. You: "And, as Zeng demonstrated, there is huge overlap with the Yakutia_LNBA from 4.5 kybp that they consider the ancestral population to the Uralic people, meaning that there are few “other” ancestors."
-- What are you trying to say here? As I wrote earlier, it is a blind spot to stare only at the Yakutia ancestry. According to Zeng et al., several other ancestries are widespread in the Uralic populations.
 
Of course each population is a mixture of its ancestral populations. But you have to keep in mind that you can only add these, not remove some genes from the gene pool (though old minority ancestries do get diluted).

The Siberian ancestry is carried by all the Uralic people, even those that haven’t had much contact with other Siberian people for thousands of years. So it has gotten diluted by other contacts post-Siberia and other migrations into the areas the Uralic people inhabit.

Of course we can also remove ancestries - it happens by admixing with populations which have very little or none of a certain ancestry.

Siberian ancestry probably took part somehow in the Uralic western expansion, or else it arrived a bit earlier or a bit later. We do not know that yet, and you cannot just decide that it spread together with the Uralic languages. Distribution alone cannot prove that; the timing of the expansion is also a crucial piece of evidence.

There are of course dozens of possible migrations from east to west and from west to east. Uniparental markers already prove that. Only one of these many migrations spread the Uralic language. You cannot just decide which one it was.
 
1. Computational phylogenetics/phylolinguistics (as they now like to say) has many problems in its methods, so its datings cannot challenge the traditional dating methods of historical linguistics. Proto-Uralic disintegration began ca. 2500 BCE, and Samoyedic also remained without branch-specific sound changes well into the 2nd millennium BCE, starting to spread to the east only ca. 1500 BCE.

This is putting one theory against another. If Nganasan and Enets started their migration north/east from central Ural, how do you explain their Mongolia_N ancestry while lacking China_YR ancestry that the Tungusic groups have?

Also note Yukagirs far to the east whose language displays ancient contact with proto-Uralic.

2. As I wrote, the Nganasans are different from even other Samoyeds, so the language carriers have clearly assimilated local population in the Northern Siberia.

I doubt if there was anybody to assimilate.

The Zeng qpAdm models project that the Nganasan derive ca 70% of their ancestry from Yakutia_LNBA and 20% from Mongolia_N. See your own composition from muinaissuomi:

Zeng23_qpAdm_Uralic.png


Of course we can also remove ancestries - it happens by admixing with populations which have very little or none of a certain ancestry.

That is diluting other ancestries, but doesn’t make them disappear. I’ve never noticed anyone claiming that all trace of a parental population can be removed. Can you give an example?

Siberian ancestry probably took part somehow in the Uralic western expansion, or else it arrived a bit earlier or a bit later. We do not know that yet, and you cannot just decide that it spread together with the Uralic languages. Distribution alone cannot prove that; the timing of the expansion is also a crucial piece of evidence.

You don’t believe there were Siberian people speaking Uralic languages that migrated westwards?

Even the Proto-Samoyedic population in the Sayan Region was not 100 % of Yakutia ancestry, if this is what you meant - and even much less the Proto-Uralic speakers in the Central Ural Region.

3. You: "And, as Zeng demonstrated, there is huge overlap with the Yakutia_LNBA from 4.5 kybp that they consider the ancestral population to the Uralic people, meaning that there are few “other” ancestors."
-- What are you trying to say here? As I wrote earlier, it is a blind spot to stare only at the Yakutia ancestry. According to Zeng et al., several other ancestries are widespread in the Uralic populations.

I’m trying to say that Nganasan and tundra Yukagirs even more so are direct descendants of the Yakutia_LNBA people and have had very little recent mixing with other groups.

To take it further, it’s possible they are among the last remnants of the original inhabitants of central Siberia, pushed to the fringes by the Tungus expansion.
 
Traveller:
“This is putting one theory against another. If Nganasan and Enets started their migration north/east from central Ural, how do you explain their Mongolia_N ancestry while lacking China_YR ancestry that the Tungusic groups have?”

There is one dating leaning on solid historical linguistic evidence, and then there is another theory based on calculations based on word counting alone. The latter is no match for the former. Read the link above and again below.

As I wrote, Samoyeds did not spread east or northeast but to the southeast. Proto-Samoyedic homeland was in the Sayan Region, in Southern Siberia. This is undeniable: there are old loanwords from Iranian, Turkic and Yeniseian in Proto-Samoyedic. The Mongolia ancestry came along in there. Only ca. 2000 years ago the northward expansion of Samoyedic began. Nganasan cannot have been spoken in Taimyr much longer than one millennium.

Traveller:
“Also note Yukagirs far to the east whose language displays ancient contact with proto-Uralic.”

Tundra Yukaghirs probably represent the older local population, which later in Taimyr adopted a Samoyedic language and admixed with Proto-Samoyedic speakers coming from the south. We know that Yukaghir had contacts with Pre-Proto-Samoyedic, but beyond that, there are no conclusive results about the relationship of Uralic and Yukaghir.

Traveller:
“I doubt if there was anybody to assimilate.”

Of course there were people to assimilate everywhere; the lands have never been empty since the Ice Age. Just compare the shared ancestries between Nganasans, Dolgans, and Tundra Yukaghirs: at least the first two of these populations are to a large extent language shifters, because Samoyedic and Turkic are late newcomers that far north.

Traveller:
“The Zeng qpAdm models project that the Nganasan derive ca 70% of their ancestry from Yakutia_LNBA and 20% from Mongolia_N. See your own composition from muinaissuomi:”

Nganasans have at least four different ancestries, including the European one. So, what are you trying to say?

Traveller:
“That is diluting other ancestries, but doesn’t make them disappear. I’ve never noticed anyone claiming that all trace of a parental population can be removed. Can you give an example?”

If there are enough steps of movement and enough time, that is possible. Modern Hungarians do not have any Yakutia ancestry, even though the ancient Hungarian conquerors still had it 1000 years ago. And the “Proto-Hungarians” in the Ural Region had still more of it.
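
The dilution point in this exchange is simple arithmetic, and a small Python sketch may make it concrete. It assumes a deliberately simplified model: every admixture pulse replaces a fixed share of the gene pool with people who carry none of the ancestry in question. The function names and all numbers are hypothetical and are not taken from Zeng et al. or any other study cited in the thread.

```python
def remaining_fraction(initial, incoming_share, events):
    """Ancestry fraction left after `events` admixture pulses.

    Each pulse replaces `incoming_share` of the gene pool with people who
    carry none of the ancestry in question (an assumption of this sketch).
    """
    return initial * (1.0 - incoming_share) ** events


def events_until_below(initial, incoming_share, threshold):
    """Count the pulses needed before the fraction drops under `threshold`."""
    if not 0.0 < incoming_share < 1.0:
        raise ValueError("incoming_share must be strictly between 0 and 1")
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    fraction = initial
    events = 0
    while fraction >= threshold:
        fraction *= 1.0 - incoming_share
        events += 1
    return events


# Hypothetical numbers, chosen only for illustration.
start = 0.30            # share of the ancestry in the source population
share_per_pulse = 0.5   # half of the gene pool replaced at every pulse
detection_limit = 0.01  # assumed limit below which a method reports nothing

print(remaining_fraction(start, share_per_pulse, 10))
print(events_until_below(start, share_per_pulse, detection_limit))
```

In this model the fraction never reaches exactly zero, but it shrinks geometrically, so after a handful of pulses it can fall below whatever a given method is able to detect. Whether that counts as "removed" or merely "diluted" is the wording question being argued here.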

Traveller:
“You don’t believe there were Siberian people speaking Uralic languages that migrated westwards?”

Siberian people? The Tundra Nenets were Siberian people who spoke a Uralic language and spread to the west after 1000 CE, reaching the Kanin Peninsula in 500 years or so. But Proto-Uralic speakers were not Siberian people, because their homeland was in the Central Ural Region.
https://journal.fi/fuf/article/view/120910

Distant Pre-Proto-Uralic might have been spoken in Siberia, or in Europe. There are contradicting linguistic results, so it is impossible to choose between those views – they might all be wrong. Indo-Uralic features stand against Uralo-Siberian, Ural-Altaic etc. features. They cannot all be true… Unless distant Pre-Proto-Indo-European also developed in Siberia.

Traveller:
“I’m trying to say that Nganasan and tundra Yukagirs even more so are direct descendants of the Yakutia_LNBA people and have had very little recent mixing with other groups.”

Even Tundra Yukaghirs have a little Srubnaya ancestry, but they seem to mainly represent the original genetic population there. But Nganasans have at least four different ancestries (including some Srubnaya), so how can you say that there is very little recent admixture? It is easy to model Nganasans as an admixture of Proto-Samoyedic population coming from the south and local North Siberian population shifting their language.
 
Traveller:
“That is diluting other ancestries, but doesn’t make them disappear. I’ve never noticed anyone claiming that all trace of a parental population can be removed. Can you give an example?”

If there are enough steps of movement and enough time, that is possible. Modern Hungarians do not have any Yakutia ancestry, even though the ancient Hungarian conquerors still had it 1000 years ago. And the “Proto-Hungarians” in the Ural Region had still more of it.

In Hungary it was only the conquering elite that had Siberian ancestry. The greater population was and remained central European.

 
See also Grünthal et al. 2022 on arguments for a Siberian Proto-Uralic homeland and the role of Uralic people as ST traders and miners in the wider context of the 4.2 kya event.


There are no valid arguments at all concerning LATE Proto-Uralic, they are all considered in the article I linked several times. Please read it carefully. After that, if you still can present even one valid argument requiring the location of Late Proto-Uralic in Southern Siberia, you are welcome to show it here.
 
There has been previous discussion on a type of autosomal drift, similar in behavior to Balto-Slavic drift, common to all Uralic-speaking populations, and which is especially pronounced in Mari and Chuvash people.

Using the distance measure on the available Mari samples, you'll essentially get a list of Uralic-speaking populations, the few exceptions being recently turkified or russified populations.

Code:
Distance to:    Mari:GRC11056594_Mari08
0.01767121    Mari
0.03858502    Chuvash
0.06615014    Udmurt
0.06889698    Besermyan
0.07499431    Khanty_o1
0.07542167    Tatar_Kazan
0.07623687    Saami
0.07793386    Komi_A
0.07917562    Tatar_Siberian_Zabolotniye
0.07968504    Bashkir
0.08094375    Saami_Kola
0.08110366    Russian_Leshukonsky
0.08157488    Tatar_Siberian
0.08238163    Komi_B
0.08285777    Mansi
0.08328940    Tatar_Mishar
0.08610433    Russian_Pinezhsky
0.08614224    Russian_Pinega
0.08898766    Finnish_East
0.08934750    Vepsian
0.08999249    Karelian
0.09004969    Russian_Kostroma
0.09051361    Russian_Krasnoborsky
0.09053628    Khanty
0.09056870    Moksha

However, in this case you really need to make sure to use unscaled coordinates, because the drift is concentrated in the higher principal components (especially PC19, PC20 and PC23), which are strongly penalized in the scaled version of G25. Otherwise the results will be only about as accurate as using Nganasan as a reference.
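
To illustrate why the scaled/unscaled choice matters for a list like the one above, here is a minimal Python sketch. It assumes that the "Distance to:" figures are plain Euclidean distances between coordinate vectors, and that scaling multiplies each principal component by a weight that shrinks for higher components. The three-component toy vectors, the weights and the population names `pop_A` and `pop_B` are hypothetical; real G25 coordinates have 25 components and their own scaling factors.

```python
import math


def euclidean_distance(a, b):
    """Plain Euclidean distance between two equal-length coordinate lists."""
    if len(a) != len(b):
        raise ValueError("coordinate vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def apply_scaling(coords, weights):
    """Multiply every component by its weight (hypothetical scaling)."""
    return [c * w for c, w in zip(coords, weights)]


def rank_by_distance(target, references, top_n=25):
    """Return the `top_n` closest (name, distance) pairs, nearest first."""
    ranked = sorted(
        ((name, euclidean_distance(target, coords))
         for name, coords in references.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:top_n]


# Toy data: pop_B shares the target's value on the highest component,
# pop_A shares the target's values on the lower components only.
target = [0.10, 0.02, 0.05]
references = {
    "pop_A": [0.10, 0.02, 0.00],
    "pop_B": [0.13, 0.02, 0.05],
}
weights = [3.0, 1.0, 0.2]  # assumed: higher components are down-weighted

unscaled_ranking = rank_by_distance(target, references)
scaled_ranking = rank_by_distance(
    apply_scaling(target, weights),
    {name: apply_scaling(c, weights) for name, c in references.items()},
)

print(unscaled_ranking)
print(scaled_ranking)
```

With these toy numbers the nearest population flips from `pop_B` (unscaled) to `pop_A` (scaled), because the signal carried by the last component is shrunk by its weight. The `top_n=25` default mirrors the 25-row list shown in the post.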

In practice, this component can be perfectly used to identify Uralic samples, both modern and ancient.
 
Quint:
"In practice, this component can be perfectly used to identify Uralic samples, both modern and ancient."

Wow, the correlation with the former Uralic regions is indeed striking, if no populations have been left off the list. Erzyas and Estonians are not there, but genome-wide they have been the most "Russian-like" Uralic populations. Also all Samoyedic populations are lacking, as well as Hungarians.

So, the correlation with the Uralic language family is not perfect; it matches one part of it well: the northwestern route from the Middle Volga to Finland and Lapland. The southwestern (Finnic) and southeastern (historically: Hungarian, Proto-Samoyedic) branches do not correspond as well.
 
Wow, the correlation with the former Uralic regions is indeed striking, if no populations have been left off the list. Erzyas and Estonians are not there, but genome-wide they have been the most "Russian-like" Uralic populations. Also all Samoyedic populations are lacking, as well as Hungarians.

Those were just the first 25 populations, which is the standard setting. Erzya rank 28th, Estonians are at position 51 and Hungarians at 53, with a host of other Finnic, Turkic and some other East European populations in between. Beyond that point, the results become more varied.

It seems Samoyedics don't appear to be close because their additional Siberian ancestry tends to counteract this Mari drift, which happens often in G25. But they have this ancestry as well.
 
It seems Samoyedics don't appear to be close because their additional Siberian ancestry tends to counteract this Mari drift, which happens often in G25. But they have this ancestry as well.

What is "this ancestry", and how can you tell that, if it does not show in that method?
 