
History Horse genetics, archaeology, and the beginning of riding

Tautalus

This paper by David Anthony, Martin Trautmann, and Volker Heyd is a major rebuttal to recent genetic studies (1,2,3,4) that argued horse domestication and horseback riding became historically significant only after ~2200–2100 BCE, meaning that the Yamnaya migrations were not carried out on horseback.

The authors argue instead that:
  • horse domestication was gradual and regionally diverse;
  • horses from several genetic lineages were already managed, milked, and ridden centuries earlier;
  • the Yamnaya culture probably used horseback riding during their massive Eurasian expansions;
  • and the later dominance of the DOM2 lineage reflects a successful breeding expansion, not the original invention of domestication.

The central disagreement

Recent genetics papers proposed that “true” domestication began only when horses in the DOM2 lineage acquired mutations linked to:
  • calmer temperament,
  • reduced fear,
  • and better endurance for riding.
Those studies suggested:
  • earlier horse management was limited “taming,”
  • widespread riding came later,
  • and Yamnaya migrations across Eurasia were driven mainly by wagons rather than mounted riders.
Anthony and colleagues reject this narrow definition of domestication.

Three horse populations

The paper identifies three major horse lineages active during the 4th millennium BCE:
  • DOM1 — Botai horses in Kazakhstan;
  • DOM2 — Pontic-Caspian steppe horses associated with Yamnaya peoples and ancestors of modern horses;
  • DOM3 — indigenous Central/Southeast European horses that may also have been locally domesticated.
The authors argue that all three show evidence of management before 2200 BCE.

Evidence from Botai (DOM1)

The Botai culture (~3500–3100 BCE) in Kazakhstan is presented as strong evidence for early horse domestication:
  • horses made up almost the entire diet;
  • evidence suggests corralling and manure disposal;
  • mare’s milk was likely consumed;
  • horse teeth show wear patterns consistent with rope bits used in riding.
The authors argue Botai horses were clearly domesticated rather than simply hunted wild animals.

Evidence for Yamnaya horse riding (DOM2)

The paper strongly argues that Yamnaya groups rode horses long before 2200 BCE.

The evidence includes:
  • horse milk proteins preserved in Yamnaya dental calculus;
  • horse bones in settlements, rituals, and graves;
  • human skeletal pathologies associated with habitual riding;
  • and genetics showing Yamnaya horses belonged to the DOM2 lineage ancestral to modern domestic horses.
The authors connect this to the enormous Yamnaya migrations across Eurasia between ~3200 and 2600 BCE, which transformed European ancestry and spread Indo-European languages. They argue that horseback riding plausibly contributed to this mobility.

Central and Southeast Europe (DOM3)

The paper also argues for horse management in Central and Southeast Europe before 3000 BCE:
  • horses became larger and more variable in size,
  • domesticated coat-color genes appeared,
  • and some horses may have been used for transport or riding.
The authors interpret this as evidence for local experimentation with horse domestication independent of the steppe.

What the paper says about the Corded Ware culture (CWC)

A major section of the paper focuses on the Corded Ware culture and whether steppe migrants brought ridden horses into Europe.

Recent genetics studies had argued:
  • CWC people carried massive Yamnaya-derived human ancestry into Europe,
  • but CWC horses supposedly lacked DOM2 steppe-horse ancestry,
  • therefore Yamnaya migrations were probably not driven by horse riders.
Anthony and colleagues dispute this conclusion.

The CWC horse sample problem

The claim rested mainly on four CWC horses from a ritual site at Hohler Stein near Schwabthal in Bavaria.

The paper argues:
  • the sample was extremely small,
  • the site was unusual and ritual in character,
  • and the horses may represent a single ceremonial event rather than normal CWC horse use.
Because of this, the authors argue these horses cannot represent all CWC horses across Europe.

Reanalysis of the genetics

The paper explains that later reanalysis of the same horse genomes showed:
  • several genetic models fit the data equally well,
  • some models actually suggest ~21% DOM2 ancestry in CWC horses,
  • and the earlier conclusion of “no DOM2 ancestry” is therefore uncertain.
The authors conclude that current genetic evidence:
  • does not prove CWC horses lacked steppe ancestry,
  • and therefore does not disprove Yamnaya horseback riding.

Horses in CWC society

Horse remains are regularly found at CWC settlements, though usually in small numbers.
The paper interprets this to mean:
  • horses were likely prestige animals,
  • potentially used for transport, riding, or elite status,
  • rather than ordinary livestock.

Genetics and the “domestication bottleneck”

The paper accepts the genetic evidence that DOM2 horses expanded dramatically after ~2200 BCE, but interprets it differently.
Instead of marking the beginning of domestication, the authors see this as:
  • a later intensification of breeding and selection,
  • a genetic bottleneck where one successful horse lineage became dominant,
  • and the culmination of a domestication process already underway for centuries.
They also note that genetic selection for riding-related traits appears to begin earlier, during or before the Yamnaya period.

Riding evidence from human skeletons

The paper strongly defends osteological evidence for early riding:
  • specific combinations of hip, thigh, and spinal changes in human skeletons are argued to reflect habitual horseback riding;
  • several pre-Yamnaya and Yamnaya individuals exhibit these traits.
The authors emphasize that no single skeletal marker proves riding, but multiple markers together make horseback riding the most plausible explanation.

Overall conclusion

The paper’s overall thesis is that:
  • horse domestication began well before 2200 BCE;
  • riding probably emerged during the 4th millennium BCE or earlier;
  • Yamnaya groups likely rode horses;
  • CWC evidence does not invalidate that interpretation;
  • and the later spread of DOM2 horses represents the success of one breeding lineage rather than the origin of domestication itself.



Abstract
Recent papers argued that the domestication of horses can be equated with the appearance of favorable genetic mutations that are first evident in individuals in the DOM2 clade dated about ∼2200–2100 BCE. We challenge the idea that this genetic shift alone defines domestication. Evidence from archaeology, ancient DNA, osteology, and other disciplines shows that horses from multiple genetic backgrounds (DOM1, DOM2, and, as we suggest here, DOM3) were managed, milked, and ridden long before 2200 BCE. Yamnaya groups (∼3200–2600 BCE) rode DOM2 horses—the direct ancestors of modern domestic stock—while incorporating them into diets, rituals, and mobility systems. Selection for traits linked to endurance and temperament began centuries earlier. Rather than a sudden breakthrough, domestication was a protracted, regionally varied process whose transformative effects on human mobility and social organization began as early as the fourth, if not the fifth millennium BCE, and set the stage for later DOM2 dominance.​

The general distributions of DOM1, DOM2, and DOM3 horses at ∼3500–3000 BCE, and locations of the main sites mentioned in the text.
Pre-Yamnaya DOM3 horse aDNA came from sites marked with a cross; Yamnaya and pre-Yamnaya humans with rider syndrome, as listed in Trautmann et al. (16), were from graves marked with a square; DOM2 Steppe Neolithic and Eneolithic horse aDNA samples (NEONCAS group) were from #15 to #19; DOM2 Steppe Yamnaya and related horses (CPONT and TURG groups) were from #19 and #22 to #23; the genetic ancestry of number #28 is unknown. Geographic Information System (GIS)-based map created by L. Vyazov. Sites: 1, Schwabthal-Hohler Stein; 2, Salzmünde; 3, Cham; 4, Stránská skála; 5, Pietrele; 6, Csepel; 7, Vetrino; 8, Medgidia; 9, Dévaványa; 10, Balmazújváros; 11, Malomirovo; 12, Strejnicu; 13, Blejoi; 14, Csongrád; 15, Deriivka; 16, Semenivka; 17, Varfolomeevka; 18, Oroshaemoe; 19, Turganik; 20, Khvalynsk; 21, Razdol’noe; 22, Aygurskii; 23, Repin; 24, Tsatsa; 25, Kriviansky IX; 26, Mykhailivka; 27, Botai; 28, Nizhnyaya Sooru; 29, Borly4. SE, Southeast.​
Honestly, I don’t think the article adds anything genuinely new to the origin of horse domestication. It looks more like yet another attempt to force the Yamnaya narrative to fit.

Not only does it fail to add new samples, but it also completely omits all the material introduced in the 2025 Iberian horse paper, and is essentially just a reinterpretation of Librado et al. (2021–2024) at a very poor technical level.


I’ve spent several hours studying the supplementary datasets from all these studies, and this is my summary and conclusions from all of them:

In these papers they constantly talk about “lineages,” but “DOM” is not a lineage — it is an autosomal admixture cluster, a region within the PCA.

Inside that cluster, all horses belong to haplogroups D*, DA*, DB*, and their descendants.
These are three major branches that split between 4500–3800 BC.

The horses they are now trying to redefine as “DOM3” mostly do not even belong to haplogroup D>. They are all extinct lineages that would phylogenetically fall between the P and D branches.

If those intermediate lineages were assigned a haplogroup, they would probably be classified as P2*, corresponding to the extinct Tarpans.

The claim that CWC horses had 20% DOM2 admixture is quite “enigmatic,” because in Librado’s paper even the IBE horses show more DOM2 than they do. According to two different studies, CWC horses carried between 0 and 5% DOM2 autosomal admixture.



The CWC people (R1a-M417) were not Yamnaya (R1b-Z2103). They belonged to a parallel cultural horizon and did not migrate on horseback. They moved using oxen pulling enormous solid-wheel wagons.

The ponies of that period were not suitable for such purposes, and even if they possessed domesticated horses, they would not have used them for that.

The Hungarian horses dated after 2500 BC do show haplogroups D and around 30% DOM2 autosomal admixture, but no modern horse descends from those lineages, and they disappeared after 1200 BC.

R1a groups never massively expanded beyond the Rhine, and Z2103 groups never massively expanded beyond the Balkans. In Czechia, which marks roughly their westernmost extent, they are only around 5% today.


The oldest known specimen with a high proportion of DOM2 autosomal admixture (around 60%) is:

Rocas12_Rom from 3800 BC (Romania).

P0>D>DA*.


That specimen is empirically the closest known example to the DA* bottleneck ancestral to all modern horses, both in terms of autosomes and Y-haplogroup.

All other haplogroups became marginalized after 1200 BC, when the Hispanic horses belonging to haplogroup D>DA1>DAC became the dominant sire lineage.

Today, the specimens CDL16_Spa_m100 (D2) and CL19x31_Por_m314 (D2) are the closest known representatives of the DAC haplogroup bottleneck from which all modern stallions descend.

Before this, the oldest documented DAC* specimen was one from Eastern Roman Anatolia around 500 AD.



In the Iberian horse study, one fact especially caught my attention.
The sacrificial horses from Casas del Turuñuelo dated to 500 BC (southern Extremadura, Tartessian culture) included around 40 equines gathered from multiple locations across Extremadura according to strontium analysis.

Among them were the earliest documented horses reaching 160 cm at the withers. All of them were 100% DOM2 and carried the haplogroup P0>P>D*.

In addition, one horse from Italy and another from England related to these specimens also carried this pre-DA* haplogroup, meaning they had no phylogenetic relationship with the steppe variants, which after 2000 BC were predominantly DA and DB.






Current horse Y-haplogroup frequencies:

IBE horses (semi-feral Iberian horses with ~10% DOM2)

P0* (extinct; some may possibly survive among northern feral horses with archaic phenotypes)

DOM1 horses (Botai)

P0>P>Pd (very minor, Przewalski horses)
P0>P>D (extinct)

DOM2 horses (Balkans and steppes)

P0>P>D (extinct)
P0>P>D>DA (extinct)
P0>P>D>DB (extinct)
P0>P>D>DA1 (minor but very widespread)

DOM3 horses (Eastern and Central Europe)

P0>P>P2* (extinct, associated with Tarpans)

Elite horses:
P0>P>D>DA>DA1>DA2>DA3>DAC (90% of modern horses)


According to Librado et al.’s supplementary data, horses lost roughly half of their total autosomal allelic diversity between 10,000 and 6000 BC.

This likely means that fully wild horses became effectively extinct during that period, and what survived was never the same again — from that point onward, all surviving horses were essentially “semi-wild” and mainly bred for meat.

In the Iberian Peninsula there is evidence of horse bits and mandibular wear marks dating back to the Neolithic.
The beginning of domestication is one thing; the later trajectory of sire lineages is something entirely different.



Summary of domestication phases:

Phase 0
<10,000 BC — fully wild horses.

Phase 1
(Semi-wild phase)
10,000–6000 BC — autosomal bottleneck reducing total allelic diversity by more than half in each specimen.

Phase 2
6000–2000 BC — earliest semi-domestication mainly for meat and milk.

Phase 3
First bottleneck replacing haplogroups D*, DA*, and DB* (99% of all modern horses)
2000–1000 BC — first war ponies standing 130–145 cm at the withers, used for chariots.

Phase 4
(Specialized domestic war horses)
After 1200 BC — first horses truly functional for warfare and direct riding, standing 150–160 cm at the withers.

Phase 5
(Elite horses)
Second bottleneck: D>DA>DA1>DAC (90% of all modern horses)

After 300 BC — earliest documented DAC haplogroup specimens, 155–165 cm, patriarchal lineage of all modern elite horses.

Sensationalist summary exaggerating empirical facts in order to provoke those still defending massive Yamnaya horse-rider conquest migrations between 3000–2500 BC — migrations that in reality never existed:

“The Hispanics domesticated the elite horse.”

“All modern elite horses descend from a single Iron Age horse from pre-Roman Hispania.”

“The patriarchal horse lineage that forged the Roman Empire was born in Hispania.”


More realistic summary:

“Current evidence points toward the Hispanics as the domesticators of the final elite horse phenotype, pending more Balkan, Aegean, and Italic samples.”

“Humans required at least 3,000 years of intense artificial selection on Y-lineages to transform archaic horses into the modern phenotype.”

“The origin of horse domestication was never in the steppes. Y-chromosome phylogeny points instead toward the eastern Balkans a thousand years earlier than proposed in previous papers, while the final bottleneck points toward Hispania.”
 

You are consistent and persistent in your opposition to the steppe migration theory, but the main weakness of many of your arguments lies, on the one hand, in the denial or minimization of the archaeological and genetic importance of the Pontic-Caspian steppe, and on the other hand, in the excessive importance given to the Y chromosome and its bottlenecks.

The conclusion that “the origin of horse domestication was never in the steppes” is not supported by the current archaeological and genetic evidence. The steppe evidence cannot be dismissed simply because later paternal bottlenecks occurred elsewhere. The Pontic-Caspian steppe still provides some of the earliest strong evidence for widespread horse management, riding-related pathologies, horse milk consumption, and the expansion of the DOM2 ancestry cluster that later dominates most domestic horses. Anthony, Trautmann, and Heyd argue that domestication and riding began before the later bottlenecks, not that modern elite horses already existed in Yamnaya times.

Your statement that “massive Yamnaya horse-rider conquest migrations never existed” is not supported either. The archaeological and genetic evidence overwhelmingly supports major Yamnaya-derived migrations across Europe. The real debate concerns the role horses played in those movements, not whether the migrations happened at all. Early horseback riding does not require large elite warhorses. Many highly mobile societies historically used relatively small horses very effectively. The Mongols under Genghis Khan are a classic example.

You overstate what Y-chromosome phylogeny can tell us about the origin of domestication itself. Y-lineages are extremely vulnerable to selective breeding bottlenecks because humans often breed from only a few favored stallions while continually incorporating mares from many populations. A later successful sire lineage therefore does not necessarily identify the original domestication center. The fact that many modern horses descend from later DA-derived lineages associated with Iberia does not prove that domestication originated there, any more than the dominance of a modern human Y-haplogroup identifies the birthplace of humanity. What it shows is that certain paternal lines became disproportionately successful later in history.
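To make the sire-bottleneck point concrete, here is a minimal toy simulation in Python. It is only a sketch under stated assumptions, not a model taken from any of the papers: the function name, herd sizes, number of breeding stallions and mares, and generation count are all hypothetical, and it ignores incoming mares from other populations, selection, and geography. Each generation a small number of stallions and a larger number of mares are picked at random as parents, and founder Y and mtDNA labels are passed down.

```python
import random


def simulate_lineages(n_sires=3, n_dams=60, n_males=100, n_females=100,
                      generations=30, seed=0):
    """Count surviving founder Y and mtDNA lineages in a managed herd.

    All parameter values are illustrative assumptions, not estimates
    from real horse populations.
    """
    rng = random.Random(seed)
    # Every founder male starts with a unique Y label,
    # every founder female with a unique mtDNA label.
    y_labels = list(range(n_males))
    mt_labels = list(range(n_females))
    for _ in range(generations):
        # Only a few favored stallions breed; many more mares do.
        sires = rng.sample(y_labels, n_sires)
        dams = rng.sample(mt_labels, n_dams)
        # Sons inherit the Y label of a random breeding sire,
        # daughters the mtDNA label of a random breeding dam.
        y_labels = [rng.choice(sires) for _ in range(n_males)]
        mt_labels = [rng.choice(dams) for _ in range(n_females)]
    return len(set(y_labels)), len(set(mt_labels))


if __name__ == "__main__":
    y_count, mt_count = simulate_lineages()
    print("surviving Y lineages:", y_count)
    print("surviving mtDNA lineages:", mt_count)
```

By construction the number of surviving Y lineages can never exceed the number of breeding stallions per generation, so paternal diversity is capped far below maternal diversity regardless of where the herd was first managed. That is the sense in which a dominant sire line says little about the place of origin.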

The most convincing interpretation is therefore that horse domestication unfolded in stages across multiple interconnected regions. Early management and riding probably emerged among steppe and neighbouring populations during the 4th millennium BCE or earlier. Later Bronze Age societies intensified selective breeding for transport and chariotry. Iron Age and Mediterranean populations, possibly including Iberian groups, then contributed heavily to the final paternal bottlenecks that produced the elite horse lineages dominating modern breeds. In other words, the later success of Iberian-associated sire lines does not erase the earlier steppe contribution; it represents a later chapter in a much longer domestication process.
 
You oppose the theory of steppe migration, but the combined archaeological and ancient DNA evidence strongly supports large-scale migrations ultimately derived from Pontic–Caspian steppe populations during the Late Neolithic and Early Bronze Age. What it does not support is a simplistic model in which every later steppe-derived population was merely a direct continuation of the classic Yamnaya horizon.

The Yamnaya cultural horizon (~3300–2600 BCE) was heavily associated with the R1b-Z2103 paternal lineage, yet the major later expansions into different parts of Europe were carried by populations that had already diverged genetically and culturally from the earliest sampled Yamnaya groups. Corded Ware populations were overwhelmingly associated with R1a-M417, while Bell Beaker groups in western and central Europe became dominated by R1b-P312 downstream of R1b-L51. Early sampled Yamnaya males did not carry either R1a-M417 or R1b-P312. This means the later European expansions were not literal demographic continuations of specific sampled Yamnaya clans.

However, this does not invalidate the broader steppe migration model. Genome-wide ancient DNA consistently shows that Corded Ware and Bell Beaker populations derived a very large proportion of their ancestry from populations genetically closest to Yamnaya steppe pastoralists. This is why scholars continue to describe these processes as “Yamnaya-derived” or “steppe-derived” migrations.

After ~3000 BCE, Europe experienced one of the largest ancestry turnovers ever documented in ancient DNA studies. Corded Ware individuals often derive roughly 70–75% of their ancestry from steppe-related populations. Earlier Neolithic farmer-associated Y-chromosome lineages declined sharply while new paternal lineages spread across enormous territories. Archaeology independently supports these demographic shifts through shared burial customs, kurgan traditions, pastoral economies, wagon technology, and horse use. Linguistically, the spread of many Indo-European languages is also broadly consistent with these movements.

The evidence does not suggest that Yamnaya themselves directly conquered all of Europe. Rather, it supports major migrations carried out by populations descended from or closely related to steppe pastoralists ancestral to later Corded Ware and Bell Beaker groups. That distinction matters because Corded Ware likely formed from earlier steppe-derived populations related to, but not identical with, classic Yamnaya. Bell Beaker was also culturally heterogeneous and involved substantial local admixture. Different parts of Europe experienced different demographic processes.

This is also why Y-haplogroups alone cannot define an entire population or migration event. Populations can preserve strong autosomal continuity with earlier steppe ancestors even while dominant paternal lineages shift through founder effects, elite dominance, clan expansion, or breeding bottlenecks. The fact that classic Yamnaya males were not R1a-M417 or R1b-P312 therefore does not disprove the broader steppe migration model. It simply suggests that multiple branches of steppe-derived populations expanded in different directions and that currently sampled Yamnaya groups do not represent the full diversity of the wider steppe world ancestral to later Europeans.

The mainstream archaeogenetic position today is that there were major steppe-derived migrations into Europe during the 3rd millennium BCE with enormous demographic impact, but Corded Ware and Bell Beaker were not simply identical copies of Yamnaya society. The relationship is best understood as descent from Yamnaya-like or steppe-related ancestral populations rather than literal continuity in every genetic, cultural, or paternal lineage detail.
 
You oppose the theory of steppe migration, but the combined archaeological and ancient DNA evidence strongly supports large scale migrations ultimately derived from Pontic–Caspian steppe populations during the Late Neolithic and Early Bronze Age. What it does not support is a simplistic model in which every later steppe derived population was merely a direct continuation of the classic Yamnaya horizon.

The Yamnaya cultural horizon (~3300–2600 BCE) was heavily associated with R1b-Z2103 paternal lineage, yet the major later expansions into different parts of Europe were carried by populations that had already diverged genetically and culturally from the earliest sampled Yamnaya groups. Corded Ware populations were overwhelmingly associated with R1a-M417, while Bell Beaker groups in western and central Europe became dominated by R1b-P312 downstream of R1b-L51. Early sampled Yamnaya males did not carry either R1a-M417 or R1b-P312. This means the later European expansions were not literal demographic continuations of specific sampled Yamnaya clans.

However, this does not invalidate the broader steppe migration model. Genome wide ancient DNA consistently shows that Corded Ware and Bell Beaker populations derived a very large proportion of their ancestry from populations genetically closest to Yamnaya steppe pastoralists. This is why scholars continue to describe these processes as “Yamnaya derived” or “steppe derived” migrations.

After ~3000 BCE, Europe experienced one of the largest ancestry turnovers ever documented in ancient DNA studies. Corded Ware individuals often derive roughly 70–75% of their ancestry from steppe related populations. Earlier Neolithic farmer associated Y-chromosome lineages declined sharply while new paternal lineages spread across enormous territories. Archaeology independently supports these demographic shifts through shared burial customs, kurgan traditions, pastoral economies, wagon technology and horse use. Linguistically, the spread of many Indo-European languages is also broadly consistent with these movements.

The evidence does not suggest that Yamnaya themselves directly conquered all of Europe. Rather, it supports major migrations carried out by populations descended from or closely related to steppe pastoralists ancestral to later Corded Ware and Bell Beaker groups. That distinction matters because Corded Ware likely formed from earlier steppe derived populations related to, but not identical with, classic Yamnaya. Bell Beaker was also culturally heterogeneous and involved substantial local admixture. Different parts of Europe experienced different demographic processes.

This is also why Y-haplogroups alone cannot define an entire population or migration event. Populations can preserve strong autosomal continuity with earlier steppe ancestors even while dominant paternal lineages shift through founder effects, elite dominance, clan expansion, or breeding bottlenecks. The fact that classic Yamnaya males were not R1a-M417 or R1b-P312 therefore does not disprove the broader steppe migration model. It simply suggests that multiple branches of steppe derived populations expanded in different directions and that currently sampled Yamnaya groups do not represent the full diversity of the wider steppe world ancestral to later Europeans.

The mainstream archaeogenetic position today is that there were major steppe-derived migrations into Europe during the 3rd millennium BCE with enormous demographic impact, but Corded Ware and Bell Beaker were not simply identical copies of Yamnaya society. The relationship is best understood as descent from Yamnaya-like or steppe-related ancestral populations rather than literal continuity in every genetic, cultural, or paternal lineage detail.

It’s not that I deny steppe migrations per se — what I’m specifically rejecting are Yamnaya migrations as they are commonly portrayed. The problem is that an oversimplified narrative has spread across the internet, and it’s becoming increasingly inaccurate over time.


A story was built around a very specific set of data: the Yamnaya were 70–90% M269>, the Bell Beakers were 70–90% M269>, and they shared roughly 30% similar autosomal ancestry on average. Back in 2018, it was completely coherent and rational to think they were sister populations. That idea became hugely popular and massively boosted DNA test sales, especially in Northern, Central, and Eastern Europe.

Despite the current North/South sampling bias, southern P312> populations (Portugal, Spain, France, Italy) consistently show older average TMRCAs across most clades than northern populations (England, Germany, the Netherlands, Belgium, Austria, Switzerland, Czechia).

The problem is that no P310>, L151>, or P312> has ever been found in the steppes. The closest thing to that lineage is an L51 sample dated to 2800 BC, while L151> itself dates to around 3200 BC — a 400-year gap.
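For anyone who wants to check the date arithmetic, here is a minimal sketch. It uses only the approximate figures quoted in this post; the constant and function names are just illustrative.

```python
# Approximate dates in years BC, copied from the figures quoted in this post.
L151_AGE_BC = 3200           # estimated age of L151
CLOSEST_STEPPE_L51_BC = 2800  # date of the closest steppe sample (L51)


def years_between(older_bc: int, younger_bc: int) -> int:
    """Years elapsed from an older BC date to a younger BC date."""
    return older_bc - younger_bc


gap = years_between(L151_AGE_BC, CLOSEST_STEPPE_L51_BC)
print(f"L151 is about {gap} years older than the closest steppe sample")
```

Any revised age estimate or newly dated sample can be swapped into the two constants to see how the gap changes.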

If the western branch really came from that region, it must have migrated before the Yamnaya even existed, sometime around or before 3300 BC.

M269 predates 4500 BC, and the first empirical M269>L23>Z2103 sample dates to around 3800 BC near the North Caucasus. There is also a clear phylogenetic split between the Z2103 clades found in Armenia/Turkey and those from the steppes, meaning we cannot even say with certainty whether M269 originated in the steppes and moved into the Caucasus, or originated in the Caucasus and later spread into the steppes.

So in reality, the most precise statement we can make is that this haplogroup originated somewhere around the Black Sea region — not specifically in the steppes, nor culturally within Yamnaya.

Eight years later, the general summary no longer fits any of its original pillars, and yet the narrative continues rolling forward with the same inertia. The Yamnaya were not responsible for the origin of the Indo-European languages, of horse domestication, or of bronze metallurgy.

Many people in Northern Europe argue that the earliest L151> samples are found in the Corded Ware Culture of Bohemia, citing Papac et al., but those samples date to around 2800 BC. Some already carry the U106 mutation, and personally I’d say all of them probably do. They are not the direct ancestors of P312*, because they do not share the same STR profiles, and P312* had already existed for at least 200 years by then, even if no equally old aDNA samples have yet been found.

The North can plausibly defend an origin for L51>L151>U106, but not for P312*. In Bell Beaker culture, the elites were overwhelmingly P312>, not U106*. U106 has actually been found very rarely compared to P312 within Bell Beaker contexts.

The archaeological Bell Beaker expansion follows this sequence:
2900 BC Portugal → 2700 BC Spain → 2600 BC France/Italy → 2500 BC Rhine region → 2400 BC British Isles/Hungary.

The oldest P312* samples currently known date to around 2600 BC. We can move radiocarbon dates slightly one way or another, but both major branches are effectively equally ancient because both diverged from P312>ZZ11 roughly 200 years earlier.

~2600 BC Central Europe: P312>ZZ11>U152>L2
~2600 BC Iberia: P312>ZZ11>DF27>ZZ12_1

Their shared male ancestor would be ZZ11 about 200 years earlier. They barely even share IBD segments with each other, so you cannot claim that either branch is older than the other — they are parallel lineages.

They are part of the origin, not the origin itself.

These are the two most successful early Bell Beaker branches, each with more than 30 consolidated immediate lineages.

The founder effect of P312>L21>DF13 is largely irrelevant because it only truly dominated the British Isles and is 100–200 years younger than its older sibling branches. DF13 could indeed be native to the Rhine region or French Brittany.

Are the oldest U152> branches currently found in the North?
No — and they barely reach 5–10% frequencies there.

The oldest U152* branches are found in Italy:
L2, Z36, Z56, PF6585.
Frequencies of at least 15% across the country, peaking above 50% in some northern regions.

Are the oldest DF27> branches currently found in the North?
No — and they also barely reach 5–10%.

The oldest branches are in the Iberian Peninsula:
Z195, ZZ12_1, Z198, Z274, FTT1, FTT3.
Frequencies of at least 40% in almost every region, peaking around 70% among Basques.

It is therefore unlikely that P312* originated in the North, since the clades mentioned above now account for over 70% of all P312> lineages.

Bell Beaker culture emerged in Portugal around 2900 BC.

It is very unlikely that significantly older P312* aDNA samples from 2900–2600 BC will suddenly appear elsewhere, because during that period these lineages were still emerging. Even after Akbari added 15,000 new samples, nothing older than the existing ~2600 BC samples has been found.

So the overall picture is unlikely to change for many years, and the oldest documented P312* samples will probably continue to be EHU002 from Burgos and the U152>L2 samples from Central Europe.

To explain the modern genetic landscape of Iberia, there is no need for Yamnaya invasions, Central European Bronze Age invasions, or Celtic Iron Age invasions.

DF27 is statistically the elder brother of U152 and the oldest clade among all P312>, and this simply cannot be explained through a northern origin model.


Returning to horses: a specimen carrying haplogroup P0* (the only documented example of that lineage outside Iberia) was found near the Nordic region and dated to 3000 BC. In the opposite direction, no horses originating from the North have been found in Iberia.

Then there’s the case of D* haplogroup horses in southwestern Iberia, which diverged from steppe horses around 4000–3500 BC.

What I pointed out in the summary is that humans domesticated horses through multiple independent processes. But selecting animals to create a bulldog (for meat) is not the same as breeding racing greyhounds.

The three major branches of haplogroup D> diverged long before the Sintashta spread their specific lines around 2000 BC. Yet modern horses do not descend directly from those Sintashta lines when examined in detail. So they may have contributed to domestication, but they were neither the beginning nor the end of the process.

The fact that all three branches eventually became 100% DOM2 autosomally through selective breeding cannot be random. It strongly suggests that at least three distinct populations were continuously involved in horse domestication from the beginning.

The problem with that article is that it tries to redefine domestication by mixing bulldogs with racing greyhounds, while ignoring the fact that there are actually two bottlenecks, not one: the first between 4000 and 3000 BC, and the second around 500 BC.

The steppe horses of 2000 BC are something intermediate, but they do not explain the entirety of horse domestication, nor are they the starting point. To justify that claim, researchers relied on distinctive genetic markers linked to anxiety and spinal traits, but those lineages were not ultimately the ones that won out. In other words, the horse type we are truly looking for did not really exist before 1000–500 BC.

In Iberia we have P0*, D*, DB*, DA*, DA1*, and DAC* — many distinct lineages — yet ultimately only one became dominant. That process took at least 3,000 years for a single line to become globally dominant.

Three thousand years leaves plenty of room for change, and the sampling bias is enormous. The next paper could radically alter the picture with just 20 additional samples. But what can no longer change is the fact that western-origin DAC lineages eventually replaced the eastern ones.

I think it will take several more years before we truly understand which human haplogroups inhabited which regions during each period, and what kinds of horses each group possessed.

M269> appears in every scenario involving horse expansion across different periods, but Z2103> does not.

M269 is not an ordinary haplogroup. It is the macro-haplogroup with the greatest reproductive success in human history within just 6,000 years, potentially having more than 350 million male descendants alive today. That is why it is the most documented and studied haplogroup.

M269>PF7562 and M269>L23>L51>PF7589 also do not appear to be of steppe origin. So far they seem clearly Balkan/Anatolian in origin as well (around the Black Sea region).

The R haplogroups are not solely responsible for Indo-European languages — J2 was also involved. As more data emerges, the evidence increasingly points toward an Anatolian/Caucasian origin rather than a purely steppe one.

So where did P310>L151 really come from?

Probably somewhere in the Balkans/Black Sea region between 3500–3000 BC.

I’m not proposing a radically different origin — only a temporal shift roughly 500 years earlier than currently estimated. But that changes the expansion process dramatically. We would no longer be talking about a massive migration involving many unrelated clans, but rather a founder effect of P312> already established within Atlantic/Mediterranean Europe.

The geographic origin itself does not change, and this also explains the arrival of CHG ancestry in southern Bell Beaker zones. EHG ancestry was already present at around 10–15% among Chalcolithic Iberian I2 groups, and those I2 lineages neither disappeared nor migrated elsewhere (in fact, I’m looking at one right now). P312> lineages simply achieved greater reproductive success after the Bell Beaker period, and that mixture is precisely why modern Iberians remain among the closest populations to the ancient WHG cluster.

The lineages that truly disappeared after the Neolithic were R1b-V88 and H. But the French paper by Buri seems to suggest this had more to do with climate and disease than with the arrival of P312> or any conquest-related massacre.

It is essential to focus strictly on the development of Y-chromosome lineages in both humans and animals, because only the Y chromosome clearly traces the distinct path of each group.

For example:

If CWC populations were “Yamnaya” simply because they cluster autosomally, does that mean the Etruscans were Iberians?

We all know the common factor in both cases is P312>ZZ11, but U152 and DF27 diverged between 2800–2600 BC. What makes them autosomally close is first the Cardial culture and second the Bell Beaker culture.

The Italics do not descend from Iberians, nor Iberians from Italics, yet you can model a modern Balearic individual as 50% Etruscan and an ancient Etruscan as 50% modern Catalan.

The only reason we know they are not recently related is because of the Y chromosome. Without it, people would be speculating wildly with anachronistic interpretations.

There are many people using G25 estimates without understanding how qpAdm actually works, and that is a serious problem.

An even bigger problem is that some people present qpAdm models with p-values of 0.9 in aDNA contexts, while the Z-scores of the sources often do not even exceed 3. Reaching 0.9 is usually a sign of overfitting, because the fit is unrealistically precise.
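To make that rule of thumb concrete, here is a rough sketch of a screening step applied to numbers copied by hand from a model's output. It does not call qpAdm itself. The 0.9 and 3 cut-offs come from this post; the 0.05 lower bound is the conventional rejection threshold and is my assumption, and the source names and figures in the example are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    weight: float   # estimated ancestry proportion
    std_err: float  # standard error of that estimate

    @property
    def z(self) -> float:
        # Z-score of the source: estimate divided by its standard error.
        return self.weight / self.std_err


def screen_model(p_value, sources, p_low=0.05, p_high=0.9, z_min=3.0):
    """Return a list of warnings for one model (empty list = nothing flagged)."""
    warnings = []
    if p_value < p_low:
        warnings.append("model rejected: p-value below threshold")
    if p_value >= p_high:
        warnings.append("p-value suspiciously high: possible overfit")
    for s in sources:
        if abs(s.z) < z_min:
            warnings.append(f"{s.name}: |Z| = {abs(s.z):.2f} is below {z_min}")
    return warnings


# Hypothetical example values, not taken from any real study.
example_sources = [
    Source("SourceA", weight=0.62, std_err=0.25),
    Source("SourceB", weight=0.38, std_err=0.25),
]
example_warnings = screen_model(0.93, example_sources)
for w in example_warnings:
    print(w)
```

A model that passes this screen is not thereby correct; the check only catches the two warning signs described above.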

In the end, they are only partially “reading” what they are analyzing, and in most cases doing so anachronistically.

Modeling a single person with ancient samples is relatively easy because of chance: sometimes there simply happen to be ancient populations genetically close to you, and the results look excellent.
The real challenge is explaining entire populations.

If the person doing the modeling does not understand the deep phylogenetic and historical background of each population source, they are basically fooling themselves until statistics eventually show them what they wanted to see from the beginning.

The correct order of importance for investigating the origin of male groups is:
  1. Y chromosome SNPs
  2. STR profiles to differentiate unclassified or extinct SNPs
  3. IBD relationships
  4. Autosomal modeling with qpAdm, sample by sample
Ordinary people prioritize G25 autosomal modeling instead.

There are many studies with terribly flawed models that are still used because, even if they are not fully accurate, nothing better can currently be achieved due to the lack of intermediate samples. They are still necessary to analyze the available data.

The real problem comes when some people extrapolate that the more “steppe” ancestry you have, the more “Yamnaya” you are — and suddenly they imagine themselves riding a horse with an irresistible urge to conquer Western Europe.

Between P310>L151>P312 there are 500–1000 years that remain unknown, and researchers have tried to fill that gap with the M269>Z2103>KMS67 lineages from the steppes — but the theory simply does not fit together the way they expected.
 