I got a copy of the article, but it is very disappointing. All I wanted was a frequency table, ideally broken down by region, but there isn't even one for the total of the 1200 samples. Fortunately there is a list of individual sample numbers sorted by haplogroup, so I can calculate the frequencies myself. As if that wasn't inconvenient enough, they didn't bother to mention the SNPs next to the subclade names. I'll have to check the Supplementary Materials.
| Haplogroup | Number of samples | Percentage |
|---|---|---|
| A1b1b2b | 6 | 0.5% |
| E1a1 | 6 | 0.5% |
| E1b1b1a1 | 32 | 2.6% |
| E1b1b1b1 | 70 | 5.8% |
| E1b1b1b2 | 24 | 2% |
| F3 | 7 | 0.6% |
| G2a2b | 40 | 3.3% |
| G2a3 | 91 | 7.6% |
| I1a3a2 | 2 | 0.16% |
| I2a1a | 465 | 38.75% |
| I2a1b | 2 | 0.16% |
| I2a2a | 10 | 0.8% |
| I2c | 10 | 0.8% |
| J1c | 63 | 5.25% |
| J2a | 74 | 6.1% |
| J2b | 23 | 2% |
| L | 7 | 0.6% |
| T | 27 | 2.25% |
| Q1a3c | 1 | 0.08% |
| R1a1a1 | 15 | 1.25% |
| R1b1a2 | 185 | 15.4% |
| R1b1c | 29 | 2.4% |
| R2a1 | 10 | 0.8% |
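For anyone who wants to redo the arithmetic, here is a minimal Python sketch of how the percentages can be recomputed from the counts listed above. The names `counts`, `frequency` and `combined_frequency` are my own, purely for illustration, and grouping subclades by name prefix is a crude assumption that only works for this particular list. One thing to note: the counts as listed add up to 1199, while the percentages match a denominator of 1200 (465/1200 = 38.75%), so the denominator is left as a parameter.

```python
# Sample counts per haplogroup, copied from the list above.
counts = {
    "A1b1b2b": 6, "E1a1": 6, "E1b1b1a1": 32, "E1b1b1b1": 70, "E1b1b1b2": 24,
    "F3": 7, "G2a2b": 40, "G2a3": 91, "I1a3a2": 2, "I2a1a": 465, "I2a1b": 2,
    "I2a2a": 10, "I2c": 10, "J1c": 63, "J2a": 74, "J2b": 23, "L": 7, "T": 27,
    "Q1a3c": 1, "R1a1a1": 15, "R1b1a2": 185, "R1b1c": 29, "R2a1": 10,
}


def frequency(count, denominator=1200):
    """Return `count` as a percentage of `denominator`."""
    return 100.0 * count / denominator


def combined_frequency(prefix, denominator=1200):
    """Percentage of all listed subclades whose name starts with `prefix`.

    Assumption: matching on the name prefix is enough to group subclades
    (e.g. "G2a" covers G2a2b and G2a3 in this list).
    """
    matching = sum(n for name, n in counts.items() if name.startswith(prefix))
    return 100.0 * matching / denominator


# Print one line per haplogroup: name, count, percentage.
for haplogroup, n in counts.items():
    print(f"{haplogroup:10s} {n:4d} {frequency(n):6.2f}%")

# Combined figures for broader haplogroups.
for prefix in ("G2a", "E1b", "J1", "J2"):
    print(f"{prefix:10s} {combined_frequency(prefix):6.2f}%")
```

Passing `denominator=sum(counts.values())` instead of the default gives the percentages relative to the 1199 samples actually listed.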
Too bad those figures are useless for my purposes. They don't mention the subclades of R1a1a or R1b1a2. There is the subclade of I1, but it was already available in the free chart. Actually, even after checking the supplementary materials I am still not sure which SNP defines their I1a3a2, which isn't listed by ISOGG. It might be a new subclade of I1a3a (L1237).
---
UPDATE: I checked the SNP list in the supplementary data and found a dozen R1b-U152. The SNPs matching known subclades are L20 under L2, Z56, and Z144 under Z56, so these could be the Roman ones I was looking for. Of course it's always possible that all Italic and Alpine Celtic people possessed all subclades of U152, in which case my quest is a dead end. Anyway, most of the U152 in Sardinia is U152*.
Among the non-U152 R1b1a2 there are four L23, a few DF27 (Z195, S230 and Z216), one L21 (DF1/L513), one L11, and one U106 (Z381 > Z301 > L47 > Z9). The last one is surely the Vandal R1b I was looking for. The L11 could also be Scandinavian.
Two subclades of I2a2a (M223) were identified: L701>L699 and L1228.
The R1a1a individuals have the following subclades: Z93>Z94 (Phoenician), Z280 (Balto-Slavic?), Z282 (European), M458>L1029 (Central European?). No typically Germanic subclade. I wonder how the Slavic-looking R1a got there. The Goths? There are also a couple of I2a1b (M423) samples to support that.
---
I am also surprised at the large discrepancy between this study of Sardinians and the total of the previous studies I compiled, which had nearly 1100 samples. Previous studies had 13% of G2a against 10.9% here, 8.5% of E1b against 11% here, 2.5% of J1 against 5.2% here, 9.5% of J2 against 8% here, 1% of L+T against 2.8% here, 1% of Q against 0.08% here, 1% of I1 against 0.16% here... That's quite a lot of differences for two huge studies of a relatively sparsely populated island (1.6 million inhabitants).
If the frequencies can vary by 2 or 3% for most haplogroups between two big studies, how much confidence can we have in regional data with fewer than 500 samples, let alone fewer than 100? As far as Italy is concerned, only Sardinia, Sicily and Trentino-South Tyrol have over 500 samples if we combine all studies to date. Not even Tuscany! Yet Italy is one of the best studied countries in the world for Y-DNA. That makes me wonder about the accuracy of the present data for every country. There could be huge changes to come, on the order of 10 to 20% for a single haplogroup in poorly sampled countries.
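To put a rough number on that, here is a minimal sketch of the sampling margin of error for a haplogroup frequency, using the textbook normal approximation for a proportion. It assumes simple random sampling, which real studies rarely achieve, so it probably understates the real uncertainty. The function name `margin_of_error` and the 10% example frequency are just illustrative choices of mine.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation (Wald) interval for a proportion.

    p: observed frequency as a fraction (0.10 for 10%)
    n: number of samples
    z: critical value, 1.96 for a 95% interval
    Assumption: samples are drawn independently and at random.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


# Compare the uncertainty on a 10% haplogroup at different sample sizes.
for n in (100, 500, 1200):
    moe = margin_of_error(0.10, n)
    print(f"n = {n:4d}: 10% +/- {100 * moe:.1f} points")
```

Under those assumptions, a haplogroup observed at 10% comes with roughly ±6 points at 100 samples, ±2.6 points at 500 and ±1.7 points at 1200, so differences of 2 or 3 points between two studies of about a thousand samples each are close to what sampling noise alone can produce.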