Wednesday, September 24, 2014

DNA Mysteries: Iberian R1b-V88 in Africa

   When I first heard about R1b in Africa, my immediate assumption was that the predominantly Celtic haplogroup must have been a recent transplant.  I ran some of the V88 haplotypes against the big databases (FTDNA & ySearch) expecting to see matches to European men within the African colonial timeframe.  It wasn’t that easy.  Common ancestor analysis put the R1b Africans (V88) thousands of years removed from the rest of their European R1b cousins.  Where did they come from?  How did they get there?


   I started with the given that the R1b defining mutations (SNPs) occurred in the Iberian Peninsula.  The jury is still out on this hypothesis.  There have been scientific papers for and against Iberian origins of R1b.  My own work (Iberian Origins of R1b) supports an origin prior to the Neolithic expansion.  Could V88 have made a straight-line migration from Iberia to the Lake Chad region of Africa?  Could V88 have crossed the Straits of Gibraltar, travelled across the Sahara, which 7,000 years ago was a savannah well populated with animals for hunting, and arrived at Lake Mega-Chad?  That was my early premise.  I was wrong.

   The distribution of V88 is much larger than any of the scientific papers would indicate.  While I agree with the work that’s been done correlating the spread of V88 with the spread of Chadic languages (Cruciani et al 2010), the Chadic population is only a subset.  Nobody takes into consideration the V88 populations in Europe and the Middle East.  If they do, it is a sideways glance to say we’re ignoring them because they don’t fit into what we are trying to prove.  If you don’t look at the entire picture, your conclusions will be skewed.

   I wanted the largest selection of V88 Y-DNA records with at least 37 markers tested.  I started with Family Tree DNA projects that had the records SNP tested.  Those haplotypes were run against the ySearch database to identify highly related records with no SNP testing.  The initial gathering of records picked up individuals with SNP M73.  These were removed.  The key differentiator between V88 and M73 was DYS464a&b.  V88 was typically 12,12 and M73 was 15,15.  Thirty-seven or more STR markers are helpful in identifying additional related haplotypes and even more necessary in determining the relationship between records.  Most studies only look at SNPs or a small handful of STR markers.  This is shortsighted.  Imagine a reference population of 100 records all with the same SNP.  Without enough STR markers you can’t tell whether you are looking at one haplotype with minor 1 or 2 step variations or 100 unique haplotypes.  That’s the difference between a founder event starting with as few as one individual or a group with greater diversity and age.
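To make the STR point concrete, here is a minimal sketch of the two checks described above: the DYS464a&b split between V88 and M73, and a simple step count between two haplotypes.  The function names, the record layout and the sample values are assumptions for illustration only, and the step count is a common simple measure rather than the exact method used for the study.

```python
# Hypothetical layout: each record maps a marker name to its repeat count.

def classify_by_dys464(record):
    """Rough V88 / M73 split using DYS464a and DYS464b (12,12 vs 15,15)."""
    pair = (record.get("DYS464a"), record.get("DYS464b"))
    if pair == (12, 12):
        return "V88-like"
    if pair == (15, 15):
        return "M73-like"
    return "unresolved"


def step_distance(a, b):
    """Sum of absolute repeat-count differences over the markers both records share."""
    shared = set(a) & set(b)
    return sum(abs(a[m] - b[m]) for m in shared)


# Illustrative values only, not real records.
record_1 = {"DYS393": 13, "DYS464a": 12, "DYS464b": 12}
record_2 = {"DYS393": 12, "DYS464a": 15, "DYS464b": 15}

print(classify_by_dys464(record_1), classify_by_dys464(record_2))
print(step_distance(record_1, record_2))
```

With only a handful of markers, many records collapse to a step count of zero, which is why 37 or more markers matter when separating one haplotype from a hundred.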

   My final set of 119 records meets three criteria: each record has at least 37 STR markers, is either V88 SNP tested or highly related via STR, and has the geographic location of the most distant known ancestor.  The records are processed through PHYLIP to generate a phylogenetic tree.  The phylogenetic tree gives a visual depiction of the relationships in the dataset and an approximate number of years back to common ancestors, represented as the nodes between the records.


All of this is very standard genetic genealogy.  I add a twist (Biogeographical Multilateration) by converting the years back to a common ancestor to a distance using Cavalli-Sforza’s migration rate of 1 to 1.2 km per year.  This is enough for me to solve a series of cascading equations giving me the locations of the common ancestors.  Looking back at the phylogenetic tree shows us how all the nodes and locations are connected, essentially the flow of migration.
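The years-to-distance conversion is simple arithmetic.  A minimal sketch, assuming only the 1 to 1.2 km per year range quoted above (the function name and the sample input are hypothetical):

```python
def migration_distance_km(years, rate_low=1.0, rate_high=1.2):
    """Convert years back to a common ancestor into a (min, max) distance in km."""
    return years * rate_low, years * rate_high


# Illustrative input: a node 1,000 years back from a known location.
low_km, high_km = migration_distance_km(1000)
print(f"{low_km:.0f} to {high_km:.0f} km")
```

Each node in the tree gets a distance band like this from every record below it, and those bands are what the cascading equations work with.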


   The out of Iberia event took place about 7,700 ± 1,600 years ago.  TMRCA calculations have been shown to be very inconsistent.  Some folks use a constant mutation rate and some use rates per marker.  I include a TMRCA to give a relative chronology.  While the majority of R1b is known for its Western Atlantic migrations, V88 took a path along the Mediterranean coast and down the Adriatic.  While none of the V88 records indicated Crete as an ancestral location, it appears multiple times as a common ancestor location.  The data shows Crete as a stepping-stone in the Mediterranean as V88 migrated to the Nile River Valley.  The back to Africa event(s) occurred roughly 5,500 ± 1,000 years ago.


The majority of the Chadic records (Cameroon, Chad and Nigeria) have relatively close genetic connections to individuals in the Middle East (mainly Saudi Arabia).  The Chadic and Middle Eastern records tie back to common ancestors along the upper Nile.  There is a significant lack of information to understand what impact R1b-V88 had on the Nile Valley cultures.  Considering that there was only 1 out of 119 records with an exact Nile River location, I would venture a guess that V88 didn’t integrate well.

   While the V88 back to Africa migration has captured much attention, the data shows a more fascinating event.  There was a V88 re-migration back to Europe from Africa.   The back to Europe event took place about 3,200 ± 1,000 years ago.  Again, Crete played a role as a stepping-stone as V88 entered the Eastern Adriatic region and spread into Central and Eastern Europe.  Someone will probably notice that many of the V88 in Eastern Europe are Jewish and that the date for leaving the Nile region is close to the time of Exodus.  There is nothing in any of the data to indicate that this was the Jewish Exodus from Egypt.  The V88 group in Eastern Europe is closely related and there is phylogenetic evidence to support that this may have been a founder event with a single male or small group of closely related males.  There is no evidence to support that those founders were Jewish when they left Africa.


   By looking at the big picture, including all the data and letting the data illustrate the patterns, we can unravel what appears to be the mysterious appearance of R1b in Central Africa.  Along the way, we can uncover a previously unknown re-migration from Africa to Europe.  Too often haplogroup data is treated as discrete buckets of information living in a vacuum with no interaction to other haplogroups and no internal relationships.  Every DNA record is connected to every other record in a network.  Each haplotype is a vector with location and direction.  The sooner we treat genetic records as a network analysis, the sooner we will solve more DNA mysteries.

Out of Iberia and back to Africa.  Followed by a return to Europe.

Reference:

Maglio, MR (2014)  Y Chromosome Haplogroup R1b-V88: Biogeographical Evidence for an Iberian Origin (Link)


Tuesday, August 12, 2014

Iberian R1b Y-DNA: First Movers in Europe

   The disputed origins of haplogroup R1b, most commonly thought of as Celtic, remain split between Iberia prior to the end of the last ice age and various West Asian locations after the ice age.  A new view on the R1b homeland comes out every year.  With all we know about DNA, shouldn’t we be coming to a consensus?  Typically, I refer to R1b as Celtic to help an audience make the connection between lettered haplogroups and culture or ethnicity.  I also add the caveat that Celtic is a misleading label.  R1b is a supergroup of cultures including Iberian, Gallic, Celtic, Germanic and Scandinavian.  To attribute empires or nationalities to R1b would be foolish, as R1b is tens of thousands of years older than any known empire.

   Perhaps I’m naïve.  I like simple, logical answers.  The earliest publications on R1b described their ancestor R1 entering Europe from central Asia during a warm period about 30,000 – 40,000 years ago.  The last ice age forced R1 to split and take refuge south in Iberia and the Balkans.  Time and separation gave us the mutations R1b in Iberia and R1a in the Balkans.  That split is roughly what we see today in those regions.  That’s clean and simple.  The real world is much more complex.  R1b and R1a were not alone in Europe.  Their interactions with the other major European haplogroups (E, G, I, J and N) have to be taken into consideration.  We can’t analyze R1b as if it were in a vacuum.

   Let’s take y-DNA haplogroups out of the picture for a moment.  We know that modern humans survived and flourished in the Iberian refuge during the end of the last ice age, based on mitochondrial DNA studies.  [Could someone please run some y-DNA tests on those samples?]  The tribes in western Europe, whoever they were, had a 1,000 to 2,500 year head start over the tribes in central and eastern Europe on repopulating the continent.  The ice sheets melted and retreated earlier on the west coast than in the rest of Europe.  This gave the inhabitants of the Iberian refuge an advantage – a “first-mover” advantage gained by being the first to move north.  These first-movers gained a land-monopoly.  A tribe with a first-mover advantage and a head start of over 1,000 years should have been hard to displace from western Europe.  In other anthropological situations, those original inhabitants are forced into niche locations by invading populations, but very rarely are displaced completely.  What we see on the west coast of Europe is a very strong R1b presence and no niche haplogroups of a significant age.  From this point of view, either R1b is the original Iberian inhabitant or R1b completely decimated another earlier haplogroup that had a 1,000 year geographical head start.  I like simple.  R1b was in Iberia first.

   Let’s throw some data at the problem.  The R1b haplogroup population is enormous.  The majority fall into SNPs R-P312 (Celto-Iberian) and R-U106 (Celto-Germanic).  There is so much information there that it tends to be noise.  If you want to get to the root of R1b (R-M343), you need to work with the branches that are closest to the root - R-L278*, R-V88, R-M73*, R-YSC0000072/PF6426 and R-L23.

• • R1b   M343
• • • R1b1   L278
• • • • R1b1a   P297
• • • • • R1b1a1   M73
• • • • • R1b1a2   M269
• • • • • • R1b1a2a   L23
• • • • R1b1c   V88
[• • • • • • • • • R1b1a2a1a1   U106 - too far downstream]
[• • • • • • • • • R1b1a2a1a2   P312 - too far downstream]

   I collected 250 records that matched these SNPs or were genetically close by STR haplotype.  These records were mapped based on user-reported most distant ancestor location.


   This is not a connect the dot exercise.  Just because two or more records appear geographically close doesn’t mean that they are genetically close.  These 250 records have to be treated like a network.  If this were Facebook, these folks would be randomly associated through family, business, school or neighbor connections.  These are y-DNA records.  There is a relationship between every pair.  Each pair has a different common ancestor, with a different number of generations to get back to that ancestor.  Here is an example of what that relationship looks like across multiple pairs.  The number represents years back to a common ancestor (TMRCA).
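To illustrate the idea that every pair of records has its own common ancestor, here is a rough sketch that builds a table of pairwise TMRCA estimates.  The mutation rate, the generation length and the sample haplotypes are placeholder assumptions, and the estimator is a deliberately crude one (step differences divided by twice the expected mutations per generation), not the calculation behind the figures in this post.

```python
from itertools import combinations

MUTATION_RATE = 0.002      # assumed average rate per marker per generation (placeholder)
YEARS_PER_GENERATION = 30  # assumed generation length (placeholder)


def step_distance(a, b):
    """Sum of absolute repeat-count differences, markers in the same order."""
    return sum(abs(x - y) for x, y in zip(a, b))


def crude_tmrca_years(a, b):
    """Very rough TMRCA: steps / (2 * rate * markers), converted to years."""
    generations = step_distance(a, b) / (2 * MUTATION_RATE * len(a))
    return generations * YEARS_PER_GENERATION


def pairwise_tmrca(records):
    """Return {(name_1, name_2): years} for every pair of records."""
    return {
        (p, q): crude_tmrca_years(records[p], records[q])
        for p, q in combinations(sorted(records), 2)
    }


# Illustrative four-marker haplotypes only; real work needs 37 or more markers.
sample = {"A": [13, 24, 14, 11], "B": [13, 25, 14, 11], "C": [13, 24, 14, 11]}
for pair, years in pairwise_tmrca(sample).items():
    print(pair, round(years))
```

The output is a network, not a list: 250 records produce 31,125 pairs, each with its own estimate.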


When all of the interrelations are taken into consideration, the group of records can be displayed as a relationship tree of who is older or younger and who is more closely related to whom (phylogenetic tree).


   Now we have who, where, when and how the records are connected.  At this point it does become a connect the dots exercise.  I’ve used a biogeographical analysis to connect very specific sets of dots based on the calculated interrelation of the entire group.


   The R1b genetic family tree has a trunk and many branches.   The trunk of the R1b data is firmly rooted in Iberia.  The main core of the tree stretches along the western Atlantic coast of Europe and branches across Europe and even back into Asia.  The results that I found support the work of the earliest pioneers in the field and conflict with the latest publications.
 


   Every analysis has its limitations.  The work that I’ve done looks back at the R1b family about 8,000 years.  The scarcity of data only allowed me to predict the origin of R-L278, which is currently one branch below the main root of R-M343.  I can’t yet tell where R1b was between the time that R1 split into R1b and R1a and the start of that 8,000 year window.

   In my analysis, I have included R-V88.  They are a curious group of R1b found in Africa and the Middle East.  I will be treating R-V88 in a separate write-up to do justice to a very interesting back migration story.

Reference:

Maglio, MR (2014)  Biogeographical Evidence for the Iberian Origins of R1b-L278 via Haplotype Aggregation (Link)

Thursday, June 26, 2014

Your Autosomal DNA Tapestry

Deep Into DNA*

   What does a tapestry have in common with your autosomal DNA?  A tapestry is a colorful and complex weaving that tells a story.  Your autosomal DNA is a complex weaving of 3 billion base pairs inherited from your ancestors.  Autosomal DNA can tell multiple stories about ethnicity, health and relationships.  As you will see, your DNA can be quite colorful.

Bayeux Tapestry (Source: Wikimedia Commons)
   Every year new tools become available to help us understand our genetic patterns and learn about the stories written in our genes.  There are stories of health issues, both good and bad.  There are stories of our cousin connections.  There is diverse color in our ethnic background.  My autosomal tapestry hangs proudly on the wall.

...continued at The In-Depth Genealogist with a free subscription.


*The Deep Into DNA article series is published each month in the new Going In-Depth
digital genealogy magazine presented by The In-Depth Genealogist.

Friday, May 16, 2014

DNA, SNP, STR, OMG!

Deep Into DNA*

   Oh my gosh, there are many acronyms in genetic genealogy.  You have to agree that using the acronym DNA is better than writing deoxyribonucleic acid repeatedly.  Although, when we talk about using DNA for genealogy and we only use acronyms, they start to lose their meaning and become just another ‘thing’.  “Hey, I’ve got a SNP.  Do you have a SNP?”  “I dunno, let me check.”  



   Maybe I’m weird.  I like to understand what all the acronyms mean and how they play a part in the larger picture.

...continued at The In-Depth Genealogist with a free subscription.


*The Deep Into DNA article series is published each month in the new Going In-Depth
digital genealogy magazine presented by The In-Depth Genealogist.

Thursday, May 1, 2014

TribeMapper Contest Winners

Congratulations to all our winners!


The winners are:

  • Michael Durkin
  • George Heubach
  • Sylvia Jackson
  • Paul Smith
  • Jennifer Zinck
Stay tuned as we unravel their history over the next weeks.

Thank you to everyone who entered.  

The TribeMapper Report is now on sale until June 1, 2014.  Details are on the OriginsDNA website.


Where did you come from?

Wednesday, April 30, 2014

Last Day for Entries: TribeMapper Report Give-Away

As part of the DNA Day celebration, we are giving away five (5) TribeMapper Reports.

Tonight, at midnight EST, the contest will be closed.  Tomorrow, May 1st, I will announce the winners.

TribeMapper for the House of Normandy
Haplogroup R-L11*
Haplogroup I-L22 Flow into British Isles
Haplogroup G-Z725

For more details on the content of the report see our website.

Contest Terms & Conditions:

You must have completed at least a 37 marker Y-DNA (paternal line) test.  The results of your Report can be used for research, as the basis for an article or for the promotion of OriginsDNA.com.  Your supplied DNA results will not be disclosed, sold or otherwise transferred.

To enter the contest, please send an email to TribeMapper@OriginsDNA.com.  In the email, provide the full name of the Y-DNA donor, haplogroup (if known) and your Y-DNA marker results.

Good Luck!

Tuesday, April 29, 2014

Exploring Rollo's Roots: DNA Leads the Way


   It’s been nearly a year since I wrote about William the Conqueror’s DNA.  Based on a study of men with surnames historically associated with William and their corresponding Y-DNA, I concluded that I had identified the genetic signature of the first Norman King of England.  Now it’s time to get back to William and more specifically his 3rd great grandfather, Rollo.  To be honest, the 37 marker Y-DNA haplotype that I published is really connected to Richard the Fearless, William’s great grandfather.  Genealogically, the surnames in the study trace back to Richard.  As long as there was no hanky-panky, William the Conqueror has the same Y-DNA as Richard.  What that also means is that Richard has the same Y-DNA as his grandfather, Rollo.

   Based on the work done in my previous paper, the following haplotype is that of William the Conqueror (and Richard the Fearless):


DYS393  DYS390  DYS19  DYS391  DYS385a  DYS385b  DYS426  DYS388  DYS439  DYS389i  DYS392  DYS389ii
13      24      14     11      11       14       12      12      12      13       13      29

DYS458  DYS459a  DYS459b  DYS455  DYS454  DYS447  DYS437  DYS448  DYS449  DYS464a  DYS464b  DYS464c  DYS464d
17      9        10       11      11      25      15      19      29      15       15       17       17

DYS460  Y-GATA-H4  YCAIIa  YCAIIb  DYS456  DYS607  DYS576  DYS570  CDYa  CDYb  DYS442  DYS438
11      11         19      23      15      15      17      17      36    37    12      12


   There is an assumption, inherent in genetic genealogy, that there weren’t any non-paternal events between the generations that separate Rollo and William and that this haplotype is that of Rollo as well.  One of the goals for this Rollo study is to get more accurate with his haplotype by narrowing the dataset to only those records with 67 markers.  The second goal is to determine Rollo’s haplogroup R SNP.  The best I was able to determine for William was R-P312, which is a fairly high level SNP.  My third goal is to determine Rollo’s origin using my TribeMapper analysis.  Whether Rollo is Danish or Norwegian has been disputed for hundreds of years.

   I picked up where I left off with William.  There were 152 Y-DNA records that made it into the William the Conqueror Modal Haplotype (WCMH).  For each of these records a 67 marker test result and SNP testing result were added to the analysis, where the data was available.  I threw out any record that didn’t have enough data and retained the ones that grouped into a single SNP of R-DF13 (just downstream of R-L21).  Based on these final 25 records, I have identified the 67 marker Rollo Norman Modal Haplotype (RNMH) as follows:

DYS393  DYS390  DYS19  DYS391  DYS385a  DYS385b  DYS426  DYS388  DYS439  DYS389i  DYS392  DYS389ii
13      24      14     11      11       14       12      12      12      13       13      29

DYS458  DYS459a  DYS459b  DYS455  DYS454  DYS447  DYS437  DYS448  DYS449  DYS464a  DYS464b  DYS464c  DYS464d
17      9        10       11      11      25      15      19      29      15       15       17       17

DYS460  Y-GATA-H4  YCAIIa  YCAIIb  DYS456  DYS607  DYS576  DYS570  CDYa  CDYb  DYS442  DYS438
11      11         19      23      15      15      17      17      36    37    12      12

DYS531  DYS578  DYF395S1a  DYF395S1b  DYS590  DYS537  DYS641  DYS472  DYF406S1  DYS511  DYS425  DYS413a  DYS413b
11      9       15         16         8       10      10      8       10        10      12      23       23

DYS557  DYS594  DYS436  DYS490  DYS534  DYS450  DYS444  DYS481  DYS520  DYS446  DYS617  DYS568
16      10      12      12      16      8       12      22      20      13      12      11

DYS487  DYS572  DYS640  DYS492  DYS565
13      11      11      12      12

Based on this modal haplotype and the associated SNP, a broader collection of genetic cousin records was identified to be used with my new TribeMapper analysis (Biogeographical Multilateration).
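In its simplest form, a modal haplotype is the most common value at each marker across a set of records.  A minimal sketch of that idea follows; the function name, the record layout and the sample values are assumptions for illustration, and the published RNMH may have involved additional judgment beyond a straight mode.

```python
from collections import Counter


def modal_haplotype(records, markers):
    """Return {marker: most common repeat count} across the given records.

    Ties go to the value seen first, which is how Counter.most_common orders
    equal counts; a real study would want to flag ties for review.
    """
    modal = {}
    for marker in markers:
        values = [r[marker] for r in records if marker in r]
        if values:
            modal[marker] = Counter(values).most_common(1)[0][0]
    return modal


# Illustrative records only; a real run would use all 67 markers.
sample_records = [
    {"DYS393": 13, "DYS390": 24, "DYS19": 14},
    {"DYS393": 13, "DYS390": 25, "DYS19": 14},
    {"DYS393": 13, "DYS390": 24, "DYS19": 15},
]
print(modal_haplotype(sample_records, ["DYS393", "DYS390", "DYS19"]))
```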




   This map shows the geographic distribution of Rollo’s cousins.  The large number of points along the coast of Normandy is a good sign.  If the majority of points were in Eastern Europe, I would have to revisit my whole hypothesis about William the Conqueror.  It is best not to try to interpret any relationships until we look at them through the lens of a phylogenetic tree.



   The TribeMapper analysis takes into consideration the mapped location, the tree node connections and the time between common ancestors.  The time is converted to distance based on the demic diffusion migration rate.  The distance is plotted to ‘triangulate’ the geographic location of each common ancestor.  This is a process called multilateration.
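For readers who want to see what multilateration means in practice, here is a toy sketch.  It is not the TribeMapper implementation.  It assumes flat x/y coordinates in km rather than latitude and longitude, and it uses a simple compass search to find the point whose distances to the known locations best match the distances derived from the tree.  All names and sample numbers are hypothetical.

```python
import math


def multilaterate(anchors, start=None, step=100.0, tol=1e-6):
    """Estimate an unknown (x, y) in km from [((x, y), distance_km), ...].

    Minimizes the sum of squared differences between actual and target
    distances by stepping along the axes and halving the step when stuck.
    """
    def cost(p):
        return sum(
            (math.hypot(p[0] - ax, p[1] - ay) - d) ** 2
            for (ax, ay), d in anchors
        )

    if start is None:
        # Begin at the centroid of the known locations.
        n = len(anchors)
        start = (
            sum(a[0][0] for a in anchors) / n,
            sum(a[0][1] for a in anchors) / n,
        )

    best, best_cost = start, cost(start)
    while step > tol:
        improved = False
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            candidate = (best[0] + dx, best[1] + dy)
            c = cost(candidate)
            if c < best_cost:
                best, best_cost, improved = candidate, c, True
        if not improved:
            step /= 2
    return best


# Three known locations with distances to one unknown common ancestor.
known = [
    ((0.0, 0.0), 50.0),
    ((100.0, 0.0), math.hypot(70.0, 40.0)),
    ((0.0, 100.0), math.hypot(30.0, 60.0)),
]
x, y = multilaterate(known)
print(round(x, 2), round(y, 2))
```

Real distances carry the uncertainty of the TMRCA and the migration rate, so in practice the answer is a region rather than a single point.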

   The earliest documented origins for Rollo come from Dudo of Saint-Quentin in 1015 and William of Jumièges in 1060.  Both ‘histories’ were commissioned by the House of Normandy and attribute a Danish origin to Rollo.  Commissioned biographies can border on mythology.   The Norwegian Orkneyinga Saga, from the 13th century, gives Rollo a Norwegian origin. 

   I’ve run the analysis with Rollo’s record as an unknown location.  TribeMapper allows us to back into the location for any unknown point.  What we get is a highly constrained location for Rollo’s ancestor, in the middle of Denmark.  The data then shows that Rollo may have lived within 226 km of that paternal ancestor.  The red circle illustrates the range for Rollo.  This covers the majority of Denmark.  The data also shows that Rollo’s ancestors, going back at least 12 generations were also in Denmark.



   We can give the Norwegians some credit also.  The ancestors of Rollo’s ancestors were Norwegian, with an origin on the west coast of Norway.  Rollo’s ancestors were responsible for multiple branches of migration into Europe.  This includes a back migration into Norway that then went on to invade Scotland.



   This was accomplished with a small sample of 65 records for simplification.  Much larger data sets could determine the genetic flow in a greater geographic and chronologic view.  Additional records within the same SNP grouping could result in a more accurate origin for Rollo.  Records that are genetically upstream from the SNP and STR group used may identify the nomadic migrations prior to the Western Norway settlement.


   I’ve run this simulation multiple times, getting the same results.  I’m comfortable calling Rollo – “The Dane”.

Reference:

Maglio, MR (2014) Biogeographical Origins and Y-chromosome Signature for the House of Normandy  (Link)