Wednesday, September 24, 2014

DNA Mysteries: Iberian R1b-V88 in Africa

   When I first heard about R1b in Africa, my immediate assumption was that the predominantly Celtic haplogroup must have been a recent transplant.  I ran some of the V88 haplotypes against the big databases (FTDNA & ySearch) expecting to see matches to European men within the African colonial timeframe.  It wasn’t that easy.  Common ancestor analysis put the R1b Africans (V88) thousands of years removed from the rest of their European R1b cousins.  Where did they come from?  How did they get there?

   I started with the given that the R1b defining mutations (SNPs) occurred in the Iberian Peninsula.  The jury is still out on this hypothesis.  There have been scientific papers for and against Iberian origins of R1b.  My own work (Iberian Origins of R1b) supports an origin prior to the Neolithic expansion.  Could V88 have made a straight-line migration from Iberia to the Lake Chad region of Africa?  Could V88 have crossed the Straits of Gibraltar, travelled across the Sahara, which 7,000 years ago was a savannah well populated with animals for hunting, and arrived at Lake Mega-Chad?  That was my early premise.  I was wrong.

   The distribution of V88 is much larger than any of the scientific papers would indicate.  While I agree with the work that’s been done correlating the spread of V88 with the spread of Chadic languages (Cruciani et al 2010), the Chadic population is only a subset.  Nobody takes into consideration the V88 populations in Europe and the Middle East.  If they do, it is a sideways glance to say we’re ignoring them because they don’t fit into what we are trying to prove.  If you don’t look at the entire picture, your conclusions will be skewed.

   I wanted the largest selection of V88 Y-DNA records with at least 37 markers tested.  I started with Family Tree DNA projects that had the records SNP tested.  Those haplotypes were run against the ySearch database to identify highly related records with no SNP testing.  The initial gathering of records picked up individuals with SNP M73.  These were removed.  The key differentiator between V88 and M73 was DYS464a&b.  V88 was typically 12,12 and M73 was 15,15.  Thirty-seven or more STR markers are helpful in identifying additional related haplotypes and even more necessary in determining the relationship between records.  Most studies only look at SNPs or a small handful of STR markers.  This is shortsighted.  Imagine a reference population of 100 records all with the same SNP.  Without enough STR markers you can’t tell whether you are looking at one haplotype with minor one- or two-step variations or 100 unique haplotypes.  That’s the difference between a founder event starting with as few as one individual and a group with greater diversity and age.
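
   To make the STR point concrete, here is a minimal Python sketch.  It is an illustration only, not the procedure used for this dataset.  The marker values in the sample records are hypothetical, the function names are made up for this example, and the simple stepwise distance is an assumption (multi-copy markers such as DYS464 are usually handled with more care).

```python
from typing import Dict, List

Haplotype = Dict[str, int]  # marker name -> repeat count


def classify_by_dys464(h: Haplotype) -> str:
    """Rough screen based on the DYS464a&b values described in the text."""
    ab = (h.get("DYS464a"), h.get("DYS464b"))
    if ab == (12, 12):
        return "V88-like"
    if ab == (15, 15):
        return "M73-like"
    return "undetermined"


def stepwise_distance(a: Haplotype, b: Haplotype) -> int:
    """Sum of absolute repeat differences over markers both records share."""
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


def count_unique(haplotypes: List[Haplotype]) -> int:
    """Number of distinct haplotypes, comparing every marker value."""
    return len({tuple(sorted(h.items())) for h in haplotypes})


# Hypothetical records, shortened to four markers for readability.
rec_a = {"DYS464a": 12, "DYS464b": 12, "DYS393": 13, "DYS390": 24}
rec_b = {"DYS464a": 12, "DYS464b": 12, "DYS393": 13, "DYS390": 25}
rec_c = {"DYS464a": 15, "DYS464b": 15, "DYS393": 13, "DYS390": 23}

if __name__ == "__main__":
    for name, rec in [("a", rec_a), ("b", rec_b), ("c", rec_c)]:
        print(name, classify_by_dys464(rec))
    print("a vs b:", stepwise_distance(rec_a, rec_b))
    print("unique:", count_unique([rec_a, rec_b, rec_a]))
```

   With only a few markers, many records collapse into the same haplotype.  With 37 or more, the pairwise distances start to separate a tight founder cluster from an older, more diverse group.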

   Each of the 119 records in my final set has at least 37 STR markers, is either SNP tested for V88 or highly related via STR, and has the geographic location of the most distant known ancestor.  The records are processed through PHYLIP to generate a phylogenetic tree.  The phylogenetic tree gives a visual depiction of the relationships in the dataset and an approximate number of years back to common ancestors, represented as the nodes between the records.
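
   As a sketch of what feeding records to PHYLIP can look like, the Python below writes a square distance matrix in the layout PHYLIP's distance-based programs read: a first line with the number of records, then one row per record with a name padded to 10 characters.  Which PHYLIP program and input type were used for this dataset is not stated here, so treat the distance-matrix route, the function name and the sample values as assumptions.

```python
from typing import List


def phylip_distance_matrix(names: List[str], dist: List[List[float]]) -> str:
    """Format a square distance matrix in PHYLIP's plain-text layout."""
    if len(names) != len(dist) or any(len(row) != len(names) for row in dist):
        raise ValueError("matrix must be square and match the number of names")
    lines = [f"{len(names):5d}"]
    for name, row in zip(names, dist):
        label = name[:10].ljust(10)  # PHYLIP names occupy exactly 10 characters
        values = " ".join(f"{d:.4f}" for d in row)
        lines.append(f"{label} {values}")
    return "\n".join(lines) + "\n"


# Hypothetical three-record example.
sample_names = ["rec1", "rec2", "rec3"]
sample_dist = [
    [0.0, 1.0, 7.0],
    [1.0, 0.0, 8.0],
    [7.0, 8.0, 0.0],
]

if __name__ == "__main__":
    print(phylip_distance_matrix(sample_names, sample_dist), end="")
```

   The resulting text can be saved as an input file for a tree-building step.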

All of this is very standard genetic genealogy.  I add a twist (Biogeographical Multilateration) by converting the years back to a common ancestor to a distance using Cavalli-Sforza’s migration rate of 1 to 1.2 km per year.  This is enough for me to solve a series of cascading equations giving me the locations of the common ancestors.  Looking back at the phylogenetic tree shows us how all the nodes and locations are connected, essentially the flow of migration.
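
   The years-to-distance conversion is simple arithmetic, and a short Python sketch shows the idea.  The 1 to 1.2 km per year range comes from the text above.  Everything else is an assumption made for illustration: the function names, the use of great-circle (haversine) distance, and treating the upper figure as a maximum reach.  This is not the set of cascading equations used in the analysis.

```python
import math
from typing import Tuple


def years_to_distance_km(years: float,
                         rate_low: float = 1.0,
                         rate_high: float = 1.2) -> Tuple[float, float]:
    """Convert years back to a common ancestor into a distance range in km."""
    return years * rate_low, years * rate_high


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, Earth radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def within_reach(candidate: Tuple[float, float],
                 record_location: Tuple[float, float],
                 years: float) -> bool:
    """True if the candidate ancestor location lies within the maximum
    distance implied by the years back to the common ancestor."""
    _, max_km = years_to_distance_km(years)
    return haversine_km(*candidate, *record_location) <= max_km


if __name__ == "__main__":
    low, high = years_to_distance_km(1000)
    print(f"1000 years -> roughly {low:.0f} to {high:.0f} km")
```

   A candidate location for a common ancestor can then be checked against every record that descends from it.  Locations that satisfy all of the distance constraints at once are the ones worth keeping.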

   The out of Iberia event took place about 7,700 ± 1,600 years ago.  TMRCA calculations have been shown to be very inconsistent.  Some folks use a constant mutation rate and some use rates per marker.  I include a TMRCA to give a relative chronology.  While the majority of R1b is known for its Western Atlantic migrations, V88 took a path along the Mediterranean coast and down the Adriatic.  While none of the V88 records indicated Crete as an ancestral location, it appears multiple times as a common ancestor location.  The data shows Crete as a stepping-stone in the Mediterranean as V88 migrated to the Nile River Valley.  The back to Africa event(s) occurred roughly 5,500 ± 1,000 years ago.

The majority of the Chadic records (Cameroon, Chad and Nigeria) have relatively close genetic connections to individuals in the Middle East (mainly Saudi Arabia).  The Chadic and Middle Eastern records tie back to common ancestors along the upper Nile.  There is a significant lack of information to understand what impact R1b-V88 had on the Nile Valley cultures.  Considering that there was only 1 out of 119 records with an exact Nile River location, I would venture a guess that V88 didn’t integrate well.

   While the V88 back to Africa migration has captured much attention, the data shows a more fascinating event.  There was a V88 re-migration back to Europe from Africa.   The back to Europe event took place about 3,200 ± 1,000 years ago.  Again, Crete played a role as a stepping-stone as V88 entered the Eastern Adriatic region and spread into Central and Eastern Europe.  Someone will probably notice that many of the V88 in Eastern Europe are Jewish and that the date for leaving the Nile region is close to the time of Exodus.  There is nothing in any of the data to indicate that this was the Jewish Exodus from Egypt.  The V88 group in Eastern Europe is closely related and there is phylogenetic evidence to support that this may have been a founder event with a single male or small group of closely related males.  There is no evidence to support that those founders were Jewish when they left Africa.

   By looking at the big picture, including all the data and letting the data illustrate the patterns, we can unravel what appears to be the mysterious appearance of R1b in Central Africa.  Along the way, we can uncover a previously unknown re-migration from Africa to Europe.  Too often haplogroup data is treated as discrete buckets of information living in a vacuum, with no interaction with other haplogroups and no internal relationships.  Every DNA record is connected to every other record in a network.  Each haplotype is a vector with location and direction.  The sooner we treat genetic records as a network, the sooner we will solve more DNA mysteries.

Out of Iberia and back to Africa.  Followed by a return to Europe.


Maglio, MR (2014).  Y Chromosome Haplogroup R1b-V88: Biogeographical Evidence for an Iberian Origin.