Family Tree DNA has updated their time to most recent common ancestor (TMRCA) age estimates with improved algorithms, resulting in more accurate estimates for customers to better understand their family history and potentially make new connections with previously unknown relatives.
If you are serious about genealogy research using Y-DNA, you might have heard about the new FamilyTreeDNA Discover™ tool. You might have also heard about the first beta release of the TMRCA (Time to Most Recent Common Ancestor) estimates for Big Y. These age estimates are important for genealogists because they put a time perspective on the Tree of Humankind.
Big Y-700 is a powerful tool for genealogy because it connects all human genetic males into a large family tree. If you belong to the same branch of the tree as someone else, you share an ancestor on both your direct paternal lines. That often makes it easy to identify the most recent common ancestor and it can support or disprove genealogies.
But sooner or later, you will reach a point where historical records are no longer available. Or, maybe you are just curious to know how far back you can trace your lineage. This is where the age estimates come in and can point you in the right direction!
Feedback about Age Estimates
Discover was released just two months ago, and we have been inundated with positive feedback from users. Some have posted pictures of their ancestors’ tombstones with birth years closely matching their TMRCA estimate. But others have commented that some estimates differ from what they expected based on genealogical or archaeological (ancient DNA) evidence.
We took that feedback and went back to work to improve the TMRCA estimates. Thanks to Dr. Paul Maier and the R&D team, we present the second beta release of the TMRCA estimates. This update addresses, in particular, feedback about some estimates being younger than expected. We are happy to announce that the first major update to the TMRCA algorithm is now available in FamilyTreeDNA Discover!
A New Way to Tackle Tree Paradoxes
This update introduces a new way to tackle tree paradoxes. These paradoxes occur when some of the tree stems do not add up to the same length. This can result in time inconsistencies, like a child haplogroup estimated to be born before its parent haplogroup.
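To make the idea of a tree paradox concrete, here is a minimal sketch of how one might detect it. The haplogroup names and birth years are made up for illustration, and this is not the pipeline used in Discover; it only flags any child haplogroup whose estimated birth year falls before that of its parent.

```python
# Hypothetical toy tree: haplogroup -> (parent, estimated birth year).
# Negative years are BCE. All names and numbers are invented for illustration.
estimates = {
    "A": (None, -1000),
    "B": ("A", -400),
    "C": ("B", -650),  # paradox: estimated to be born before its parent B
    "D": ("B", 200),
}


def find_paradoxes(estimates):
    """Return (child, parent) pairs where the child is dated earlier than the parent."""
    paradoxes = []
    for child, (parent, child_year) in estimates.items():
        if parent is None:
            continue  # the root has no parent to compare against
        parent_year = estimates[parent][1]
        if child_year < parent_year:
            paradoxes.append((child, parent))
    return paradoxes


print(find_paradoxes(estimates))
```

In this toy tree only the pair C and B is reported, because C is dated 250 years before the haplogroup it descends from.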
Strict Clock
Most models for age estimates that we see in the genetic genealogy world are based on a Strict Clock assumption. These models assume that every lineage in the tree accumulated mutations (SNPs) at roughly the same rate. When this is not the case, the resulting inconsistencies can be left unresolved or adjusted using various statistical models.
But the strict clock assumption is not always appropriate for the human Y chromosome. Looking at the Y-DNA Tree of Humankind from both the macro and micro scales, we can observe differences in stem lengths. From there, we see that some lineages have a much longer list of mutations than others. If you have used the Block Tree, you may have seen these differences in stem lengths (“SNP height”) for yourself.
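A small numerical sketch shows why unequal stem lengths are a problem under a strict clock. The mutation rate and SNP counts below are hypothetical values chosen only for illustration, not figures used by FamilyTreeDNA.

```python
# Hypothetical rate: one SNP per 80 years on every lineage (strict clock assumption).
YEARS_PER_SNP = 80

# Made-up SNP counts on two stems that descend from the same ancestor.
stems = {"lineage_1": 8, "lineage_2": 14}

# Under a strict clock, each stem implies its own age for the shared ancestor.
implied_ages = {name: count * YEARS_PER_SNP for name, count in stems.items()}

# The two lineages disagree about how long ago the same man lived.
spread = max(implied_ages.values()) - min(implied_ages.values())

# A naive compromise is to average the stems; it hides the disagreement
# rather than explaining it.
mean_age = sum(implied_ages.values()) / len(implied_ages)

print(implied_ages, spread, mean_age)
```

With these numbers the same ancestor is dated to 640 years ago by one lineage and 1,120 years ago by the other, a gap of 480 years.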
Relaxed Clock
Differences in the rate of accumulated mutations across a tree are well known in phylogenetics. These outliers can often be attributed to rapid changes in population size or environmental factors. This is where Relaxed Clock models come in. Relaxed Clock models are aware of the possibility of rate differences in the tree. They resolve inconsistencies by comparing stems along the whole tree and adjusting those that do not fit.
With that being said, we added a relaxed clock step to our TMRCA pipeline. We’ve seen great improvements for both our modern DNA (known genealogies) and aDNA (carbon dating) validation sets. The new beta estimates are already live on the Discover™ site for you to see for yourself!
Examples of Age Estimates
You may still be a little curious how this works. We have provided two examples of changes from the first TMRCA beta release. One is a famous historical example based on genealogical information. The other is based on ancient DNA and radiocarbon dating.
Please note that the TMRCA estimates are based purely on genetic data and self-reported birth years of present-day FamilyTreeDNA customers. They have not been adjusted or calibrated to fit with any other data.
Famous Historical Genealogy Example from Scotland
Sir John Stewart (of Bonkyll) was a Scottish knight and son of Alexander, the 4th High Steward of Scotland. Sir John’s exact birth year is not known, but it has been estimated to be about 1246. He died on July 22, 1298, while serving as a military commander together with Sir William Wallace at the Battle of Falkirk.
We know from genealogy, and from DNA testing of his immediate relatives, that he is the most recent common ancestor (MRCA) of Y-DNA haplogroup R-S781. The old historical date and genealogical precision make S781 a great test case for the TMRCA algorithm. It turns out that a rate shift on the tree makes his TMRCA harder to estimate. Let’s take a look.
Here is the previous Haplogroup Story for R-S781 (accessed August 30, 2022):
“Haplogroup R-S781 represents a man who is estimated to have been born around 550 years ago, plus or minus 150 years. That corresponds to about 1500 CE with a 95% probability he was born between 1331 and 1578 CE.”
Now wait a minute! Sir John died in 1298. But this paragraph suggests he was born at least 33 years later, and more likely not until 1500. Let’s see how this changed with the update.
“Haplogroup R-S781 represents a man who is estimated to have been born around 800 years ago, plus or minus 200 years. That corresponds to about 1250 CE with a 95% probability he was born between 1038 and 1393 CE.”
Following the update, the TMRCA estimate is now well centered around the expected birth year of Sir John Stewart.
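The comparison can be written out as a small check, using only the numbers quoted in the two Haplogroup Stories and the estimated birth year of about 1246. The function name `describe_fit` is a hypothetical helper for this illustration, not part of any FamilyTreeDNA tool.

```python
def describe_fit(known_year, point_estimate, ci_low, ci_high):
    """Return (is the known year inside the 95% range?, point estimate minus known year)."""
    inside = ci_low <= known_year <= ci_high
    offset = point_estimate - known_year
    return inside, offset


SIR_JOHN_BIRTH = 1246  # estimated birth year from the genealogy

# First beta: about 1500 CE, 95% range 1331 to 1578 CE.
before = describe_fit(SIR_JOHN_BIRTH, 1500, 1331, 1578)

# Second beta: about 1250 CE, 95% range 1038 to 1393 CE.
after = describe_fit(SIR_JOHN_BIRTH, 1250, 1038, 1393)

print(before)  # (False, 254)
print(after)   # (True, 4)
```

The first estimate missed the expected birth year entirely and was centered 254 years too late; the updated estimate contains it and is centered within 4 years of it.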
An Ancient DNA Example from the Corded Ware Culture in Bohemia
Another interesting example is the ancient DNA sample PNL001 (Plotiště nad Labem 1). The sample is from a man associated with the Corded Ware culture. He died at age 25-30 in present-day Bohemia, Czech Republic. His remains were DNA tested and directly carbon dated to between 2914 and 2879 BCE (about 2900 BCE) with 95% confidence (Papac et al., 2021).
Genetic analysis shows that he belongs to haplogroup R-U106, so his date suggests a minimum age for the haplogroup TMRCA. If PNL001 is a descendant of U106, then U106 must have lived before PNL001.
Here is the previous Haplogroup Story for R-U106 (accessed August 30, 2022):
“Haplogroup R-U106 represents a man who is estimated to have been born around 4,500 years ago, plus or minus 600 years. That corresponds to about 2400 BCE with a 95% probability he was born between 3044 and 1880 BCE.”
The PNL001 carbon date of ca 2900 BCE is within the broad range TMRCA estimate for R-U106. But the “most likely” estimate was about 500 years younger than the oldest value expected from the carbon dating. The estimate is not well centered. Let’s see how this changed with the update.
“Haplogroup R-U106 represents a man who is estimated to have been born around 4,950 years ago, plus or minus 700 years. That corresponds to about 2900 BCE with a 95% probability he was born between 3619 and 2297 BCE.”
We can’t know for sure if the PNL001 radiocarbon date is exactly right or when the U106 man truly was born. But with the new TMRCA algorithm, the new estimate is now better centered to align with external evidence.
The Future Of FamilyTreeDNA Age Estimates
We are very excited to share our updated (Beta 2) release of our Big Y age estimates!
You can help us improve the estimates by:
- Specifying birth years on your Big Y kits
- Documenting your patrilineal genealogy in your family tree with accurate names and birth years
- Linking Y-DNA matches with whom you share a known most recent common ancestor
This information is used for our validations and to calibrate the tree. Our age estimates will continue to change as new customers test with Big Y-700 and we improve the algorithm.
Very Interesting. Thank-you for keeping us up to date with the blog posts! Keep up the good work.
STR-based dendrograms and similar age estimations could be a helpful, and almost independent, tool that could be correlated with the SNP-based estimation.
Descendants of Hugh de Berkeley, born 1223-1227 AD, through three sons, occupy a section of the Y Haplotree under I-FT409425. Third son Walter’s descendants include the Towie Barclays, associated with Towie Barclay Castle in Aberdeenshire, Scotland, and including the Barclay de Tolly family, a member of which was Prince Michael Andreas Barclay de Tolly 1761 – 1818, an Imperial Russian Field Marshal and Minister of War during Napoleon’s invasion in 1812 and also the Governor-General of Finland.
The Block Tree currently does not sit too well with the STR-derived equivalent. Two of the three Towie Haplogroups are I-FTB41127, which includes Towie Clan chief Peter Barclay, Michael Barclay de Tolly, and Alexander Barclay; and downstream of it, I-FTB41093, comprising Nicholas Browne, Kenneth Rose and James V Barkley. Edward Rose’s upgrade result is in the pipeline.
I am unconvinced that all variants assigned to Hugh’s descendants on the Block tree have in fact been actually determined for those men, including the Kilbirnie and Perceton Barclays from two other sons; and Barclay Project administrator Tim Barclay is unconvinced by this Towie grouping, questioning “the reliability of these Block Trees when comparing these results to the problems I see in certain sections of the Barclay Block Tree versus the STR results. I am really hoping Edward Rose’s results may help clarify things for the Towie block when they arrive because I simply cannot reconcile the current Block Tree with the historical record, but the STR results do match the records much better. I have begun to wonder if the Block Tree loses its reliability when there is simply one tester available for each of the more recent branches (with MRCA in the last five hundred years)? ” One example from Tim: “something is very wrong with the Towie Block Tree – Alexander has been placed in the same group as Michael and Peter despite not sharing the DYS389 variations they have, as opposed to Ken who does have these STR variations but is not in the same group.”
The relative genetic distances among these men don’t align too well with the Block Tree groupings either, nor with the number of non-matched variants in the Big Y match listings.
Tim noticed a few days ago that “Ed Rose’s 111 results are up on the Barclay page and show he and Ken sharing DYS635-21 otherwise unique in the Towie group, so I think it is safe to assign this as a value descending to both from their common ancestor. Ed also doesn’t have the DYS576-18 value that Ken and Michael Barclay de Tolly share so it seems this must have arisen in Michael and Ken’s lineage separately.”
What happens to the Block Tree will be of considerable interest, but whatever the result, if there are algorithm-derived flaws in the Block Tree, STR offers a powerful way of detecting them.
In the Barclay case, there is a significant wealth of traditional genealogical history available for some of the testers and ancestral lines. Finding a way of including solid chronological mileposts involving MRCPAs would improve your age estimations also, though they will rarely be available.
Here is a note from the researchers in response:
Currently there are 6 men who are I-FTB41127. Of these, 3 can be grouped together via SNPs. Ken, James and Nicholas have 3 mutations that Alexander, Peter and Michael lack. Those are FTB41093, FTB49978 and an INDEL that has not been placed on the tree (8034876 AG to T). This is unambiguous, concrete evidence that Ken, James and Nicholas share a unique common ancestor that Alexander, Peter and Michael do not.
STR marker alleles can change up and down and mutate much faster than SNPs. SNPs are much more stable in that they are binary: once a SNP mutation has happened, it almost never mutates back to the original state. Even one back mutation would be extremely unlikely, and three are virtually impossible.
It is important to note that the three men currently placed at I-FTB41127 (FTB41093-) are NOT a unique group more closely related to each other than they are to the FTB41093+ men. If they were, then there would be a branch placed on the tree grouping them and excluding the FTB41093+ men. It is entirely possible that Alexander is more closely related to the FTB41093 men than he is to either Peter or Michael; this would not be in conflict with the Block Tree.
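The logic of the note can be sketched with sets. The marker assignments below follow the description above; the label `indel_8034876` is a placeholder for the unplaced INDEL, and the helper `exclusive_markers` is hypothetical, written only for this illustration.

```python
# Derived (positive) markers per tester, limited to those named in the note.
# "indel_8034876" is a placeholder label for the INDEL not yet placed on the tree.
derived = {
    "Ken": {"FTB41127", "FTB41093", "FTB49978", "indel_8034876"},
    "James": {"FTB41127", "FTB41093", "FTB49978", "indel_8034876"},
    "Nicholas": {"FTB41127", "FTB41093", "FTB49978", "indel_8034876"},
    "Alexander": {"FTB41127"},
    "Peter": {"FTB41127"},
    "Michael": {"FTB41127"},
}


def exclusive_markers(group, derived):
    """Markers shared by everyone in the group and by nobody outside it."""
    group = set(group)
    shared = set.intersection(*(derived[name] for name in group))
    others = set().union(*(derived[name] for name in derived if name not in group))
    return shared - others


# Three markers define Ken, James and Nicholas as a branch of their own.
print(sorted(exclusive_markers(["Ken", "James", "Nicholas"], derived)))

# Nothing is exclusive to the remaining three, so they do not form a branch.
print(sorted(exclusive_markers(["Alexander", "Peter", "Michael"], derived)))
```

The second result is empty, which is why Alexander, Peter and Michael sit together at I-FTB41127 without that placement saying anything about how closely they are related to one another.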
Perhaps the new Time Tree can help clarify the meaning of the tree structure. It shows how all 6 men share a paternal line ancestor who lived sometime around year 1370 and 3 of the men further share a slightly more recent common ancestor. You can find the Time Tree option on the Discover report and we will publish another blog article about it soon.
It is good to learn that linking is possible for Y and mt matches. I built a tree with male descendants of a common ancestor from the 1600s, and linked a 6th cousin 1x removed and a 7th cousin 1x removed. The tree management tab now shows the Y matches as linked with an unknown relationship:
“Unknown relationships fall outside linking thresholds, and we are unable to determine an exact relationship.”
Hopefully these data still assist the algorithm.
Based on the STRs of fellow testers I think there must be a more recent Haplotype for me than the current one allocated to me of R-YP1390. Through historical records, I’ve traced my direct patrilineal ancestor back to circa 1575 in Buskerud, Norway. Can anyone help me on this? I’m hoping to get a TMRCA within this 447 year timespan. It’s currently about 800 years.
Hi Al!
Unfortunately there is no closer match to you at this time, but as more people test there will hopefully be a closer match.
One option is to reach out to your Y-STR matches and encourage them to upgrade to Big Y.
I have an extensive tree here, uploaded from a GEDCOM, and 4 YDNA matches (8th and 9th cousins, one 9th cousin a Big Y-700 match, the 8th cousin a Y-67 match, and the other two 9th cousins being Y-37 matches, the non-Big Y matches not having tested further. Another match, Big Y-500, I do not know the exact relationship as there is a brick wall on his paper trail.) with whom I know the relationships, but when I try to navigate back to the common relatives, the tree crashes. Is it worth it for me and FTDNA to make a new, less extensive tree that will be easier to navigate and link these matches?
Hi Richard!
This is a good question. We have people in our customer service department that would be able to help you make that decision. You can reach out to them by visiting https://www.familytreedna.com/contact
I (kit #785129) am trying to understand. I have a 7th cousin that the Big Y-700 says we share an ancestor (R-FT395530) born around 1450 CE. It should be around 1700 CE. Now, the range it shows is born between 1095 and 1724 CE. The 1724 CE is closest to where it should be but the median is 1450 CE which is way off. Now my 4th cousin just got his results and hasn’t gotten the manual review yet, but currently is showing no breakup of SNPs, just that we all three show the same MRCA even though he and I share (100% for sure) an MRCA that was born 1786, all documented. Hopefully, after the manual review the 8 SNPs in the R-FT395530 should be broken up and a more recent ancestor for us. I don’t know too much about how this all works, but am trying to get my brain around it. Any thoughts would be appreciated. Thanks
I also have a family group with Y-DNA tests that we believe go back to a MRCA born 1664. The first release of Discover in June, 2022 estimated the year of our MRCA to be 1675 (1497-1803 95% CI), which was highly corroborative and cause for research celebration. However, the new version of the algorithm (October, 2022) has moved the year back to 1442 (1168-1643 95% CI) and caused quite a bit of consternation given other evidence we have that we’d found the right MRCA in the late 1600s. Our theory regarding the identity of our MRCA may be incorrect, but this shift from a corroborative 1675 to 1442 is perplexing.