Genetic Genealogy
6 p.m. Tuesday 23 January 2024 and Tuesday 30 January 2024 (via
Zoom)
WWW version:
Recordings:
23 January (TBA); 30 January (TBA)
Outline
Review of beginners' session
- Without recombination and mutations, all of us would have
identical DNA.
- "DNA" is a generic term covering four different components with
- very different sources;
- very different inheritance paths; and
- very different genealogical uses.
- Sources:
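| | male offspring | female offspring |
| sperm | Y chromosome | X chromosome |
| sperm | 22 paternal autosomal chromosomes | 22 paternal autosomal chromosomes |
| egg | X chromosome | X chromosome |
| egg | 22 maternal autosomal chromosomes | 22 maternal autosomal chromosomes |
| egg | mitochondria | mitochondria |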
| DNA component | Inheritance path | Inherited by |
| Y chromosome | From father only (and only if male) | males only |
| autosomal chromosomes (autosomes) | Equally from both parents | everyone |
| X chromosome(s) | Unequally from both parents | males x1, females x2 |
| mitochondrial DNA | From mother only | everyone |
- Genealogical uses:
- autosomal DNA and Y-DNA are most often used in fishing
expeditions, trying to catch hitherto unknown relatives in
match lists;
- mitochondrial DNA and X-DNA are most often used for more
specialised hypothesis testing:
- the hypothesis that two women with the same maiden surname
from around the same place and around the same time are
sisters can be tested:
- by finding genealogical records identifying their
parents; or
- by finding a living matrilineal descendant of each and
comparing their mitochondrial DNA:
- if the mitochondrial DNA of the DNA subjects does not
match, then the matriarchs cannot be sisters (or there
is an NPE);
- if the mitochondrial DNA of the DNA subjects does
match, then the matriarchs were sisters, or had a
slightly more distant common matrilineal ancestor than
the hypothesised common mother.
- the hypothesis that two men with the same surname from
around the same place and around the same time are brothers
can be tested:
- by finding genealogical records identifying their
parents; or
- by finding a living X-descendant of each (i.e.
descendants for whom the path to the patriarch does not go
through two consecutive male generations) and comparing
their X-DNA:
- if the DNA subjects share significant X-DNA, then it
probably comes from their hypothesised common mother
(unless there are two or more X-relationships);
- if the DNA subjects share no X-DNA, then the X-DNA of
the hypothesised common mother may have been lost to
recombination;
- the chances that the X-DNA of the hypothesised common
mother survived recombination are maximised if the paths
to both DNA subjects alternate between males and females
and the DNA subjects are from the earliest surviving
generation;
- the possibility that the DNA subjects' X chromosomes are
half-identical by chance is eliminated if both are male.
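The X-descendant rule in the bullets above (no two consecutive male generations on the path) can be checked mechanically. The following is a minimal sketch, not part of the original notes; the function name `can_carry_x` and the 'M'/'F' encoding of a path are assumptions made purely for illustration.

```python
def can_carry_x(path):
    """Return True if X-DNA can pass along the whole path.

    `path` lists the sex ('M' or 'F') of each person from the ancestor
    down to the DNA subject, both included. A father gives his sons a
    Y chromosome instead of an X, so any father-to-son step blocks X-DNA.
    """
    if any(sex not in ("M", "F") for sex in path):
        raise ValueError("path may contain only 'M' and 'F'")
    return all(not (parent == "M" and child == "M")
               for parent, child in zip(path, path[1:]))


# Hypothesised common mother -> son (the patriarch) -> his daughter -> her son
print(can_carry_x(["F", "M", "F", "M"]))  # True
# Mother -> son -> grandson: the father-to-son step blocks the X
print(can_carry_x(["F", "M", "M"]))       # False
```

A path that passes this check only makes shared X-DNA possible; as noted above, the relevant X-DNA may still have been lost to recombination.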
The rest of this course will concentrate on the more widely used
autosomal DNA and Y-DNA.
- Autosomal DNA is the cheapest component to analyse and has
rapidly become the most widely used in genealogy:
- we measure similarity of autosomal DNA;
- matches with larger
shared centiMorgans (c.8cM-c.3600cM) are (on average)
more closely related;
- cousin matching using autosomal DNA will identify
relationships out to third cousin on all sides of your family,
and may identify more distant relationships;
- the technology cannot separate the paternal and maternal
autosomes, but since 2022 AncestryDNA has been separating most
matches into "Parent 1" and "Parent 2", without specifying
which is the father and which is the mother;
- identical twins, parent/child relationships and most
full-sibling relationships can be identified unambiguously;
- for more distant relationships, probabilities can be assigned to the
various possibilities;
- this tool shows the distribution of
possible relationships for a given shared centiMorgan value.
- Y-DNA has been widely used for one name studies or surname
projects for much longer, but has also seen rapid recent
scientific advances:
- we measure differences in Y-DNA;
- matches with smaller
genetic distance are (on average) more closely
related;
- cousin matching using Y-DNA will identify relationships with
men of the same surname and same genetic origin, but may
identify surname/DNA switches and/or more distant
relationships, predating the era of surnames (1014+);
- many surnames have multiple genetic origins, for example
occupational surnames (Smith, Miller, Potter, Cooper, etc.).
- The DNA companies will turn your spit or swabs into a data
file, which can be compared with data files from the DNA of
other individuals:
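The statement above that larger shared centiMorgan totals indicate (on average) closer relationships can be illustrated with the textbook expectation that each extra generation of separation halves the shared autosomal DNA. This is a rough sketch under stated assumptions: it takes the c.3600cM upper figure quoted above as the parent/child total, it ignores the wide random variation around the averages, and the constant and function names are invented for illustration.

```python
PARENT_CHILD_CM = 3600  # assumption: the approximate upper figure quoted above


def expected_cousin_cm(degree):
    """Average cM expected between full cousins of the given degree.

    Full n-th cousins share on average (1/2)**(2n + 1) of their autosomal
    DNA, while parent and child share 1/2, so the cousin expectation is
    the parent/child figure multiplied by (1/2)**(2n).
    """
    if degree < 1:
        raise ValueError("degree must be 1 (first cousins) or more")
    return PARENT_CHILD_CM * 0.5 ** (2 * degree)


for n in range(1, 5):
    print(f"cousin degree {n}: about {expected_cousin_cm(n):.1f} cM on average")
```

Real matches scatter widely around these averages, which is why only probabilities can be assigned to the more distant relationships.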
Identity v. Anonymity
The trade-off
- There is a trade-off between:
- increasing your chances of finding long-lost cousins and
ancestors (and being found by long-lost cousins); and
- privatising your family history research and DNA results.
- If you keep your DNA results or known family tree private,
then nobody will be able to find you and you will not be able to
find any DNA matches.
- If you want to be found, then you must let your potential
cousins see your birth surname, your DNA results and your known
deceased ancestors.
- If you give your matches no information, then they cannot
help you.
- Some customers of the DNA companies appear to wish to maintain
a certain degree of privacy and anonymity.
- Others find it paradoxical that those trying to identify their
anonymous ancestors can be so concerned about anonymising their
own identity.
- FamilyTreeDNA.com (for customers who have not opted out) and
GEDmatch.com (for customers who have opted in) explicitly
allow familial comparisons of DNA recovered by law
enforcement from a crime scene or unidentified human
remains:
- to identify the perpetrator of a violent crime against
another individual; or
- to identify human remains.
- Using the MyHeritage DNA Services for law
enforcement purposes ... is currently "strictly prohibited,
unless a court order is obtained".
- AncestryDNA "do not voluntarily
cooperate with law enforcement".
- 23andMe's registration process assures customers that "We will
not provide information to law enforcement or regulatory
authorities unless required by law to comply with a valid court
order, subpoena, or search warrant for genetic or Personal
Information" and links to a Transparency Report.
- Collaborating by sharing match lists with known relatives and
with DNA matches often leads to breakthroughs:
- GEDmatch.com was designed for sharing match lists;
- FTDNA projects are designed for sharing match lists;
- AncestryDNA facilitates sharing match lists;
- even sharing MyHeritage passwords is a breach of terms and
conditions;
- most of the DNA websites have recently promised to force
some additional form of two-factor authentication and/or
password changes on customers, which will inhibit
collaboration:
- the DNA websites have your permission to reveal information
about you to those whom they arbitrarily deem to be your
matches (including false positives), but are panicking
inexplicably about exactly the same information being revealed
to those whom they arbitrarily deem not to be your matches
(including false negatives).
Basic guidelines
The basic guidelines for successful use of the DNA websites include
the following:
- Reveal the DNA subject's birth surname:
- Most people inherit DNA with their birth surname, so identify
yourself as a minimum by your birth surname with an initial or a
title, e.g., P Waldron or Mr Waldron or Miss Durkan.
- Reveal the chromosomal gender of the person who provided the
DNA sample:
- A woman does not have a Y chromosome, so may ask a male
relative with the relevant surname to swab:
- a father, brother, nephew, cousin, etc., if her interest is in
her maiden surname; or
- a husband, son, brother-in-law, father-in-law, etc., if her
interest is in her married surname.
Valuable additional inferences can potentially be drawn once it
is known whether two X chromosomes (female) or one X chromosome
and one Y chromosome (male) are potentially available for
comparison.
You must NOT attach a female name or a female's pedigree chart
to a male DNA sample (or vice
versa), as this causes untold confusion.
Be especially careful not to inadvertently link a male's Y-DNA
results with a female's autosomal DNA upload at
FamilyTreeDNA.com where error-checking does not look for this.
- Avoid providing irrelevant information:
- Your first name, married surname, adopted surname or marital
status reveal nothing about your DNA, so you may keep these
private if you wish.
- Use a photograph:
- If you upload a photograph (or any image) to your AncestryDNA
account before you receive your initial results, then the
photograph (hyperlinked to the match details) will appear on the
AncestryDNA Insights page of all your
matches for as long as you remain in their eight newest matches
with photographs.
- Avoid pseudonyms:
- They reduce the chances that your matches will bother to look
at your family tree, contact you or share the information about
your ancestry that they have and that you do not have.
- Be consistent and avoid unnecessary confusion:
- A real example (further anonymised):
- Ancestry username: tara1234
- AncestryDNA samples from mother and daughter (per email
exchange)
- linked to pedigree charts of an aunt and niece
- appear to matches as M.R. (managed by tara1234) and D.C.
(managed by tara1234)
- neither of these is the real initials
- the daughter is an AncestryDNA match to her mother's
probable 4th cousin, but the mother is not (false negative?
fuzzy boundaries?)
- only one of the two kits is at GEDmatch
- GEDmatch alias and email address both begin with Molly
- Molly is the dog's name
- it took me 300 days after the upload to GEDmatch to
associate the AncestryDNA and GEDmatch identities
- Keep all your DNA-related correspondence in a single
searchable email archive
- Use the internal messaging system and
AncestryDNA/MyHeritage/23andMe/LivingDNA or Facebook messages
only to exchange email addresses.
Managing your matches
- Our objectives are:
- to identify the most recent common ancestor or ancestral
couple shared with each DNA match,
- starting with the closest matches, especially those with
shared surnames and/or shared locations,
- thereby confirming that the DNA match is definitely a
documented cousin or closer relative,
- enabling either or both cousins to learn more about their
shared ancestors, and
- confirming or refuting (NPE) the archival and oral evidence
about each cousin's ancestry.
- In practice, this means
- assigning each DNA match and each DNA segment to the most
distant known ancestor through whom you inherited the shared
DNA;
- converting DNA matches into known relatives;
- using
- the tools provided by the DNA comparison websites;
- the tools provided by third parties; and
- your own genealogy database
in order to manage this process.
- My chromosome map
- You will eventually learn to distinguish between genealogically useful, misattributed and false
DNA matches.
Fishing in all the gene pools
There are a growing number of DNA comparison websites and those
interested in finding long-lost relatives should be in all of
them, especially the largest ones.
Remember Murphy's Law of
Genetic Genealogy, coined while I was helping
an adoptee who is married to a Murphy:
If there are N DNA comparison websites and your DNA is
in N-1 of them, then your most important match will be in the Nth.
In the words of another widely used metaphor, there are many
online gene pools out there and there are many people who are in
only one or two of them; for maximum effect, particularly if you
are trying to find an unknown ancestor who has left no paper
trail, you must fish in all of these pools.
You must spit for the websites which do not allow data uploads:
You must download your data file from the website of whichever
laboratory (or laboratories) you use and upload it to the websites
which do allow data uploads:
- GEDmatch.com
- If you have spat or swabbed for more
than one laboratory and if the laboratories use different
chips (i.e. observe different sets of SNPs), then you must:
- upload the data from all the laboratories to GEDmatch;
- sign up for Tier 1 for at least one month (probably still
USD15/month or USD100/year);
- use the "Combine multiple kits into 1 superkit" option on
the Tier 1 menu to create a "Combined" kit and obtain more
accurate results; and
- use the pencil icon beside the component kits on your home
page to set them to "Private" or "Research" so that they do
not clutter up your match list and those of others.
- Comparing the "Overlap" column in the one-to-many results
for the individual (e.g. T205074) and combined (e.g.
VA864386C1) kits will show how much more accurate the matches are.
- Some half-identical-by-chance and half-identical-by-omission
false matches will disappear (see here).
- FamilyTreeDNA.com
- MyHeritage.com (upload here)
- LivingDNA.com
You must link your DNA match list and your pedigree chart and share them on
the major autosomal DNA comparison websites:
- As of 23 January 2024, I have:
- 22,969 MyHeritage matches (64 "Favorite DNA Matches"=known
relatives)
- 20,430 AncestryDNA matches (206 "Starred matches"=known
relatives)
- 7,296 FTDNA Family Finder matches (36 "Linked Matches"=known
relatives)
- 3,000 GEDmatch matches (fixed)
- 1,507 23andMe matches (initially fixed at 1,500 with the
option to mark matches for retention) (50 "Favorites"=known
relatives)
- 986 LivingDNA matches
There is no easy mechanism for marking known relatives at
LivingDNA or GEDmatch (where Multi Kit Analysis/Tag Group
Selection might be used).
More matches can arise both from larger databases and from less
strict matching criteria.
A higher proportion of known relatives can arise from smaller match
lists, better genealogical tools, more effort by the researcher,
etc.
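The proportion of known relatives in each match list can be worked out from the counts quoted above. This is only an arithmetic sketch using those figures (sites with no count of marked known relatives are left out); the variable and function names are invented for illustration.

```python
# (total matches, matches marked as known relatives), as of 23 January 2024
counts = {
    "MyHeritage": (22969, 64),
    "AncestryDNA": (20430, 206),
    "FTDNA Family Finder": (7296, 36),
    "23andMe": (1507, 50),
}


def known_relative_share(total, known):
    """Percentage of a match list made up of known relatives."""
    return 100.0 * known / total


shares = {site: known_relative_share(*pair) for site, pair in counts.items()}
for site, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{site}: {pct:.2f}% known relatives")
```

On these figures the shortest of the four lists (23andMe) has the highest proportion of known relatives, consistent with the remark above about smaller match lists.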
- Add DNA information to your genealogy database:
- Record the ancestors and cousins confirmed by your DNA in
your genealogy database.
- Use an event field or note tag in your database to track
people who are in both your own database and the DNA
databases.
- Add genealogy information to the online DNA databases:
- Export a GEDCOM file containing at least the ancestors of
each DNA subject and donate/upload it to all the DNA
websites so that matches can see a pedigree chart.
- Examples of pedigree charts: from Ancestral
Quest, AncestryDNA, ancestry.com,
FamilyTreeDNA and GEDmatch.
- For FamilyTreeDNA.com, include in the GEDCOM file any
known relatives already at FTDNA and the shared ancestors;
FTDNA will use these "linked relationships" to assign other
autosomal DNA matches to the most recent common ancestral
couple of the DNA subject and the linked relationship, and
will display paternal and maternal icons as appropriate in
the match list (example).
- Mark deceased ancestors as such, even if you do not know
the date of death, otherwise they may be deemed living,
privatised, and hidden from DNA matches who are also
descended from them.
- Add yourself to WikiTree.com with details of your DNA Tests,
which will automatically propagate to GEDmatch.
AncestryDNA tools
- Every new kit must be associated with a different email
address.
- Test Settings
- Tree Link
- Download Raw DNA Data
- Sharing Preferences (+ Add a person)
- Match list
- Different, sometimes conflicting, "Predicted relationship"
displays!
- Three binary filters:
- Unviewed (by you or any of your collaborators)
- Common ancestors
(speculative hints,
based on user-donated trees, donated by users who are
encouraged to guess anything they don't know, so to be
treated with caution)
- [The "Messaged" filter has been removed.]
- Notes
- Three dropdown filters:
- Private/public linked/unlinked trees
- Shared DNA
- Groups
- Shows the number of matches in each Group
- My suggestions for using the 25
custom groups (gold star and 24 coloured dots)
- e.g. starred matches = known relatives
- Search Matches
- by Match name or by Surname in matches' trees or by Birth
location in matches' trees, if the location is available on
the dropdown
- search results appear incomplete for new kits
- Sort by "Relationship" (i.e. by shared centiMorgans) or by
"Date" (actual match date invisible!)
- Match page
- Shared Matches (>20cM with both parties)
MyHeritage.com tools
FamilyTreeDNA.com tools
Using autosomal DNA shared matches, triangulation and phasing
An autosomal DNA match between W and Z is defined by a
list of half-identical regions (HIRs), e.g.
| Chr | B37 Start Pos'n | B37 End Pos'n | Centimorgans (cM) | SNPs | Segment threshold | Bunch limit | SNP Density Ratio |
| 2 | 13,913,190 | 33,668,178 | 23.6 | 4,645 | 172 | 103 | 0.38 |
| 3 | 171,684,515 | 189,106,781 | 28.8 | 3,853 | 211 | 126 | 0.36 |
| 4 | 149,733,739 | 177,340,086 | 29.0 | 5,258 | 192 | 115 | 0.36 |
| 6 | 37,548,744 | 43,836,930 | 11.5 | 1,747 | 210 | 126 | 0.42 |
- Some half-identical regions will be:
- half-identical by chance (zig-zagging between matches on
paternal and maternal chromosomes); or
- half-identical by omission (if there is no match at a
location not observed by one of the DNA labs).
- Most half-identical regions will contain:
- a shared segment common to one of W's chromosomes
and one of Z's chromosomes;
- and fuzzy boundaries (measurement error).
- Hence, some DNA matches will be false positives.
- Because matching thresholds are arbitrary, some DNA
non-matches will be false negatives.
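The idea of a half-identical region can be sketched in a few lines. This is a simplified illustration, not the algorithm of any DNA website: it treats two unphased genotypes as half-identical at a SNP when they have at least one allele in common, and looks for runs of consecutive SNPs where that holds. The genotypes, the function names and the `min_snps` threshold are all invented for the example; real comparisons also use centiMorgan thresholds.

```python
def half_identical(genotype_1, genotype_2):
    """True if two unphased genotypes (e.g. 'AG', 'GG') share an allele."""
    return bool(set(genotype_1) & set(genotype_2))


def half_identical_runs(kit_1, kit_2, min_snps):
    """Return (first, last) index pairs of runs of at least `min_snps`
    consecutive SNPs at which the two kits are half-identical."""
    runs = []
    start = None
    length = min(len(kit_1), len(kit_2))
    for i in range(length):
        if half_identical(kit_1[i], kit_2[i]):
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP compared
    if start is not None and length - start >= min_snps:
        runs.append((start, length - 1))
    return runs


# Invented genotypes for W and Z at six consecutive SNPs
w = ["AA", "AG", "CC", "CT", "GG", "TT"]
z = ["AG", "GG", "CT", "TT", "AA", "TT"]
print(half_identical_runs(w, z, min_snps=3))  # [(0, 3)]
print(half_identical_runs(w, z, min_snps=1))  # [(0, 3), (5, 5)]
```

Because the test at each SNP cannot tell which of the two chromosomes supplied the shared allele, a run can zig-zag between the paternal and maternal chromosomes, which is how regions that are half-identical by chance arise. Changing `min_snps` changes which runs are reported, illustrating why an arbitrary threshold produces both false positives and false negatives.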
For every autosomal DNA match, and for every autosomal DNA
half-identical region, one would like to assign both to an
ancestor. In particular:
- a known relative, and the segments shared with
him or her, can be assigned to the most distant known individual
ancestor through whom you know that you are related
to the match;
- a DNA match who is not a known relative can be
assigned to the most distant known individual
ancestor through whom you are likely to be related
to the match;
- a segment shared with a close match can initially be assigned
to the ancestor associated with that close match; but
- segments shared with multiple matches can potentially be
pushed back to more distant ancestors as we narrow down the part
of the pedigree chart from which they came (e.g. a segment
shared with a third cousin as well as a second cousin can be
assigned to a greatgrandparent as well as to a grandparent).
Some genetic genealogists have simple family trees and find it
more intuitive to assign matches and segments to ancestral couples
(the most recent common ancestral couple shared with the match)
rather than individuals, but:
- this approach ceases to be equivalent if there are half
relationships or double relationships; and
- each segment shared with the match is shared with only one of
the most recent common ancestral couple.
For example:
- you share two grandparents with your paternal first cousins;
but
- all of the DNA segments that you share with your paternal
first cousins descended to you through your father:
- some from your paternal grandfather; and
- the rest from your paternal grandmother; so
- the segment is inherited from [grandfather OR grandmother],
NOT from [grandfather AND grandmother].
My own family tree has numerous recent complications which force
me to think in terms of individuals rather than couples:
- my paternal grandfather was an identical twin;
- the identical twins married two sisters;
- the sisters' father married twice; and
- his two wives were first cousins.
Matches who are not known relatives can be tentatively assigned
to ancestors (or predicted) based on
- shared matches (on all the main DNA comparison websites);
and/or
- triangulated matches (on all the main DNA comparison websites
except ancestry.com).
I recommend using:
- stars (AncestryDNA, 23andMe, MyHeritage) to distinguish
- matches who are known relatives from
- matches who are not known relatives;
- coloured dots for ancestors:
- MyHeritage
provides:
- 30 coloured dots (Labels), exactly enough (in the
absence of pedigree collapse) for:
- 2 parents;
- 4 grandparents;
- 8 greatgrandparents; and
- 16 GGgrandparents.
- filtering by multiple labels selects matches with label A
OR label B
- AncestryDNA
provides:
- built-in groups for "Parent 1's side" and "Parent 2's
side" and for "Paternal side" and "Maternal side"
- the user is left to start Relationship assignment and
confirm which parent is which.
- 24 coloured dots (Groups), exactly enough for
- 0 grandparents;
- 8 greatgrandparents; and
- 16 GGgrandparents.
- filtering by multiple groups selects matches in group A
AND group B
- DNApainter
provides:
- an unlimited number of colour-coded groups
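The dot counts above follow from the doubling of ancestors in each generation. A small arithmetic sketch (the function names are invented for illustration, and pedigree collapse is ignored):

```python
def ancestors_in_generation(generation):
    """Number of ancestors `generation` steps back (1 = parents),
    in the absence of pedigree collapse."""
    return 2 ** generation


def dots_needed(first_generation, last_generation):
    """One dot per ancestor from first_generation to last_generation."""
    return sum(ancestors_in_generation(g)
               for g in range(first_generation, last_generation + 1))


print(dots_needed(1, 4))  # 30: parents through GGgrandparents
print(dots_needed(3, 4))  # 24: greatgrandparents and GGgrandparents only
```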
So I use different methodologies:
- on MyHeritage:
- every known relative has a star and one dot (for the most
distant individual ancestor, back to GGgrandparent, through
whom I am related)
- other DNA matches have a dot for every ancestor with whose
descendant there is a triangulated match
- but I remove more recent dots when I find triangulated
matches with more distant ancestors
- matches attributed to more distant ancestors than a
GGgrandparent have additional notes to compensate for the
limited number of dots
- on AncestryDNA:
- every known relative has a star and one dot for each
GGgrandparent from whom he or she is descended or through whom
I am related:
- fourth cousins (and more distant) have one dot
- third cousins (etc.) have two dots
- second cousins (etc.) have four dots
- first cousins (etc.) have eight dots
- siblings (etc.) have sixteen dots
- other DNA matches
- are in the groups for "Paternal side" and "Maternal side"
if I cannot attribute shared matches to an ancestor
beyond my parents
- just have a note if I cannot attribute shared
matches to an ancestor beyond my grandparents
- have a greatgrandparent dot if they have shared
matches attributed to a greatgrandparent
- have a GGgrandparent dot if they have shared
matches attributed to a GGgrandparent
- have a GGgrandparent dot and a note if they have shared
matches attributed to a more distant ancestor
- but I again remove more recent dots when I find shared
matches with more distant ancestors
- on DNApainter:
- I use a similar philosophy
- A star emoji can be
copied and pasted into the Match Name to mark known relatives.
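The number of GGgrandparent dots given to a known relative on AncestryDNA in the scheme above halves with each degree of cousinship. A minimal sketch of that rule, assuming full (not half or double) relationships; the function name and the convention that degree 0 stands for siblings are invented for illustration:

```python
def ggrandparent_dots(cousin_degree):
    """GGgrandparent dots for a known relative on AncestryDNA.

    cousin_degree 0 stands for siblings, 1 for first cousins, and so on.
    Each step further out halves the number of my 16 GGgrandparents
    through whom I am related to the match, with a minimum of one dot.
    """
    if cousin_degree < 0:
        raise ValueError("cousin_degree must be 0 or more")
    return max(1, 2 ** (4 - cousin_degree))


for degree in range(5):
    print(degree, ggrandparent_dots(degree))
```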
A counter-example
Matches who end up with multiple dots (e.g. four GGgrandparents):
- may be quite closely related to you; or
- may be from the same geographical area and related to several
of your ancestors just by coincidence.
There is an exception to every rule: not only shared but even
triangulated matches can sometimes arise by coincidence.
Consider these three marriages in the United Church of England and
Ireland (the established church) in Kilkeedy, County Limerick:
- My GGGgrandfather Thomas Parker married my GGGgrandmother Mary
Keas on 14 Sep 1831
- Thomas's brother Francis Parker married Margaret Smith on 3
Mar 1840
- Mary's sister Ellen Keas married Joseph Smith on 26 May 1841
- The two Smiths were also siblings.
- I imagine these three couples sitting around a circular dinner
table, each person with a spouse to one side and a sibling to
the other.
- These three couples produced three families, the members of
each of which were first cousins to the members of the other two,
but who didn't have a single common ancestral couple!
- If one took DNA from one member of each of these three
families, two Parkers and a Smith, then each of them as first
cousins would share many half-identical regions with both of the
others.
- There would be some regions in which:
- the two Parkers had an identical segment from a Parker
grandparent,
- one of the Parkers and the Smith would have an identical
segment from a Smith grandparent, and
- the other Parker and the Smith would have an identical
segment from a Keas grandparent.
- In this example, the three cousins will all "match" each other
genetically, but on closer examination it will be found that
there is no common ancestral couple of all three.
However, these close triangular marriage patterns are very rare.
You may still find two distant relatives, related to you through two
different ancestors, appearing as shared matches, because they are
descended from a third common ancestral couple whom they share with
each other, but whom neither shares with you.
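The Parker/Keas/Smith counter-example can be checked with simple set operations. In this sketch each family is represented by the two grandparental couples from whom its children descend; the labels "Parker couple", "Keas couple" and "Smith couple" are placeholders for the unnamed parents of each pair of siblings, not names taken from the records.

```python
from itertools import combinations

# Each family is descended from two of the three grandparental couples.
grandparents = {
    "Thomas Parker & Mary Keas": {"Parker couple", "Keas couple"},
    "Francis Parker & Margaret Smith": {"Parker couple", "Smith couple"},
    "Joseph Smith & Ellen Keas": {"Smith couple", "Keas couple"},
}

# Ancestral couples shared by each pair of families
pairwise = {
    (a, b): grandparents[a] & grandparents[b]
    for a, b in combinations(grandparents, 2)
}
for (a, b), shared in pairwise.items():
    print(f"{a} / {b}: share the {', '.join(sorted(shared))}")

# Ancestral couples shared by all three families at once
common_to_all = set.intersection(*grandparents.values())
print("Common to all three families:", common_to_all or "nobody")
```

Every pair of families shares exactly one couple, yet the intersection across all three is empty, which is the situation described above.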
Shared, or In Common With (ICW), matches
A group of three or more individuals who all meet the relevant
matching criteria with each other are likely to share a recent common ancestor (or, more
often, ancestral couple).
The matches shared with a new match are usually the first clue to solving this
puzzle of assigning the new match to an ancestor.
All of the DNA comparison
websites allow one to identify the shared matches of two individuals in some form or
another.
The matching criteria vary from one DNA comparison website to
another.
The stricter the matching criteria, the more significant the
shared matches.
FTDNA Family Finder
To find the shared matches of two individuals who match each
other:
- go to the match list of one of the individuals
- find the In Common/Not In Common icon opposite the name on the
right of the screen
- select In Common With from the dropdown
- matches are sorted by closeness of relationship to the
logged-in individual
- A can see that B matches C but cannot see the cM shared by B
and C.
- Many of the hidden shared cM figures will be as low as 8cM,
which even FTDNA does not use for family matching based on
assigned relationships.
- A's shared matches with B will be the same as B's shared
matches with A, but in a different order.
You will eventually identify a group of individuals, all of whom
you suspect descend from a single common ancestor (or ancestral
couple).
To see whether up to 10 individuals who match you also match each
other:
- Add them to the Selected Matches box on the Family Finder - Matrix page
- To find the desired surname in the Matches box, click on any
match and start typing the surname
- After finding the surname, ctrl-click on the desired
individuals
- Click the Add>> button to move all the selected
individuals to the Selected Matches box
To find the shared matches and shared cM of two or more individuals
who belong to the same project, whether or not they match each
other, it was formerly possible to:
However, the shared matches option was removed from this menu in
2021 and it is unclear whether or when it may be restored.
AncestryDNA
On each match page, there is a Shared Matches link. (Screenshot.)
The Shared Matches are those with Shared DNA of 20 cM or more
with both individuals.
So C can appear in the shared matches of A and B even if B does
not appear in the shared matches of A and C (if C shares more than 20cM with A but B
shares less than 20cM
with A).
Matches are sorted by closeness of relationship to the logged-in
individual.
A can see that B matches C but cannot see the cM shared by B and
C.
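The asymmetry described above follows directly from the 20 cM rule. A minimal sketch, assuming the rule works exactly as stated in these notes; the shared cM figures and the function name are invented for illustration and this is not AncestryDNA's own code:

```python
def shared_matches(viewer, match, shared_cm, threshold=20.0):
    """People sharing at least `threshold` cM with both viewer and match."""
    def cm(x, y):
        return shared_cm.get(frozenset((x, y)), 0.0)

    people = {person for pair in shared_cm for person in pair}
    return sorted(
        p for p in people - {viewer, match}
        if cm(viewer, p) >= threshold and cm(match, p) >= threshold
    )


# Invented shared cM figures for three testers
shared_cm = {
    frozenset(("A", "B")): 15.0,  # a match, but below 20 cM
    frozenset(("A", "C")): 30.0,
    frozenset(("B", "C")): 25.0,
}
print(shared_matches("A", "B", shared_cm))  # ['C']
print(shared_matches("A", "C", shared_cm))  # []
```

Here C is listed among the shared matches of A and B, but B is not listed among the shared matches of A and C, because B shares less than 20 cM with A.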
MyHeritage
When the Review DNA Match page
eventually completes loading, the Shared DNA Matches section:
- reveals that "you share the following ... DNA Matches"
- lists the top 10 shared matches
- allows further shared matches to be slowly loaded, 10 at a
time
- may demand money, depending on when you uploaded or what
subscriptions you have purchased
Matches are sorted by the sum
of the centiMorgans shared with the two individuals.
A can see not only that B matches C but also the cM shared by B and
C.
A's shared matches with B will be exactly the same as B's shared
matches with A, in the same order.
GEDmatch
To find the shared matches of two individuals, whether or not
they match each other, use the "People who match both, or 1 of 2
kits" tool on the main menu.
This lists shared matches of Kit 1 and Kit 2, but the user sets
the cM threshold of largest segment and cM threshold of
total matching segments.
Matches are sorted by closeness of relationship to Kit 1.
To sort by closeness of relationship to Kit 2, re-use the tool with
the kits in reverse order.
The Multi Kit Analysis menu
(now available only to Tier 1 subscribers) can run an Autosomal
Matrix Comparison on up to 100 kits.
While the FTDNA user matrix shows only whether or not kits are
deemed to match, the FTDNA administrator matrix and the GEDmatch
matrix both show the shared centiMorgans.
Triangulated matches
Triangulation and phasing are really opposite
sides of the same coin. If V is half-identical on the same
region with W and Z, then there are two possibilities:
- W and Z are half-identical to each other on this region, in
which case V, W and Z probably inherited an identical segment in
this region from a single common ancestor and the relationship
can be described as triangulated;
or
- W and Z are not
half-identical to each other on this region, in which case V is
probably related to W on V's paternal side and V is probably
related to Z on V's maternal side, or vice versa, and V's autosomal DNA in this
region can be phased.
The ADSA tool by Don Worth at DNAGedcom
provides a graphical representation of triangulation
and phasing, similar to DNApainter.
Overlapping matches can represent:
- triangulated match: one segment shared by all three
parties, inherited from a common ancestor of all three;
- phased match: two segments, one paternal, one maternal,
each shared by two of the three parties;
- non-matches: an overlap smaller than the relevant
matching threshold, which might be triangulated or phased or
half-identical by chance, if it could be examined at a lower
matching threshold;
- uncomparable: data copied from different websites for
individuals who have not uploaded their data to all the
websites.
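The four categories of overlapping match listed above can be expressed as a small decision procedure. This is a hedged sketch only: regions are (chromosome, start, end) tuples, the threshold is a hypothetical minimum length in base pairs (the websites actually work with centiMorgan thresholds), and all names and positions are invented for illustration.

```python
def overlap(seg_1, seg_2):
    """Overlap of two (chromosome, start, end) regions, or None."""
    if seg_1[0] != seg_2[0]:
        return None
    start, end = max(seg_1[1], seg_2[1]), min(seg_1[2], seg_2[2])
    return (seg_1[0], start, end) if start < end else None


def classify(v_w, v_z, w_z_regions, min_length):
    """Classify the overlap of V's matches with W and with Z.

    v_w, v_z: regions where V is half-identical to W and to Z.
    w_z_regions: regions where W and Z are half-identical to each other,
                 or None if W and Z cannot be compared on one website.
    min_length: hypothetical matching threshold, in base pairs here.
    """
    common = overlap(v_w, v_z)
    if common is None or common[2] - common[1] < min_length:
        return "non-match"
    if w_z_regions is None:
        return "uncomparable"
    for region in w_z_regions:
        shared = overlap(common, region)
        if shared is not None and shared[2] - shared[1] >= min_length:
            return "triangulated"
    return "phased"


# Invented regions on chromosome 2
v_w = (2, 14_000_000, 33_000_000)
v_z = (2, 20_000_000, 40_000_000)
print(classify(v_w, v_z, [(2, 18_000_000, 35_000_000)], 5_000_000))  # triangulated
print(classify(v_w, v_z, [], 5_000_000))                             # phased
print(classify(v_w, v_z, None, 5_000_000))                           # uncomparable
```

As in the text, "triangulated" and "phased" are only the probable interpretations; a region that is half-identical by chance can still mislead.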
Triangulation groups
The ultimate objective is to collect DNA matches into triangulation groups. A
triangulation group is a set of three or more people who are all
half-identical to each of the other group members on overlapping
regions.
The more individuals who are added to the triangulation group,
the smaller the overlap may become.
The members of a triangulation group of three or more individuals are very likely to share a recent
common ancestor (or, more often, ancestral couple).
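Why the overlap may shrink as members are added can be seen by computing the region common to all of the group's half-identical regions: it runs from the latest start to the earliest end. A minimal sketch with invented positions on a single chromosome (the function name is also invented):

```python
def common_overlap(regions):
    """Region common to all (start, end) regions on one chromosome,
    or None if they do not all overlap."""
    start = max(r[0] for r in regions)
    end = min(r[1] for r in regions)
    return (start, end) if start < end else None


# Invented half-identical regions shared with successive group members
regions = [
    (10_000_000, 40_000_000),
    (15_000_000, 38_000_000),
    (22_000_000, 45_000_000),
]
for size in range(1, len(regions) + 1):
    print(size, common_overlap(regions[:size]))
```

Each added member can only leave the common region unchanged or make it smaller.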
The triangulated matches that I share with a new match are
usually the second clue
to identifying through which of my most distant known ancestors I
am related to the new match. Some
of the DNA comparison websites allow one to identify the triangulated matches of two
individuals in some form or another.
FTDNA Family Finder
One had to be a little devious to find triangulated matches
directly at FTDNA.
The Linked Matches feature (previously known as Assigned
Relationship and before that as Linked Relationship, and currently hidden on
the "Sort by" menu) was designed to identify matches who triangulate
with known relatives, assign them to the most distant individual
ancestor through whom they are related to the DNA subject, and then
dump them all together again in paternal and maternal buckets, even
if they have been assigned to an ancestor more distant than the
father or mother.
You may want to have two Family Finder kits, e.g. a kit based on
swabs sent to FTDNA with a full pedigree chart for paternal/maternal
phasing; and a kit based on an autosomal transfer from another
laboratory with a minimal pedigree chart for identifying
triangulated matches (e.g. my B95575).
AncestryDNA
AncestryDNA refuses to provide any way of identifying triangulated
matches.
MyHeritage
MyHeritage shared match lists
include a symbol identifying which of these matches are
triangulated.
Make sure that you have not opted out of showing shared segments.
There appears to be no way to filter the list of shared matches
to show only the triangulated matches.
GEDmatch
- The Tier 1 Multi Kit Analysis
(MKA) menu allows you to:
- generate an autosomal DNA comparison matrix
- search for triangulations involving Kit 1 and any two or
more of Kits 2,3,4,...
Other third party tools and websites
The major DNA websites do not like the load imposed on their servers
by the more powerful tools designed by third party developers.
Y-DNA and surname projects
Men and our surnames are defined by genetic signatures comprising
chronological sequences of SNP mutation labels like:
- R-M269>L23>L51>L151>U106>Z381>Z301>L48>Z9>Z30>Z27>Z345>Z2>Z7>Z8>Z338>Z11>Z12>Z8175>FGC12057>Z383>FGC29367>BY14499
which have superseded the original Simplified Tree of Y-Chromosome Haplogroups.
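A sequence like this is easy to handle programmatically. The following minimal Python sketch splits the path into its SNP labels, builds the short form of the terminal label, and finds the most recent SNP shared with a second path. The function names are my own, and the second path is hypothetical: it simply copies the first ten labels and adds a made-up label (EXAMPLE1) to show a branch point.

```python
def parse_snp_path(path):
    """Split 'R-M269>L23>...' into (haplogroup letter, SNP labels, oldest first)."""
    labels = path.strip().split(">")
    haplogroup, _, first_snp = labels[0].partition("-")
    return haplogroup, [first_snp] + labels[1:]


def shared_prefix(snps_a, snps_b):
    """Return the SNPs that two paths have in common before they diverge."""
    common = []
    for a, b in zip(snps_a, snps_b):
        if a != b:
            break
        common.append(a)
    return common


# The path quoted above.
path = ("R-M269>L23>L51>L151>U106>Z381>Z301>L48>Z9>Z30>Z27>Z345>Z2>Z7>Z8"
        ">Z338>Z11>Z12>Z8175>FGC12057>Z383>FGC29367>BY14499")
haplogroup, snps = parse_snp_path(path)
terminal = f"{haplogroup}-{snps[-1]}"  # short form, e.g. R-BY14499

# Hypothetical second path, branching off after Z30 (EXAMPLE1 is made up).
other_path = "R-M269>L23>L51>L151>U106>Z381>Z301>L48>Z9>Z30>EXAMPLE1"
_, other_snps = parse_snp_path(other_path)
common = shared_prefix(snps, other_snps)

print(len(snps), terminal, common[-1])
```

The last label in the shared prefix marks the most recent common point of the two male lines on the tree; everything after it is specific to one branch.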
FTDNA currently gives:
- a most recent confirmed SNP to men who purchase Big Y-700; or
- an older SNP to men who purchase only Y-STR products and/or
Family Finder (still being rolled out to the latter customers)
The sequence of SNPs can be explored by:
- climbing back through time with the Discover tool,
found by clicking on the "Confirmed Y-DNA Haplogroup" Badge at
the bottom of the right-hand column of the Home/dashboard page.
- climbing forward through time with the public Y-DNA Haplotree.
FTDNA uses the CE/BCE notation in place of the AD/BC
notation for its estimated years.
- Y-DNA tools and websites
- The Big Tree (like GEDmatch for men who have bought Big
Y-700)
- Y-DNA projects can be
- Once you have your initial Y-DNA results (or a known male-line
relative's Y-DNA results), you can join appropriate haplogroup
projects.
- Some older project member and project administrator features
have been disabled because of numerous changes prompted by GDPR
fears:
- You must Opt in to Sharing on the PROJECT PREFERENCES page or your
pseudonymized DNA results and ancestor information will be
missing from the public results pages.
- You can also choose from that page whether to give each
project administrator Minimum, Limited or Advanced access to
your kit; reducing access to Minimum pretty much eliminates
all the benefits of project membership.
- It is also recommended that you set Y-DNA Match Levels to
All Levels on the PRIVACY & SHARING page.
- If there is no surname project for your surname and you are
happy to deal with the spam risk, then you can apply to set up
your own project by following a simple five-step application process (which actually
consists of only four steps!).
- Every project has an activity feed for discussions between
members and administrators, which can be used by administrators
to avoid having to answer the same frequently asked questions
repeatedly via individual emails.
- Project administrators have valuable tools, including:
- a subgroup editor to arrange members on
the Y-DNA results pages, e.g. Clare Roots or Clancy surname
- subgroups are sorted alphabetically on the results pages,
so bear this in mind when choosing names
- to force subgroups into a desired order, number them with
leading zeroes, 001, 002, 003, etc.
- criteria for grouping can include:
- surname
- geography
- haplotree position, whether
- confirmed by FTDNA
- predicted by FTDNA
- predicted by project administrator
- desire to see STR differences highlighted
- Subgroup Names (which are visible on the results pages)
were originally truncated at 161 characters, without
warning. In November 2021, project administrators were
informed that the character limit for the Subgroup Name
field for new subgroups is now 200 characters. Existing
subgroups could not be renamed to more than 161 characters,
but all the members could be moved to a new subgroup with a
longer name. So keep these names as short as possible with
no unnecessary spacing or punctuation.
- Subgroup Descriptions (which are visible to the project
administrator(s) only) appear to be truncated at 973
characters, without warning, and despite the false assurance
of scroll bars in the editor.
- a Y-DNA genetic distance calculator:
- this uses higher genetic distance thresholds than the matching
algorithm: 7/37 instead of 4/37; 25/67 instead of 7/67; and 40/111
instead of 10/111
- examples: R-M222 for a man with one Y-DNA37
match with no SNP test; R-FGC29367 for a man with no Y-DNA111
match.
- a public website editor to publish information under any or
all of the following headings:
- Background
- Goals
- News
- Updates
- Bulletin
- Results
- Code of Conduct
- FAQ
- Project members can be recruited in many ways:
- FTDNA will send an email on behalf of an administrator, no
more than once every six months, to all customers with the
relevant surname who have opted to receive such emails.
- Administrators can see project members' matches and can
email them directly to invite them to join.
- A clan or surname organisation or one-name-study is
ideally positioned to run online and offline recruitment
drives.
- See here for all the technical details
of how and why to upload your DNA data and pedigree charts to
the various websites.
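The subgroup naming advice given above for project administrators (alphabetical sorting, leading zeroes, and the character limit) can be illustrated with a short Python sketch. It assumes that the results pages sort subgroup names by plain character-by-character comparison, which I have not confirmed; the subgroup names and the helper function are hypothetical, and the 200-character limit is the figure quoted above for new subgroups.

```python
SUBGROUP_NAME_LIMIT = 200  # limit for new subgroups, as quoted in these notes


def numbered_names(names, width=3):
    """Prefix each name with a zero-padded number: 001, 002, 003, ..."""
    return [f"{i:0{width}d} {name}" for i, name in enumerate(names, start=1)]


# Hypothetical subgroup names, listed in the order the administrator wants.
names = [f"Subgroup {letter}" for letter in "ABCDEFGHIJKL"]

# Without leading zeroes, "10 ..." sorts before "2 ...".
unpadded = [f"{i} {name}" for i, name in enumerate(names, start=1)]
print(sorted(unpadded)[:5])

# With leading zeroes, alphabetical order matches the desired order.
padded = numbered_names(names)
print(sorted(padded) == padded)

# Flag any name that would exceed the character limit.
too_long = [name for name in padded if len(name) > SUBGROUP_NAME_LIMIT]
print(too_long)
```

Three digits allow for up to 999 subgroups; a width of two would be enough for most projects, but the width must be chosen before numbering starts, because mixing widths reintroduces the sorting problem.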