Genetic Genealogy
6 p.m. Tuesday 12 November 2019 and Tuesday 19
November 2019
Room 5052, Arts Building, TCD
***** NB: FamilyTreeDNA kits will be
available after this talk for anyone interested via the DNA
Outreach IRL project *****
Review of beginners'
offspring |
sperm | Y chromosome or X chromosome | 22 paternal autosomal chromosomes |
egg | X chromosome | 22 maternal autosomal chromosomes | mitochondria |
DNA component | Inheritance path | Inherited by |
Y chromosome | From father only (and only if male) | males only |
autosomal chromosomes (autosomes) | Equally from both parents | everyone |
X chromosome(s) | Unequally from both parents | males x1, females x2 |
mitochondrial DNA | From mother only | everyone |
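The inheritance paths in the table above can be restated as a small lookup. This is only an illustrative sketch: the function name and the wording of the returned strings are hypothetical, and it encodes nothing beyond the four rows of the table.

```python
def inherited_components(child_sex):
    """Return {DNA component: source} for a child of the given sex.

    child_sex must be 'male' or 'female'. The function name and the
    wording of the values are illustrative, not from any DNA website.
    """
    if child_sex not in ("male", "female"):
        raise ValueError("child_sex must be 'male' or 'female'")

    # Everyone receives autosomes from both parents and mitochondria
    # from the mother only.
    components = {
        "autosomal chromosomes": "both parents (22 from each)",
        "mitochondrial DNA": "mother only",
    }
    if child_sex == "male":
        # A son receives his father's Y and a single X from his mother.
        components["Y chromosome"] = "father only"
        components["X chromosome(s)"] = "mother only (x1)"
    else:
        # A daughter receives one X from each parent and no Y.
        components["X chromosome(s)"] = "one from each parent (x2)"
    return components


for sex in ("male", "female"):
    print(sex, inherited_components(sex))
```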
- Without recombination and mutations, all of us would have
identical DNA.
- Autosomal DNA is the cheapest component to analyse and has
rapidly become the most widely used in genealogy:
- cousin matching using autosomal DNA will identify relationships
out to third cousin on all sides of your family, and may identify more
distant relationships;
- the technology cannot separate the paternal and maternal
- identical twins, parent/child relationships and most
full-sibling relationships can be identified unambiguously;
- for more distant relationships, probabilities can be assigned to
the various possibilities.
- Y-DNA has been widely used for one-name studies or surname
projects for
much longer, but has also seen rapid recent scientific advances:
- cousin matching using Y-DNA will identify relationships between
men of the same surname and same genetic origin, but may also identify
surname/DNA switches and/or more distant
relationships, predating the era of surnames (1014+);
- many surnames have multiple genetic origins, including
occupational surnames (Smith, Miller, Potter, etc.).
- Targeted mitochondrial DNA and X-DNA comparisons can be
used to solve
more specialised
- The DNA companies will turn your spit or swabs into a data
file, which can be compared with data files from the DNA of other
- Our objectives are:
- to identify the most recent common ancestor or ancestral
couple shared with each DNA match,
- starting with the closest matches and those with shared
surnames and/or shared locations,
- thereby confirming that the DNA match is definitely a
documented cousin or
closer relative,
- enabling either or both cousins to learn more about their
shared ancestors, and
- confirming or refuting (NPE) the archival and oral
evidence about each cousin's ancestry.
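To see why autosomal matching works best out to about third cousins, it helps to look at how quickly the average amount of shared DNA falls with each degree of cousinship. The sketch below uses the standard expectation that full nth cousins share on average (1/2)^(2n+1) of their autosomal DNA. The total of 6800 cM is an assumed round figure, since the total reported differs between companies, and these are averages only: real cousins can share noticeably more or less, and distant cousins may share no detectable DNA at all.

```python
# Assumed total autosomal length in centiMorgans; companies report
# somewhat different totals, so treat this as a placeholder.
TOTAL_CM = 6800.0


def expected_shared_fraction(degree):
    """Average fraction of autosomal DNA shared by full nth cousins.

    degree=1 is first cousins (1/8), degree=2 second cousins (1/32), etc.
    """
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    return 0.5 ** (2 * degree + 1)


def expected_shared_cm(degree, total_cm=TOTAL_CM):
    """Average shared centiMorgans for full nth cousins."""
    return expected_shared_fraction(degree) * total_cm


for degree in range(1, 6):
    print(
        f"cousin degree {degree}: "
        f"{expected_shared_fraction(degree):.4%} "
        f"(about {expected_shared_cm(degree):.1f} cM on average)"
    )
```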
v. Anonymity
- There is a trade-off between:
- increasing your chances of finding
long-lost cousins and ancestors (and being found by long-lost cousins);
- maintaining the privacy of your family history research
and DNA results.
- If you keep your DNA
results or known family tree private, then nobody will be able to find
you and you will not
be able to find any DNA matches.
- If you want to be found, then you must
let your potential cousins see your DNA results and your known
family tree.
- If you give your matches no information, then they cannot
help you.
- Some customers of the DNA companies appear to wish to
maintain a
certain degree of privacy and anonymity.
- Others find it paradoxical that
those trying to identify their anonymous ancestors can be so concerned
about anonymising their own identity.
- (for all customers) and (for
customers who opt in) explicitly allow familial
comparisons of DNA recovered from a crime scene to identify a
perpetrator of a violent crime against another individual.
- Using the MyHeritage DNA Services for law
enforcement purposes ... is currently "strictly prohibited".
The basic rules for successful use of the DNA websites include the
following:
- Reveal the DNA subject's birth surname:
- Most people inherit DNA with
their birth surname, so identify yourself as a minimum by
your birth surname with an initial or a title, e.g., P Waldron or Mr
Waldron or Miss Durkan.
- Reveal the gender of the person who provided the DNA sample:
- A woman does not have a Y chromosome, so may ask a male
relative with the relevant surname to swab:
a father, brother, nephew, cousin, etc.,
if her interest is in her maiden
surname; or
a husband, son, brother-in-law,
father-in-law, etc., if her interest is in her married surname.
Valuable additional
inferences can potentially be drawn once it is known whether two X
chromosomes (female) or one X chromosome and one Y chromosome (male)
are potentially available for comparison.
You must NOT attach a
female name to a male DNA sample (or vice versa), as
this causes untold confusion.
Be especially careful not to inadvertently link a male's Y-DNA results
with a female's autosomal DNA upload at where
error-checking does not look for this.
Also take care not to link a male DNA sample to a female's pedigree
chart (or vice versa).
- Avoid providing irrelevant information:
- Your first name, married
surname, adopted
surname or marital status reveal nothing about your DNA, so you may
keep these private if you wish.
- Avoid pseudonyms:
- They reduce the chances that your matches will bother to
look at your family
tree, contact you or share the information about your ancestry that
they have and that you do not have.
- Use a photograph:
- If you upload a photograph to your AncestryDNA account
before you receive your initial results, then the photograph
(hyperlinked to the match details) will appear on the AncestryDNA Insights page of all
your matches as long as you remain in their eight newest matches with
- Be consistent and avoid unnecessary confusion:
- A real example (further anonymised):
- Ancestry username: tara1234
- AncestryDNA samples from mother and daughter (per
- linked to pedigree charts of an aunt and niece
- appear to matches as M.R. (managed by tara1234) and
D.C. (managed by tara1234)
- neither of these is the real set of initials
- the daughter is an AncestryDNA match to her mother's
probable 4th cousin, but the mother is not (false negative? fuzzy
- only one of the two kits is at GEDmatch
- GEDmatch alias and e-mail address both begin with Molly
- Molly is the dog's name
- it took me 300 days after the upload to GEDmatch to
associate the AncestryDNA and GEDmatch identities
- Keep all your DNA-related correspondence in a single
searchable e-mail archive
- Use the internal messaging system and
AncestryDNA/MyHeritage/23andMe or Facebook messages only to exchange
e-mail addresses.
Fish in all the gene pools
There are a growing number of DNA comparison websites and
those interested in finding long-lost relatives should be in all of
them. While helping an adoptee who is married to a Murphy, I coined
what I have called Murphy's
Law of Genetic Genealogy:
If there are N DNA comparison websites and your DNA
is in N-1 of them, then your most important match will be in the Nth.
In the words of another widely used metaphor, there
are many
online gene pools out there and there are many people who are in only
one or two of them; for maximum effect, particularly if you are trying
to find an unknown ancestor who has left no paper trail, you must fish
in all of these gene pools.
You must spit for the websites which do not allow data uploads:
You must download your data file from the website of whichever
laboratory you use and upload it to the websites which
do allow data uploads:
You must link your DNA
match list and
your pedigree chart
and share them on the major autosomal DNA comparison websites:
- Add DNA information to your genealogy database:
- Record the ancestors and cousins confirmed by your DNA
in your genealogy database.
- Use an event field or note tag in your database to identify
people who are in both your own database and the DNA databases.
- Add genealogy information to the online DNA databases:
- Export a GEDCOM file containing at least the ancestors
of each DNA subject and upload it to all the DNA websites so that
matches can see a pedigree chart.
- Examples of pedigree charts: from Ancestral
Quest, AncestryDNA,,
and GEDmatch.
- For FTDNA, include in the GEDCOM file any third cousins or
closer already at FTDNA and the shared ancestors; FTDNA will use these
linked relationships to assign other matches to the DNA subject's
paternal and
maternal sides (example).
- Mark deceased ancestors as such, even if you do not
know the date of death, otherwise they may be deemed living,
privatised, and hidden from DNA matches who are also descended from
them.
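The two GEDCOM recommendations above (export at least the direct ancestors of each DNA subject, and mark deceased ancestors as deceased) can be checked before exporting. The sketch below does not read or write GEDCOM; it assumes a hypothetical in-memory pedigree in which each person ID maps to a (father, mother) pair, plus a set of IDs already flagged as deceased. All names and IDs are made up for illustration.

```python
def ancestors_of(subject, parents):
    """List the subject and all known direct ancestors, nearest first.

    parents maps a person ID to a (father_id, mother_id) tuple, using
    None for an unknown parent. This structure is hypothetical.
    """
    found = []
    seen = set()
    queue = [subject]
    while queue:
        person = queue.pop(0)
        if person is None or person in seen:
            continue
        seen.add(person)
        found.append(person)
        father, mother = parents.get(person, (None, None))
        queue.extend([father, mother])
    return found


def possibly_hidden(subject, parents, deceased):
    """Ancestors not flagged as deceased, who may be treated as living."""
    return [
        person
        for person in ancestors_of(subject, parents)
        if person != subject and person not in deceased
    ]


# Made-up pedigree: "uncle" and "cousin" are not direct ancestors of "me".
pedigree = {
    "me": ("dad", "mum"),
    "dad": ("pgf", "pgm"),
    "mum": (None, "mgm"),
    "cousin": ("uncle", None),
}
flagged_deceased = {"pgf", "pgm", "mgm"}

print(ancestors_of("me", pedigree))
print(possibly_hidden("me", pedigree, flagged_deceased))
```

Anyone returned by `possibly_hidden` who has in fact died should be marked deceased in the genealogy database before the GEDCOM file is exported.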
These tools are free to all users:
- User Registration
- Generic Uploads (23andme, FTDNA, AncestryDNA, most others)
- Upload GEDCOM (Fast)
- 'One-to-many' matches
- 'One-to-one' compare
- People who match one or both of 2 kits
- Are your parents related? (e.g. T409076)
If you log in using one browser tab, then you
can open
these Tier 1 tools (USD10/month) in another tab:
- the Multi Kit Analysis menu allows you to:
- generate an autosomal DNA comparison matrix
- search for triangulations involving Kit 1 and any two
or more of Kits 2,3,4,...
- the Triangulation
tool allows you to find all triangulation groups for the selected kit
at the selected thresholds
- the Segment Search tool allows more
flexible manual investigation of phasing and triangulation
- the Lazarus tool attempts to resurrect
the DNA of a deceased ancestor
- Every new kit must be associated with a different e-mail
address
- Test Settings
- Tree Link
- Download Raw DNA Data
- Sharing Preferences (+ Add a person)
- Match list
- Sort by "Relationship" (not as exact as appears) or
"Date" (actual match date invisible!)
- Groups
- Shows the number of matches in each Group
- Suggestions for using the 25
custom groups (gold star and 24 coloured dots)
- e.g. starred matches = known relatives
- Filters
- Does not show the number of matches in each Filter
- Common ancestors (speculative hints, to be treated
- Search Matches
- by Match name or by Surname in matches'
trees or by Birth location in matches' trees, if
the location is available on the dropdown
- search results appear incomplete for new kits
- Match page
third party tools and websites
- Autosomal tools and websites
- Y-DNA tools and websites
autosomal DNA shared matches, triangulation and phasing
Shared, or In Common With (ICW), matches
A group of three or more individuals who all meet the relevant
criteria with each other are likely
to share a recent common ancestor (or,
more often, ancestral couple).
When I find a new match, I am usually anxious to identify
the most distant known ancestor through whom I am related to the
new match.
The matches that I share with the new match are usually the first clue to
solving this puzzle.
of the DNA
comparison websites allow one to identify the shared matches of
two individuals in
some form
or another.
The matching criteria vary from one DNA comparison website to
another.
The stricter the matching criteria, the more significant the
shared matches.
Family Finder
To find the shared matches of two individuals who match each
other:
- go to the match list of one of the individuals
- tick the box opposite the other individual
- click the In Common With button on the 5th line from the
top of the page
- matches are sorted by closeness of relationship to the
logged-in individual
- A can see that B matches C but cannot see the cM shared by
B and C.
- A's shared matches with B will be the same as B's shared
matches with A, but in a different order.
You will eventually identify a group of individuals, all of
whom you
suspect descend from a single common ancestor (or ancestral couple).
To see whether up to 10 individuals who match you also match
each other:
- Add them to the Selected Matches box on the Family Finder - Matrix page
- To find the desired surname in the Matches box, click on
any match and start typing the surname
- After finding the surname, ctrl-click on the desired
individuals
- Click the Add>> button to move all the
selected individuals to the Selected Matches box
To find the shared matches and shared cM of two or more individuals who
belong to the same project, whether or not they match each other:
On each match page, there is a Shared Matches link. (Screenshot.)
The Shared Matches are those with Shared DNA of 20 cM or more
with both individuals.
So C can appear in the shared matches of A and B even if B
does not appear in the shared matches of A and C (if C shares more than
20 cM with A but B shares less
than 20 cM with A).
Matches are sorted by closeness of relationship to the
logged-in individual.
A can see that B matches C but cannot see the cM shared by B
and C.
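The 20 cM rule described above, and the asymmetry it produces, can be sketched with a few lines of set filtering. The pairwise cM figures are invented, the function names are hypothetical, and shared cM with the logged-in individual is used here as a simple stand-in for "closeness of relationship".

```python
SHARED_MATCH_THRESHOLD_CM = 20.0  # threshold quoted in the text above


def shared_cm(cm, x, y):
    """Shared cM between x and y, or 0.0 if no match is recorded."""
    return cm.get(frozenset((x, y)), 0.0)


def shared_matches(cm, viewer, match, threshold=SHARED_MATCH_THRESHOLD_CM):
    """People sharing at least `threshold` cM with both viewer and match.

    Results are ordered by cM shared with the viewer, largest first.
    """
    people = {person for pair in cm for person in pair}
    candidates = [
        person
        for person in people - {viewer, match}
        if shared_cm(cm, viewer, person) >= threshold
        and shared_cm(cm, match, person) >= threshold
    ]
    return sorted(
        candidates,
        key=lambda person: (-shared_cm(cm, viewer, person), person),
    )


# Invented figures: A and B match each other, but on less than 20 cM.
pairwise_cm = {
    frozenset(("A", "B")): 12.0,
    frozenset(("A", "C")): 45.0,
    frozenset(("B", "C")): 30.0,
}

print(shared_matches(pairwise_cm, "A", "B"))  # C is listed
print(shared_matches(pairwise_cm, "A", "C"))  # B is not listed
```

With these figures, C shows up when A looks at B's match page, but B is absent when A looks at C's match page, because A and B share less than 20 cM.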
When the Review DNA Match page eventually
completes loading, the Shared DNA Matches section:
- reveals that "you share the following 1,532 DNA Matches"
- lists the top 10 shared matches
- allows further shared matches to be slowly loaded, 10 at a
time
- may demand money, depending on when you uploaded or what
subscriptions you have purchased
Matches are sorted by the sum
of the centiMorgans shared
with the two individuals.
A can see not only that B matches C but also the cM shared by B and C.
A's shared matches with B will be exactly the same as B's shared
matches with A, in the same order.
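Sorting by the sum of the two shared amounts is what makes the list identical from both sides, whereas sorting by closeness to the logged-in individual gives each viewer a different order. A minimal sketch with invented cM values follows; the function names, and the use of the match name to break ties, are assumptions made only for illustration.

```python
def sort_by_sum(cm_with_a, cm_with_b):
    """Shared matches ordered by total cM shared with both individuals."""
    shared = cm_with_a.keys() & cm_with_b.keys()
    return sorted(
        shared,
        key=lambda person: (-(cm_with_a[person] + cm_with_b[person]), person),
    )


def sort_by_viewer(cm_with_viewer, cm_with_other):
    """Shared matches ordered by cM shared with the viewer only."""
    shared = cm_with_viewer.keys() & cm_with_other.keys()
    return sorted(shared, key=lambda person: (-cm_with_viewer[person], person))


# Invented match lists for A and B (match name -> shared cM).
matches_of_a = {"C": 100.0, "D": 40.0, "E": 60.0}
matches_of_b = {"C": 20.0, "D": 150.0, "F": 30.0}

# The sum ordering is the same whichever individual is logged in.
print(sort_by_sum(matches_of_a, matches_of_b))
print(sort_by_sum(matches_of_b, matches_of_a))

# The per-viewer ordering differs between the two individuals.
print(sort_by_viewer(matches_of_a, matches_of_b))
print(sort_by_viewer(matches_of_b, matches_of_a))
```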
To find the shared matches of two individuals, whether or not
they match each
other, use the "People who match both, or 1 of 2 kits" tool on the many
This lists shared
matches of Kit 1 and Kit 2, no matter how much DNA or how little DNA
Kit 1 shares with
Kit 2.
Matches are sorted by closeness of relationship to Kit 1.
To sort by closeness of relationship to Kit 2, re-use the tool with the
kits in reverse order.
If you log in using one browser tab, and open
the Multi Kit Analysis menu in another
tab (via a hyperlink or bookmark), then you can run an Autosomal Matrix
Comparison on up to 100 kits.
While the FTDNA user matrix shows only whether or not kits match, the
FTDNA administrator matrix and the GEDmatch matrix show the shared cM.
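The difference between a matrix that shows only whether kits match and one that shows shared cM can be illustrated as follows. This is a toy sketch with invented kit labels and cM values; it does not reproduce the output format of any website.

```python
def cm_matrix(kits, cm):
    """Square matrix of shared cM; None on the diagonal, 0.0 for no match."""
    return [
        [
            None if row == col else cm.get(frozenset((row, col)), 0.0)
            for col in kits
        ]
        for row in kits
    ]


def match_matrix(kits, cm):
    """Square matrix of True/False showing only whether each pair matches."""
    return [
        [None if value is None else value > 0.0 for value in row]
        for row in cm_matrix(kits, cm)
    ]


# Invented kits and pairwise shared cM.
kits = ["K1", "K2", "K3"]
pairwise_cm = {
    frozenset(("K1", "K2")): 55.0,
    frozenset(("K2", "K3")): 18.5,
}

for label, matrix in (
    ("shared cM", cm_matrix(kits, pairwise_cm)),
    ("match yes/no", match_matrix(kits, pairwise_cm)),
):
    print(label)
    for kit, row in zip(kits, matrix):
        print(" ", kit, row)
```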
Triangulation and phasing
Triangulation and phasing are
really opposite sides of the same coin. If V is
half-identical on the same region with W and Z, then there are two
possibilities:
- W and Z are half-identical to each other on this region, in
which case V, W and Z probably inherited an identical segment in this
region from a single common ancestor and the relationship can be
described as triangulated;
- W and Z are not
half-identical to each other on this region, in which case V is
related to W on V's paternal side and V is probably related to Z on V's
maternal side, or vice
versa, and V's autosomal DNA in this region can be phased.
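The two cases above amount to a simple decision on overlapping segments. The sketch below is a simplification under stated assumptions: segments are given as chromosome, start and end positions; any overlap at all counts, with no minimum length or cM threshold; and the class and function names are hypothetical. As in the text, both outcomes are probable rather than certain.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region common to both segments, or None if there is none."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def classify(v_w: Segment, v_z: Segment, w_z_segments: List[Segment]) -> str:
    """Classify V's matches with W and Z on a region they both cover.

    v_w and v_z are V's half-identical segments with W and with Z;
    w_z_segments are all segments on which W and Z match each other.
    """
    region = overlap(v_w, v_z)
    if region is None:
        return "no shared region"
    if any(overlap(region, seg) is not None for seg in w_z_segments):
        return "triangulated"  # probably one common ancestor
    return "phased"  # W and Z are probably on opposite parental sides of V


v_w = Segment("7", 10_000_000, 40_000_000)
v_z = Segment("7", 25_000_000, 60_000_000)

print(classify(v_w, v_z, [Segment("7", 20_000_000, 45_000_000)]))
print(classify(v_w, v_z, []))
```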
The ADSA tool by Don Worth at DNAGedcom
provides a graphical
representation of triangulation and phasing.
The ultimate objective is to collect DNA matches into triangulation groups.
A triangulation group is a set of three or more people who are all
half-identical to each of the
other group members on overlapping regions.
The more individuals who are added to the triangulation group,
the smaller the overlap may become.
The members of a triangulation group of three or more individuals are very likely to share
a recent common ancestor (or,
more often, ancestral couple).
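The remark that the overlap may shrink as members are added follows directly from intersecting intervals. A minimal sketch, assuming every segment lies on the same chromosome and is given as a (start, end) pair in megabases; the figures are invented and the function name is hypothetical.

```python
def common_overlap(segments):
    """Region common to every segment, or None if they do not all overlap.

    segments is a non-empty list of (start, end) pairs on one chromosome.
    """
    if not segments:
        raise ValueError("at least one segment is required")
    start = max(seg_start for seg_start, _ in segments)
    end = min(seg_end for _, seg_end in segments)
    return (start, end) if start < end else None


# Invented segments, in megabases, added to the group one at a time.
group = []
for segment in [(10, 40), (25, 60), (30, 35), (50, 70)]:
    group.append(segment)
    print(len(group), "segments -> common overlap", common_overlap(group))
```

In this run the common region narrows from (10, 40) to (25, 40) to (30, 35), and the last segment does not overlap the others at all, so it would not belong to this triangulation group.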
The triangulated matches that I share with a
new match are
usually the second
clue to identifying through which of my most distant
known ancestors I am related to the new match.
Some of the
comparison websites allow one to identify the triangulated matches
of two individuals
in some form
or another.
Family Finder
One had to be a little devious to find triangulated matches
directly at
FTDNA. The methodology may change when the current website changes are
Your family tree is currently updating to a new
version. • Your DNA Matches
may be missing their
family tree links because their family tree has not yet been updated to
the new version. Once a match’s family tree is successfully updated,
this function will become available. All customer trees should be
updated to the new version by November 30th.
The Linked Relationship feature was
designed to identify matches who
triangulate with known third cousins or closer, and then dump them all
together again in paternal and maternal buckets.
If the new family tree system is similar to the old, then you may
want to have two Family Finder
kits, e.g. a kit based on swabs sent to FTDNA with a full pedigree
chart for paternal/maternal phasing; and a kit based on an autosomal
transfer from another laboratory
with a minimal pedigree chart for
identifying triangulated
matches (e.g. my B95575).
AncestryDNA refuses to provide any way of identifying triangulated
matches.
MyHeritage shared match lists include a
symbol identifying
which of these matches are triangulated.
There appears to be no way to filter the list of shared
matches to show only the triangulated matches.
To use the triangulation tools, you are expected to subscribe
to Tier 1 (USD10 for one month).
If you log in using one browser tab, then you
can open
these Tier 1 tools in another tab:
- the Multi Kit Analysis menu allows you
to search for triangulations involving Kit 1 and any two
or more of Kits 2,3,4,...
- the Triangulation
tool allows you to find all triangulation groups for the selected kit
at the selected thresholds
- the Segment Search tool allows more
flexible manual investigation of phasing and triangulation
and surname projects
- Y-DNA projects can be
- Once you have your initial Y-DNA
results (or a known male-line relative's Y-DNA results), you can join
appropriate haplogroup projects.
- Some older project member and project administrator
features have been
disabled because of numerous changes prompted by GDPR fears:
- You must Opt in to Sharing on the PROJECT PREFERENCES page or your
pseudonymized DNA results and ancestor information will be missing from
the public results pages.
- You can also choose from that page whether to give each
administrator Minimum, Limited or Advanced access to your kit; reducing
access to Minimum pretty much eliminates all the benefits of project
membership.
- It is also recommended that you set Y-DNA Match Levels to
All Levels on
- If there is no surname project for your surname and you are
happy to
deal with the spam risk, then you can apply to
set up your own project by following a simple five-step application process (which
actually consists of only four steps!).
- Every project has an activity feed for discussions between
members and administrators, which can be used by administrators to
avoid having to answer the same frequently asked questions repeatedly
via individual e-mails.
- Project administrators have valuable tools, including:
- a subgroup editor to arrange members
on the Y-DNA results pages
- subgroups are sorted alphabetically on the results
pages, so bear this in mind when choosing names
- criteria for grouping can include:
- surname
- geography
- haplotree position, whether
- confirmed by FTDNA
- predicted by FTDNA
- predicted by project administrator
- desire to see STR differences highlighted
- Many of the colours available for distinguishing
subgroups don't work
very well, or at least my eyesight isn't good enough to use them, as
the background colours are too close to the text colour.
- Subgroup Names (which are visible on the results pages)
appear to be
truncated at 161 characters, without warning. So keep these names as
short as possible with no unnecessary spacing or punctuation.
- Subgroup Descriptions (which are visible to the project
administrator(s) only) appear to be truncated at 973 characters,
without warning, and despite the false assurance of scroll bars in the
- a Y-DNA genetic distance calculator:
- this has greater thresholds than those used for matching:
7/37 instead of
4/37; 25/67 instead of 7/67; and 40/111 instead of 10/111
- examples: R-M222
for a man with one Y-DNA37
match with no SNP test; R-FGC29367
for a man with no
Y-DNA111 match.
- a public website editor to publish information under any
or all of the following headings:
- Background
- Goals
- News
- Updates
- Bulletin
- Results
- Code of Conduct
- Project members can be recruited in many ways:
- FTDNA will send an e-mail on behalf of an
administrator, no
more than once every six months, to all customers with the relevant
surname who have opted to receive such e-mails.
- Administrators can see project members' matches and can
e-mail them directly to invite them to join.
- A clan or surname organisation or one-name-study is ideally positioned to run
online and offline recruitment drives.
- See here for all the technical details
of how and why to
upload your DNA data and pedigree charts to the various websites.
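As a footnote to the Y-DNA genetic distance calculator mentioned above, the two sets of thresholds (4/37, 7/67 and 10/111 for matching; 7/37, 25/67 and 40/111 for the administrators' calculator) can be compared on a toy example. The distance used here is a simple step-wise count, the sum of the absolute differences at each marker; this is an assumption made for illustration, and FTDNA's own calculation may treat some markers differently. The marker values are invented.

```python
# Thresholds quoted in the text above, keyed by number of markers tested.
MATCHING_THRESHOLDS = {37: 4, 67: 7, 111: 10}
CALCULATOR_THRESHOLDS = {37: 7, 67: 25, 111: 40}


def genetic_distance(haplotype_a, haplotype_b):
    """Step-wise distance: sum of absolute repeat-count differences.

    A simplification; the real calculation may differ for some markers.
    """
    if len(haplotype_a) != len(haplotype_b):
        raise ValueError("haplotypes must have the same number of markers")
    return sum(abs(a - b) for a, b in zip(haplotype_a, haplotype_b))


def within_threshold(haplotype_a, haplotype_b, thresholds):
    """True if the distance is within the threshold for this marker count."""
    level = len(haplotype_a)
    if level not in thresholds:
        raise ValueError(f"no threshold defined for {level} markers")
    return genetic_distance(haplotype_a, haplotype_b) <= thresholds[level]


# Two invented 37-marker haplotypes that differ at four markers.
man_1 = [13] * 37
man_2 = list(man_1)
man_2[0], man_2[5], man_2[10], man_2[20] = 14, 11, 15, 14

print("distance:", genetic_distance(man_1, man_2))
print("on match list:", within_threshold(man_1, man_2, MATCHING_THRESHOLDS))
print("in calculator:", within_threshold(man_1, man_2, CALCULATOR_THRESHOLDS))
```

With these invented values the distance is 6, which is too large for the 4/37 matching threshold but within the calculator's 7/37.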