Genetic Genealogy
6 p.m. Tuesday 12 November 2019 and Tuesday 19
November 2019
Room 5052, Arts Building, TCD
***** NB: FamilyTreeDNA kits will be
available after this talk for anyone interested via the DNA
Outreach IRL project *****
Outline
Review of beginners' session
 | male offspring | female offspring |
sperm | Y chromosome, 22 paternal autosomal chromosomes | X chromosome, 22 paternal autosomal chromosomes |
egg | X chromosome, 22 maternal autosomal chromosomes, mitochondria | X chromosome, 22 maternal autosomal chromosomes, mitochondria |
DNA component | Inheritance path | Inherited by |
Y chromosome | From father only (and only if male) | males only |
autosomal chromosomes (autosomes) | Equally from both parents | everyone |
X chromosome(s) | Unequally from both parents | males x1, females x2 |
mitochondrial DNA | From mother only | everyone |
- Without recombination and mutations, all of us would have
identical DNA.
- Autosomal DNA is the cheapest component to analyse and has
rapidly become the most widely used in genealogy:
- cousin matching using autosomal DNA will identify
relationships
out to third cousin on all sides of your family, and may identify more
distant relationships;
- the technology cannot separate the paternal and maternal
autosomes;
- identical twins, parent/child relationships and most
full-sibling relationships can be identified unambiguously;
- for more distant relationships, probabilities can be assigned to
the various possibilities.
- Y-DNA has been widely used for one name studies or surname
projects for
much longer, but has also seen rapid recent scientific advances:
- cousin matching using Y-DNA will identify relationships
with
men of the same surname and same genetic origin, but may identify
surname/DNA switches and/or more distant
relationships, predating the era of surnames (1014+);
- many surnames have multiple genetic origins, including
occupational surnames (Smith, Miller, Potter, etc.).
- Targeted mitochondrial DNA and X-DNA comparisons can be
used to solve
more specialised
problems.
- The DNA companies will turn your spit or swabs into a data
file, which can be compared with data files from the DNA of other
individuals:
- Our objectives are:
- to identify the most recent common ancestor or ancestral
couple shared with each DNA match,
- starting with the closest matches and those with shared
surnames and/or shared locations,
- thereby confirming that the DNA match is definitely a
documented cousin or
closer relative,
- enabling either or both cousins to learn more about their
shared ancestors, and
- confirming or refuting (NPE) the archival and oral
evidence about each cousin's ancestry.
Identity v. Anonymity
- There is a trade-off between:
- increasing your chances of finding
long-lost cousins and ancestors (and being found by long-lost cousins);
and
- maintaining the privacy of your family history research
and DNA results.
- If you keep your DNA
results or known family tree private, then nobody will be able to find
you and you will not
be able to find any DNA matches.
- If you want to be found, then you must
let your potential cousins see your DNA results and your known
ancestors.
- If you give your matches no information, then they cannot
help you.
- Some customers of the DNA companies appear to wish to
maintain a
certain degree of privacy and anonymity.
- Others find it paradoxical that
those trying to identify their anonymous ancestors can be so concerned
about anonymising their own identity.
- FamilyTreeDNA.com (for all customers) and GEDmatch.com (for
customers who opt in) explicitly allow familial
comparisons of DNA recovered from a crime scene to identify a
perpetrator of a violent crime against another individual.
- Using the MyHeritage DNA Services for law
enforcement purposes ... is currently "strictly prohibited".
The basic rules for successful use of the DNA websites include the
following:
- Reveal the DNA subject's birth surname:
- Most people inherit DNA with
their birth surname, so identify yourself as a minimum by
your birth surname with an initial or a title, e.g., P Waldron or Mr
Waldron or Miss Durkan.
- Reveal the gender of the person who provided the DNA sample:
- A woman does not have a Y chromosome, so may ask a male
relative with the relevant surname to swab:
a father, brother, nephew, cousin, etc.,
if her interest is in her maiden
surname; or
a husband, son, brother-in-law,
father-in-law, etc., if her interest is in her married surname.
Valuable additional
inferences can potentially be drawn once it is known whether two X
chromosomes (female) or one X chromosome and one Y chromosome (male)
are potentially available for comparison.
You must NOT attach
a
female name to a male DNA sample (or vice versa), as
this causes untold confusion.
Be especially careful not to inadvertently link a male's Y-DNA results
with a female's autosomal DNA upload at FamilyTreeDNA.com where
error-checking does not look for this.
Also take care not to link a male DNA sample to a female's pedigree
chart (or vice versa).
- Avoid providing irrelevant information:
- Your first name, married
surname, adopted
surname or marital status reveal nothing about your DNA, so you may
keep these private if you wish.
- Avoid pseudonyms:
- They reduce the chances that your matches will bother to
look at your family
tree, contact you or share the information about your ancestry that
they have and that you do not have.
- Use a photograph:
- If you upload a photograph to your AncestryDNA account
before you receive your initial results, then the photograph
(hyperlinked to the match details) will appear on the AncestryDNA Insights page of all
your matches as long as you remain in their eight newest matches with
photographs.
- Be consistent and avoid unnecessary confusion:
- A real example (further anonymised):
- Ancestry username: tara1234
- AncestryDNA samples from mother and daughter (per
e-mail
exchange)
- linked to pedigree charts of an aunt and niece
- appear to matches as M.R. (managed by tara1234) and
D.C. (managed by tara1234)
- neither set of initials is real
- the daughter is an AncestryDNA match to her mother's
probable 4th cousin, but the mother is not (false negative? fuzzy
boundaries?)
- only one of the two kits is at GEDmatch
- GEDmatch alias and e-mail address both begin with Molly
- Molly is the dog's name
- it took me 300 days after the upload to GEDmatch to
associate the AncestryDNA and GEDmatch identities
- Keep all your DNA-related correspondence in a single
searchable e-mail archive
- Use the internal messaging system and
AncestryDNA/MyHeritage/23andMe or Facebook messages only to exchange
e-mail addresses.
Fishing in all the gene pools
There is a growing number of DNA comparison websites, and
those interested in finding long-lost relatives should be in all of
them. While helping an adoptee who is married to a Murphy, I coined
what I have called Murphy's Law of Genetic Genealogy:
If there are N DNA comparison websites and your DNA
is in N-1 of them, then your most important match will be in the Nth.
In the words of another widely used metaphor, there
are many
online gene pools out there and there are many people who are in only
one or two of them; for maximum effect, particularly if you are trying
to find an unknown ancestor who has left no paper trail, you must fish
in all of these
pools.
You must spit for the websites which do not allow data uploads:
You must download your data file from the website of whichever
laboratory you use and upload it to the websites which
do allow data uploads:
You must link your DNA
match list and
your pedigree chart
and share them on the major autosomal DNA comparison websites:
- Add DNA information to your genealogy database:
- Record the ancestors and cousins confirmed by your DNA
in your genealogy
database.
- Use an event field or note tag in your database to
track
people who are in both your own database and the DNA databases.
- Add genealogy information to the online DNA databases:
- Export a GEDCOM file containing at least the ancestors
of each DNA subject and upload it to all the DNA websites so that
matches can see a pedigree chart.
- Examples of pedigree charts: from Ancestral
Quest, AncestryDNA, ancestry.com,
FamilyTreeDNA
and GEDmatch.
- For
FamilyTreeDNA.com, include in the GEDCOM file any third cousins or
closer already at FTDNA and the shared ancestors; FTDNA will use these
linked relationships to assign other matches to the DNA subject's
paternal and
maternal sides (example).
- Mark deceased ancestors as such, even if you do not
know the date of death, otherwise they may be deemed living,
privatised, and hidden from DNA matches who are also descended from
them.
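As a rough illustration of the last point, the sketch below writes a minimal GEDCOM individual record in which a death is asserted without a date. It assumes the GEDCOM 5.5.1 convention that `1 DEAT Y` means "the death occurred, date unknown". The helper name `individual_record` and the sample forename are hypothetical, and whether each DNA website honours this flag is an assumption to check against your own upload.

```python
def individual_record(xref, name, sex, deceased=False, death_date=None):
    """Return the GEDCOM lines for one individual (hypothetical helper)."""
    lines = [f"0 @{xref}@ INDI", f"1 NAME {name}", f"1 SEX {sex}"]
    if death_date:
        # Death date known: record the event with its date.
        lines += ["1 DEAT", f"2 DATE {death_date}"]
    elif deceased:
        # Death date unknown: 'DEAT Y' asserts only that the death occurred.
        lines.append("1 DEAT Y")
    return lines


# Illustrative ancestor with an unknown date of death.
record = individual_record("I1", "John /Waldron/", "M", deceased=True)
gedcom_text = "\n".join(record)
```

Most genealogy programs write this line themselves when an ancestor is flagged as deceased, so the sketch is mainly useful for checking what an exported file contains.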
GEDmatch.com tools
These tools are free to all users:
- User Registration
- Generic Uploads (23andme, FTDNA, AncestryDNA, most others)
- Upload GEDCOM (Fast)
- 'One-to-many' matches
- 'One-to-one' compare
- People who match one or both of 2 kits
- Are your parents related? (e.g. T409076)
If you log in in one browser tab, then you can open
these Tier 1 tools (USD10/month) in another tab:
- the Multi Kit Analysis menu allows you
to:
- generate an autosomal DNA comparison matrix
- search for triangulations involving Kit 1 and any two
or more of Kits 2,3,4,...
- the Triangulation
tool allows you to find all triangulation groups for the selected kit
at the selected thresholds
- the Segment Search tool allows more
flexible manual investigation of phasing and triangulation
- the Lazarus tool attempts to resurrect
the DNA of a deceased ancestor
FamilyTreeDNA.com tools
AncestryDNA tools
- Every new kit must be associated with a different e-mail
address.
- Test Settings
- Tree Link
- Download Raw DNA Data
- Sharing Preferences (+ Add a person)
- Match list
- Sort by "Relationship" (not as exact as appears) or
"Date" (actual match date invisible!)
- Groups
- Shows the number of matches in each Group
- Suggestions for using the 25
custom groups (gold star and 24 coloured dots)
- e.g. starred matches = known relatives
- Filters
- Does not show the number of matches in each Filter
- Common ancestors (speculative hints, to be treated
with
caution)
- Search Matches
- by Match name or by Surname in matches'
trees or by Birth location in matches' trees, if
the location is available on the dropdown
- search results appear incomplete for new kits
- Match page
MyHeritage.com tools
Other third party tools and websites
- Autosomal tools and websites
- Y-DNA tools and websites
Using autosomal DNA shared matches, triangulation and phasing
Shared, or In Common With (ICW), matches
A group of three or more individuals who all meet the relevant
matching
criteria with each other are likely
to share a recent common ancestor (or,
more often, ancestral couple).
When I find a new match, I am usually anxious to identify
the most distant known ancestor through whom I am related to the
new match.
The matches that I share with the new match are usually the first clue to
solving this puzzle.
All of the DNA comparison websites allow one to identify the shared
matches of two individuals in some form or another.
The matching criteria vary from one DNA comparison website to
another.
The stricter the matching criteria, the more significant the
shared matches.
FTDNA Family Finder
To find the shared matches of two individuals who match each
other:
- go to the match list of one of the
individuals
- tick the box opposite the other individual
- click the In Common With button on the 5th line from the
top of the
window
- matches are sorted by closeness of relationship to the
logged-in individual
- A can see that B matches C but cannot see the cM shared by
B and C.
- A's shared matches with B will be the same as B's shared
matches with A, but in a different order.
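The last two points can be sketched in a few lines: the shared matches are the intersection of the two match lists, but each viewer sees them sorted by his or her own closeness to each match. The sketch below uses shared cM as a stand-in for "closeness of relationship", which is an assumption, and all names and figures are illustrative.

```python
def shared_matches(match_lists, viewer, other):
    """Matches common to viewer and other, closest to the viewer first."""
    common = set(match_lists[viewer]) & set(match_lists[other])
    common -= {viewer, other}
    # Sort by the cM each match shares with the logged-in individual.
    return sorted(common, key=lambda m: match_lists[viewer][m], reverse=True)


# match_lists[x][y] = cM that x shares with y (illustrative figures).
match_lists = {
    "A": {"B": 250.0, "C": 90.0, "D": 40.0},
    "B": {"A": 250.0, "C": 35.0, "D": 120.0},
}

seen_by_a = shared_matches(match_lists, "A", "B")  # ['C', 'D']
seen_by_b = shared_matches(match_lists, "B", "A")  # ['D', 'C']
```

Both lists contain the same people, but C comes first for A and D comes first for B.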
You will eventually identify a group of individuals, all of
whom you
suspect descend from a single common ancestor (or ancestral couple).
To see whether up to 10 individuals who match you also match
each other:
- Add them to the Selected Matches box on the Family Finder - Matrix page
- To find the desired surname in the Matches box, click on
any match and start typing the surname
- After finding the surname, ctrl-click on the desired
individuals
- Click the Add>> button to move all the
selected individuals to the Selected Matches box
To find the shared matches and shared cM of two or more individuals who
belong to the same project, whether or not they match each other:
AncestryDNA
On each match page, there is a Shared Matches link. (Screenshot.)
The Shared Matches are those with Shared DNA of 20 cM or more
with both individuals.
So C can appear in the shared matches of A and B even if B
does not appear in the shared matches of A and C (if C shares more than
20cM with A but B shares less
than 20cM with A).
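The asymmetry described above follows directly from the 20 cM rule and can be checked with a small sketch. The data layout and the helper names `cm` and `shared_matches` are hypothetical, and the cM figures are invented for illustration.

```python
THRESHOLD_CM = 20.0  # Shared Matches threshold quoted above


def cm(shared, x, y):
    """cM shared by x and y (0.0 if they do not match at all)."""
    return shared.get(frozenset((x, y)), 0.0)


def shared_matches(shared, people, a, b):
    """People who share at least THRESHOLD_CM with both a and b."""
    return {
        p for p in people
        if p not in (a, b)
        and cm(shared, p, a) >= THRESHOLD_CM
        and cm(shared, p, b) >= THRESHOLD_CM
    }


people = {"A", "B", "C"}
shared = {
    frozenset(("A", "B")): 12.0,  # A and B match, but below 20 cM
    frozenset(("A", "C")): 45.0,
    frozenset(("B", "C")): 60.0,
}

c_in_ab = "C" in shared_matches(shared, people, "A", "B")  # True
b_in_ac = "B" in shared_matches(shared, people, "A", "C")  # False
```

C passes the threshold with both A and B, so C is listed when A views B; B fails the threshold with A, so B is not listed when A views C.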
Matches are sorted by closeness of relationship to the
logged-in individual.
A can see that B matches C but cannot see the cM shared by B
and C.
MyHeritage
When the Review DNA Match page eventually
completes loading, the Shared DNA Matches section:
- reveals that "you share the following 1,532 DNA Matches"
- lists the top 10 shared matches
- allows further shared matches to be slowly loaded, 10 at a
time
- may demand money, depending on when you uploaded or what
subscriptions you have purchased
Matches are sorted by the sum
of the centiMorgans shared
with the two individuals.
A can see not only that B matches C but also the cM shared by B and C.
A's shared matches with B will be exactly the same as B's shared
matches with A, in the same order.
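Sorting by the sum of the shared centiMorgans explains why the order is the same from both sides: the sum does not depend on which of the two individuals is viewing. A minimal sketch, with invented figures and with alphabetical tie-breaking added as an assumption:

```python
def sort_shared(cm_with_first, cm_with_second):
    """Shared matches sorted by total cM shared with the two individuals."""
    common = cm_with_first.keys() & cm_with_second.keys()
    # The sum is symmetric, so swapping the arguments cannot change the order.
    return sorted(
        common,
        key=lambda m: (-(cm_with_first[m] + cm_with_second[m]), m),
    )


cm_with_a = {"C": 90.0, "D": 40.0, "E": 15.0}
cm_with_b = {"C": 35.0, "D": 120.0}

order_for_a = sort_shared(cm_with_a, cm_with_b)  # ['D', 'C']
order_for_b = sort_shared(cm_with_b, cm_with_a)  # ['D', 'C']
```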
GEDmatch
To find the shared matches of two individuals, whether or not
they match each
other, use the "People who match both, or 1 of 2 kits" tool on the many
menu.
This lists shared
matches of Kit 1 and Kit 2, no matter how much DNA or how little DNA
Kit 1 shares with
Kit 2.
Matches are sorted by closeness of relationship to Kit 1.
To sort by closeness of relationship to Kit 2, re-use the tool with the
kits in reverse order.
If you log in in one browser tab, and open
the Multi Kit Analysis menu in another
tab (via a hyperlink or bookmark), then you can run an Autosomal Matrix
Comparison on up to 100 kits.
While the FTDNA user matrix shows only whether or not kits match, the
FTDNA administrator matrix and the GEDmatch matrix show the shared
centiMorgans.
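The difference between the two kinds of matrix can be illustrated with a short sketch that builds a centiMorgan matrix from pairwise totals and then reduces it to a yes/no matrix. The kit labels, figures and function names are hypothetical; this is not how either website computes its matrix, only a picture of what each one displays.

```python
def autosomal_matrix(kits, shared_cm):
    """Square matrix of shared cM; the diagonal (kit v. itself) is None."""
    return [
        [None if row == col else shared_cm.get(frozenset((row, col)), 0.0)
         for col in kits]
        for row in kits
    ]


def match_only(matrix):
    """Reduce a cM matrix to match / no-match, as in a user-level matrix."""
    return [[None if cell is None else cell > 0 for cell in row]
            for row in matrix]


kits = ["K1", "K2", "K3"]
shared_cm = {frozenset(("K1", "K2")): 85.2, frozenset(("K2", "K3")): 31.7}

cm_matrix = autosomal_matrix(kits, shared_cm)
yes_no_matrix = match_only(cm_matrix)
```

Here K1 and K3 both match K2 but not each other, which shows up as 0.0 in the cM matrix and as False in the yes/no matrix.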
Triangulated matches
Triangulation and phasing are
really opposite sides of the same coin. If V is
half-identical on the same region with W and Z, then there are two
possibilities:
- W and Z are half-identical to each other on this region, in
which case V, W and Z probably inherited an identical segment in this
region from a single common ancestor and the relationship can be
described as triangulated;
or
- W and Z are not
half-identical to each other on this region, in which case V is
probably
related to W on V's paternal side and V is probably related to Z on V's
maternal side, or vice
versa, and V's autosomal DNA in this region can be phased.
The ADSA tool by Don Worth at DNAGedcom
provides a graphical
representation of triangulation and phasing.
Triangulation groups
The ultimate objective is to collect DNA matches into triangulation groups.
A triangulation group is a set of three or more people who are all
half-identical to each of the
other group members on overlapping regions.
The more individuals who are added to the triangulation group,
the smaller the overlap may become.
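The shrinking overlap is easy to see numerically. The sketch below takes the segments that one group member shares with each of the others on a single chromosome, as (start, end) pairs in megabases, and returns the region common to all of them. The figures are invented, and a full triangulation check would also require every other pair of members to be half-identical on that region, which this sketch does not test.

```python
def common_overlap(segments):
    """Region common to all (start, end) segments, or None if there is none."""
    start = max(s for s, _ in segments)
    end = min(e for _, e in segments)
    return (start, end) if start < end else None


# Segments shared with two other group members.
three_people = common_overlap([(10.0, 40.0), (20.0, 50.0)])

# Adding a further member can only keep or shrink the common region.
four_people = common_overlap([(10.0, 40.0), (20.0, 50.0), (25.0, 35.0)])
```

With these figures the common region falls from 20 Mb (20.0 to 40.0) to 10 Mb (25.0 to 35.0) when the extra member is added.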
The members of a triangulation group of three or more individuals are very
likely to share a recent common ancestor (or,
more often, ancestral couple).
The triangulated matches that I share with a new match are
usually the second clue to identifying through which of my most distant
known ancestors I am related to the new match.
Some of the DNA comparison websites allow one to identify the
triangulated matches of two individuals in some form or another.
FTDNA Family Finder
One had to be a little devious to find triangulated matches
directly at
FTDNA. The methodology may change when the current website changes are
completed:
Your family tree is currently updating to a new
version. • Your DNA Matches
may be missing their
family tree links because their family tree has not yet been updated to
the new version. Once a match’s family tree is successfully updated,
this function will become available. All customer trees should be
updated to the new version by November 30th.
The Linked Relationship feature was
designed to identify matches who
triangulate with known third cousins or closer, and then dump them all
together again in paternal and maternal buckets.
If the new family tree system is similar to the old, then you may
want to have two Family Finder
kits, e.g. a kit based on swabs sent to FTDNA with a full pedigree
chart for paternal/maternal phasing; and a kit based on an autosomal
transfer from another laboratory
with a minimal pedigree chart for
identifying triangulated
matches (e.g. my B95575).
AncestryDNA
AncestryDNA refuses to provide any way of identifying triangulated
matches.
MyHeritage
MyHeritage shared match lists include a
symbol identifying
which of these matches are triangulated.
There appears to be no way to filter the list of shared
matches to show only the triangulated matches.
GEDmatch
To use the triangulation tools, you are expected to subscribe
to Tier 1 (USD10 for one month).
If you log in in one browser tab, then you can open
these Tier 1 tools in another tab:
- the Multi Kit Analysis menu allows you
to search for triangulations involving Kit 1 and any two
or more of Kits 2,3,4,...
- the Triangulation
tool allows you to find all triangulation groups for the selected kit
at the selected thresholds
- the Segment Search tool allows more
flexible manual investigation of phasing and triangulation
Y-DNA and surname projects
- Y-DNA projects can be
- Once you have your initial Y-DNA
results (or a known male-line relative's Y-DNA results), you can join
appropriate haplogroup projects.
- Some older project member and project administrator
features have been
disabled because of numerous changes prompted by GDPR fears:
- You must Opt in to Sharing on the PROJECT PREFERENCES page or your
pseudonymized DNA results and ancestor information will be missing from
the public results pages.
- You can also choose from that page whether to give each
project
administrator Minimum, Limited or Advanced access to your kit; reducing
access to Minimum pretty much eliminates all the benefits of project
membership.
- It is also recommended that you set Y-DNA Match Levels to
All Levels on
the PRIVACY & SHARING page.
- If there is no surname project for your surname and you are
happy to
deal with the spam risk, then you can apply to
set up your own project by following a simple five-step application process (which
actually consists of only four steps!).
- Every project has an activity feed for discussions
between
members and administrators, which can be used by administrators to
avoid having to answer the same frequently asked questions repeatedly
via individual e-mails.
- Project administrators have valuable tools, including:
- a subgroup editor to arrange members
on the Y-DNA results
pages
- subgroups are sorted alphabetically on the results
pages, so bear this in mind when choosing names
- criteria for grouping can include:
- surname
- geography
- haplotree position, whether
- confirmed by FTDNA
- predicted by FTDNA
- predicted by project administrator
- desire to see STR differences highlighted
- Many of the colours available for distinguishing
subgroups don't work
very well, or at least my eyesight isn't good enough to use them, as
the background colours are too close to the text colour.
- Subgroup Names (which are visible on the results pages)
appear to be
truncated at 161 characters, without warning. So keep these names as
short as possible with no unnecessary spacing or punctuation.
- Subgroup Descriptions (which are visible to the project
administrator(s) only) appear to be truncated at 973 characters,
without warning, and despite the false assurance of scroll bars in the
editor.
- a Y-DNA genetic distance calculator:
- this has greater thresholds than the matching
algorithm:
7/37 instead of
4/37; 25/67 instead of 7/67 and 40/111 instead of 10/111
- examples: R-M222
for a man with one Y-DNA37
match with no SNP test; R-FGC29367
for a man with no
Y-DNA111 match.
- a public website editor to publish information under any
or all of the following headings:
- Background
- Goals
- News
- Updates
- Bulletin
- Results
- Code of Conduct
- FAQ
- Project members can be recruited in many ways:
- FTDNA will send an e-mail on behalf of an
administrator, no
more than once every six months, to all customers with the relevant
surname who have opted to receive such e-mails.
- Administrators can see project members' matches and can
e-mail them directly to invite them to join.
- A clan or surname organisation or one-name-study is ideally positioned to run
online and offline recruitment drives.
- See here for all the technical details
of how and why to
upload your DNA data and pedigree charts to the various websites.
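The thresholds quoted above for the administrators' Y-DNA genetic distance calculator (7/37, 25/67 and 40/111, against 4/37, 7/67 and 10/111 for the matching algorithm) can be compared with a short sketch. The figures come from the list above; treating each threshold as inclusive, and the function name `visibility`, are assumptions made for illustration.

```python
# Maximum genetic distance at each marker level, as quoted above.
MATCH_THRESHOLDS = {37: 4, 67: 7, 111: 10}
CALCULATOR_THRESHOLDS = {37: 7, 67: 25, 111: 40}


def visibility(markers, genetic_distance):
    """Where a pair at this genetic distance would be expected to show up."""
    if genetic_distance <= MATCH_THRESHOLDS[markers]:
        return "match list"
    if genetic_distance <= CALCULATOR_THRESHOLDS[markers]:
        return "administrator calculator only"
    return "beyond both thresholds"


at_37 = visibility(37, 6)     # 'administrator calculator only'
at_67 = visibility(67, 7)     # 'match list'
at_111 = visibility(111, 41)  # 'beyond both thresholds'
```

A pair at 6/37, for example, would not appear in either man's match list but could still be found by a project administrator.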