Finding Long-lost Cousins Using DNA
4:30 p.m. Wednesday 22 March 2023
WWW version:
Recording:
TBA
Warning
No eating, drinking, smoking, chewing gum or brushing teeth for
one hour before swabbing or spitting, in order to ensure an
uncontaminated DNA sample.
Outline
Components of DNA
The basic science
- What is DNA?
- short for deoxyribonucleic acid
- contained in every cell in the body
- packaged in chromosomes and mitochondria,
each containing long strings of four nucleotides named
adenine (A), cytosine (C), guanine (G) and thymine (T)
- represented by strings of the letters A, C, G and T
- DNA labs turn saliva or cheek swabs into data files of As,
Cs, Gs and Ts:
RSID CHROMOSOME POSITION RESULT
rs4477212 1 72017 AA
rs3094315 1 742429 --
rs3131972 1 742584 GG
rs12562034 1 758311 --
rs12124819 1 766409 --
rs11240777 1 788822 AG
rs6681049 1 789870 --
rs4970383 1 828418 CC
rs4475691 1 836671 CC
rs7537756 1 844113 AA
- these data files can be copied to other DNA
comparison websites
- Where does our DNA come from?
- DNA is inherited by every child from its parents,
- largely deterministically, but with
- enough random variation over hundreds of thousands of
years to result in wide variations in the DNA signatures
observed today.
- When a sperm fertilises an egg, each brings DNA, which is
replicated in every cell of the resulting person.
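 | male offspring | female offspring |
sperm | Y chromosome; paternal autosomal chromosomes 1-22 | X chromosome; paternal autosomal chromosomes 1-22 |
egg | X chromosome; maternal autosomal chromosomes 1-22; mitochondria | X chromosome; maternal autosomal chromosomes 1-22; mitochondria |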
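The raw data file format shown earlier (RSID, CHROMOSOME, POSITION, RESULT) is easy to read programmatically. The following is a minimal sketch, assuming whitespace-separated columns and assuming that "--" marks a location the lab could not read (a "no-call"); real exports differ between companies, and the function names here are hypothetical.

```python
from collections import Counter

# The sample rows shown above, as they might appear in an exported file.
SAMPLE = """\
RSID CHROMOSOME POSITION RESULT
rs4477212 1 72017 AA
rs3094315 1 742429 --
rs3131972 1 742584 GG
rs12562034 1 758311 --
rs12124819 1 766409 --
rs11240777 1 788822 AG
rs6681049 1 789870 --
rs4970383 1 828418 CC
rs4475691 1 836671 CC
rs7537756 1 844113 AA
"""


def parse_raw_dna(text):
    """Return a list of (rsid, chromosome, position, result) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comment lines and the header row.
        if len(fields) != 4 or line.startswith("#") or fields[0] == "RSID":
            continue
        rsid, chromosome, position, result = fields
        rows.append((rsid, chromosome, int(position), result))
    return rows


def summarise(rows):
    """Count no-calls, homozygous results (AA) and heterozygous results (AG)."""
    counts = Counter()
    for _rsid, _chromosome, _position, result in rows:
        if result == "--":  # assumption: "--" means no result at this location
            counts["no-call"] += 1
        elif len(set(result)) == 1:
            counts["homozygous"] += 1
        else:
            counts["heterozygous"] += 1
    return counts


rows = parse_raw_dna(SAMPLE)
print(len(rows), dict(summarise(rows)))
```

A real file contains hundreds of thousands of such rows, so the same loop would normally read from the downloaded file rather than from a string.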
Inheritance paths
See pedigree chart.
- Sex chromosomes
- Everyone has two sex chromosomes: males XY, females XX.
- Y chromosome
- Only males have a Y chromosome.
Y-DNA comes down the patrilineal line - from father, father's
father, father's father's father, etc.
This is the same inheritance path as followed by surnames,
grants of arms, peerages, etc., so Y-DNA is used for surname
projects.
- X chromosome
- Males have one X chromosome, females have two.
X-DNA may come through any ancestral path that does not contain
two consecutive males.
Blaine Bettinger's colour-coded blank fan-style pedigree
charts show the ancestors from whom men and women can potentially inherit X-DNA.
- Autosomes
- Short for autosomal chromosomes
- Exactly 50% of autosomal DNA (atDNA) comes from the father and
exactly 50% comes from the mother.
Due to recombination, on
average 25% comes from each grandparent, on average 12.5% comes from
each great-grandparent, and so on.
By the time we get to 256 GGGGGGgrandparents, there will be some
people who are genealogical ancestors but not genetic ancestors.
- In extreme cases, an individual can inherit up to 35% from one
paternal grandparent and, hence, as little as 15% from the other
paternal grandparent.
Siblings each inherit 50% of their parents' autosomal DNA, but
not the same 50% (except for identical twins).
Similarly, siblings each inherit 50% of their mother's X-DNA,
but not the same 50% (except for identical twins).
Sisters each inherit 100% of their father's X-DNA.
Hence, autosomal DNA is used to produce estimated ethnicity
percentages.
- Mitochondria
- Everyone has mitochondrial DNA (mtDNA).
- Mitochondrial DNA comes down the matrilineal line - from
mother, mother's mother, mother's mother's mother, etc.
The surname typically changes with every generation in this
line.
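The autosomal averages quoted above follow from repeated halving. The sketch below, with hypothetical function names, counts the ancestors in each generation and the average share expected from each; it says nothing about the random variation around that average.

```python
def ancestors_in_generation(n):
    """Number of ancestors n generations back (1 = parents, 2 = grandparents)."""
    return 2 ** n


def average_autosomal_share(n):
    """Average percentage of autosomal DNA from one ancestor n generations back."""
    return 100 / 2 ** n


for n in (1, 2, 3, 8):
    print(n, ancestors_in_generation(n), average_autosomal_share(n))
```

At eight generations back there are 256 ancestors and the average share is under 0.4%, which is why some of them may contribute no detectable autosomal DNA at all.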
The following table summarises these critical distinctions:
DNA component | Inheritance path | Inherited by |
Y chromosome | From father only (and only if male) | males only |
autosomal chromosomes | Equally from both parents | everyone |
X chromosome(s) | Unequally from both parents | males x1, females x2 |
mitochondrial DNA | From mother only | everyone |
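The X-DNA rule (no two consecutive males on the path) can be checked mechanically. This is a minimal sketch, assuming a path is written as a string of sexes starting with the DNA subject, so that "FMF" means a woman, her father, and his mother; the notation and function names are illustrative only.

```python
from itertools import product


def can_carry_x(path):
    """True if X-DNA can travel along the path, i.e. it has no two consecutive males."""
    return "MM" not in path


def x_ancestor_count(subject_sex, generations):
    """Count ancestors a given number of generations back who may contribute X-DNA."""
    return sum(
        can_carry_x(subject_sex + "".join(line))
        for line in product("MF", repeat=generations)
    )


print(can_carry_x("FMF"))  # woman -> father -> father's mother
print(can_carry_x("MM"))   # man -> father
print([x_ancestor_count("M", g) for g in range(1, 5)])
print([x_ancestor_count("F", g) for g in range(1, 5)])
```

The counts grow much more slowly than the full number of ancestors (2, 4, 8, 16), which is what makes targeted X-DNA comparisons useful for narrowing down where a match sits in the pedigree.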
- Autosomal DNA has in recent years become the cheapest
component to analyse and hence the most widely used in
genealogy.
- Y-DNA analysis remains more expensive, but has been widely
used for one name studies or surname projects for much longer,
and has also seen rapid recent scientific advances.
- Targeted mitochondrial DNA and X-DNA comparisons can be used
to solve more specialised problems (e.g. a tale of two X-chromosomes).
Mutations
Mutations are the first type
of random variation in the inheritance process (of all four
components of DNA), and are like transcription errors.
- Some locations mutate very frequently (every couple of
generations), and can be used to identify individuals beyond
reasonable doubt, e.g. in criminal cases.
- Some locations mutate less frequently (only once in many
generations or once in the history of mankind), and can be used
to identify closely or distantly related individuals.
- Most locations never mutate and are the same for all humans
(and for many other primates).
Special types of mutations:
- Single Nucleotide Polymorphism (SNP): a single location where
two (or occasionally more than two) different letters are
observed in different individuals
- for example, a single A in the parent may be replaced by a C
in the child (and in all of the child's future
descendants, until the next mutation at the same location)
- powerful tool for Y-DNA comparisons
- also the basis of autosomal DNA comparisons
- Short Tandem Repeat (STR): a string of letters consisting of
the same short substring repeated several times
- for example, CCTGCCTGCCTGCCTGCCTGCCTGCCTG is CCTG repeated
seven times; it may be repeated fewer or more times in other
individuals
- used in early Y-DNA comparisons
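Counting the repeats in an STR is a simple string exercise. The sketch below, with a hypothetical function name, finds the longest run of back-to-back copies of a short motif; it illustrates the idea and is not how laboratories actually call STR values.

```python
def count_tandem_repeats(sequence, motif):
    """Return the longest run of consecutive copies of motif within sequence."""
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Keep stepping forward one motif at a time while the motif repeats.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


example = "CCTG" * 7  # the example string above
print(count_tandem_repeats(example, "CCTG"))
```

Two men would be compared on the repeat count at the same marker, for instance seven repeats in one man against eight in another.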
Y-DNA Mutations
- The Y chromosome is passed virtually unchanged from father to
son.
- Early Y-DNA analysis looked at the numbers of repeats for each
of 12 STR markers on the Y chromosome.
- Now Surname Projects look at 111 STRs.
- The state-of-the-art Big Y-700 product reports values for 700+
STRs and for hundreds of thousands of SNPs.
- Genetic distance between two men is, roughly speaking, the number
of differences between their STR signatures, e.g. 0/12 or 9/111.
- The smaller the genetic distance between two men, the closer
their expected relationship.
- The correlation between genetic distance and degree of
relationship is not perfect:
- When an STR mutation occurs, the subject of the mutation is
genetic distance 1/37 (say) from his father;
- until the next STR mutation occurs, probably several
generations later, all descendants of the subject of the
mutation are genetic distance 0/37 from their most recent
common ancestor.
- Some SNP mutations on the Y chromosome are
once-in-the-history-of-mankind events and can be used to build a
Y-DNA Haplotree of 50,000+ branches, which
has evolved from the original Simplified
Tree of Y-Chromosome Haplogroups of 24 branches
- The top level haplogroups A-R are about 20,000 years old.
- Surname-specific SNPs are now being
discovered.
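The "number of differences" idea behind genetic distance can be sketched as follows. This is a deliberate simplification: it counts the markers at which two men differ, whereas the testing companies apply more refined counting rules. The repeat values below are made up for illustration.

```python
def genetic_distance(signature_a, signature_b):
    """Return (differences, markers compared) for two STR signatures.

    Each signature maps a marker name to its repeat count. Only markers
    present in both signatures are compared.
    """
    shared = signature_a.keys() & signature_b.keys()
    differences = sum(signature_a[m] != signature_b[m] for m in shared)
    return differences, len(shared)


# Hypothetical repeat counts for four markers.
father = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
son = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11}

differences, compared = genetic_distance(father, son)
print(f"{differences}/{compared}")
```

A single mutation between father and son shows up as a distance of 1 here, which is the situation described above where genetic distance and degree of relationship do not line up neatly.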
Autosomal DNA Mutations
- The DNA companies observe only 0.02% or so of the locations on
the autosomes.
- These are the locations known to have SNP mutations.
Recombination
Recombination is the second type of random variation in the
inheritance process (of autosomal DNA and maternal X-DNA only) and
is how, e.g., the father's paternal and maternal autosomes cross
over to produce the child's paternal autosomes.
- Every sperm and egg is potentially unique.
- Recombination of the paternal and maternal
chromosomes is sometimes compared to shuffling two decks of
playing cards.
- Recombination rates vary markedly along the autosomes and the
X chromosome.
- Genetic length reflects the local recombination rate and is
measured in units called centiMorgans; the genetic length of an
identical long run of letters shared by two individuals helps to
estimate the number of generations back to their common ancestor.
- One recombination per generation is expected in each 100 centiMorgans (cM, not cm).
- The greater the genetic length of the DNA shared by two
individuals, the closer the expected relationship.
- Y-DNA: greater genetic distance between men, more
distant relationship
- atDNA: greater genetic length shared by people,
closer relationship
- The correlation between shared centiMorgans and degree of
relationship is not perfect.
- Shared cM varies from around 3600cM for identical twin and
parent/child relationships to around 8cM for very distant
cousins to 0cM for unrelated individuals.
- See Shared cM Project chart and tool
(try, e.g., 209.5cM) and TheDNAgeek chart
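The rule of one expected recombination per generation in each 100 centiMorgans can be turned into a rough calculation. The sketch below additionally assumes, as a simplification not stated above, that recombinations fall independently (a Poisson model), which gives the chance that a shared segment survives unbroken; the function names are hypothetical.

```python
import math


def expected_recombinations(length_cm, generations=1):
    """Expected recombinations in a segment: one per generation per 100 cM."""
    return length_cm / 100 * generations


def probability_unbroken(length_cm, generations):
    """Chance of no recombination in the segment (assumes a Poisson model)."""
    return math.exp(-expected_recombinations(length_cm, generations))


print(expected_recombinations(100))     # one generation, 100 cM
print(expected_recombinations(50, 4))   # four generations, 50 cM
print(round(probability_unbroken(8, 10), 3))
```

Under these assumptions a short segment can pass intact through many generations, which is one reason why small shared amounts are a poor guide to the exact relationship.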
DNA comparison
- DNA comparison will
- match you with the most closely related individuals already
in the DNA database; and
- estimate a range of possible relationships to those on your
match lists; but
- leave you to establish your precise relationship using
supplementary evidence from traditional sources.
- Caveats
- DNA has let the genie out of the bottle as regards secret
adoptions and fosterings.
- DNA comparison is being used to identify birth parents and
their children who were given up for adoption.
- Those involved in adoptions sometimes have conflicting
rights to information and rights to privacy.
- If you want family secrets to remain family secrets, then
you need to keep your immediate family members out of DNA
comparison databases - including the future children and
grandchildren of any unborn children that you desert
- Treat family secrets and those directly involved as
sensitively and sympathetically as possible.
- Professional help is available.
- Insurance companies would like (or competition may force
them) to use:
- knowledge of your good genes to make your health insurance
cheaper and your pension more expensive; and
- knowledge of your bad genes to make your health insurance
more expensive and your pension cheaper.
- Law enforcement authorities would like to use the same
techniques as are used to solve adoption cases in order to
identify DNA from serious crime scenes and from unidentified
remains.
- Collaborating with law enforcement in some jurisdictions may
result in a distant relative being subjected to the death
penalty.
- Anonymising your DNA will impede your genealogical
discoveries.
Y-DNA and surnames
Y-DNA follows the same inheritance path as is typically followed by
surnames.
In principle, your match list should contain dozens of men with your
exact surname.
In practice, there are many reasons why this may not be the case:
- your surname (or your male line beyond the adoption of your
surname) may not be one of those which have proliferated due to
many men of the surname (or male line) each having several sons;
- your surname (or male line) may be in danger of being
"daughtered out", due to many men of the surname (or male line)
not marrying or fathering only daughters;
- there may be no other man of your surname in the FTDNA
database (e.g. no Geheran in February 2018);
- there may be only a few people of your surname in the FTDNA
database (e.g. there were only 11
Dungans of either gender in February 2018);
- there may have been no concerted effort to recruit men of
your surname to the FTDNA database;
- there may have been a concerted effort to recruit men of
some genetically related surname to the FTDNA database;
- the men of your surname in the FTDNA database may not yet
have ordered any Y-STR product;
- there may have been an above average number of STR mutations
in your male line in recent generations, resulting in few matches
of any surname;
- there may have been a below average number of STR mutations
in your male line since the adoption of surnames, resulting in many
matches with men whose common ancestry predates the use of
surnames;
- there may have been an overt or covert surname/DNA switch
in your male line since the adoption of surnames;
- your surname may have multiple independent genetic origins;
- etc., etc.
A surname/DNA switch is defined as the use of a surname different
from that used by the genetic father, which may be:
- a surname inherited from someone else; or
- a surname translated in some way from the genetic father's
surname.
Surname/DNA switches are just one cause of surnames having multiple
Y-DNA signatures. Many surnames, particularly occupational surnames
and surnames in countries which have had many immigrants speaking
one or more foreign languages, have multiple independent genetic
origins for more mundane reasons.
Conversely, two different surnames can have the same Y-DNA signature
if the common ancestor lived before the adoption of surnames, about
a millennium ago.
Among the myriad (possibly one-off) circumstances causing
surname/DNA switches are:
- adoption (including of foundlings), with the surname
inherited from the adoptive father;
- foundlings given a surname based on where or by whom they
were found;
- infidelity, with the surname inherited from the mother's
husband;
- use of sperm donors, with the surname inherited
from the man who raised the child;
- parents giving alternate children the father's surname and
the mother's surname, with some inheriting the surname from
the mother;
- men using their mother's surname for other reasons;
- men using their stepfather's surname;
- a change of surname associated with inheritance of a family
estate, with the surname inherited from the benefactor;
- men going on the run for all sorts of reasons and changing their
surname to avoid being traced, whether running away from the law,
from political opponents, or from their families, perhaps even
wishing to commit bigamy;
- men using their maternal grandmother's maiden surname (see example below);
- translation of a surname back and forth between languages in
different ways (see Sir Robert Edwin Matheson's Varieties and
synonymes of surnames and Christian names in Ireland: for the
guidance of registration officers and the public in searching the
indexes of births, deaths, and marriages);
- inconsistent standardisation of the spelling of a surname
once the computer age put an end to spelling diversity;
- reverting to an ancient spelling of a surname discovered in the
course of research;
- etc.
For example, Osman Wilfred Kemal's mother died in childbirth in 1909
and he was brought up by his maternal grandmother Margaret Hannah
Brun née Johnson and became known as Wilfred Johnson. Wilfred
Johnson's grandson, known as Boris Johnson, became Prime Minister of
the United Kingdom in July 2019. Prime Minister Johnson has Kemal
Y-DNA, but a surname that was not used by any of his eight
great-grandparents and that descended via a female GGgrandparent.
See here.
FamilyTreeDNA.com hosts three types of DNA projects,
co-administered by people like me:
- geographically-based, e.g.:
- Clare Roots (co-administrator Terry Fitzgerald, WA, USA;
1863 members as of 23 January 2022)
- Kerry Y-DNA Project (administrator John Hallissey, Waterford;
214 members)
- surname-based, e.g.:
- Clancy (administrator Fergus Clancy, Dublin; 128 members)
- Durkin/Durkan/Durcan (administrator Mike Durkin, NY, USA;
35 members)
- Marrinan (administrators Cory Marinan, WI, USA; Cindy Wood, MI,
USA; Greg Marrinan, CT, USA; 99 members)
- McNamara (rescued from worldfamilies.net and GDPR; 153 members)
- O'Dea/O'Day/Dee (administrator James O Dea, Dublin; 146 members)
- haplogroup-based,
groups of men with similar Y-DNA but different surnames and
common male line ancestry pre-dating the surname era, e.g.:
- Group General Fund Contributions can be
used solely to pay for analysis by FTDNA of the DNA of new or
existing members.
- Surname projects at FTDNA complement traditional surname
studies like those affiliated to
- Once you have your initial Y-DNA results (or a known male-line
relative's Y-DNA results), you can join appropriate haplogroup
projects.
- Some older project member and project administrator features
have been disabled because of numerous changes prompted by GDPR
fears:
- You must Opt in to Sharing on the PROJECT PREFERENCES page or your
pseudonymized DNA results and ancestor information will be
missing from the public results pages.
- You can also choose from that page whether to give each
project administrator Minimum, Limited or Advanced access to
your kit; reducing access to Minimum pretty much eliminates
all the benefits of project membership.
- It is also recommended that you set Y-DNA Match Levels to
All Levels on the PRIVACY & SHARING page.
- Project members can be recruited in many ways:
- FTDNA will send an e-mail on behalf of an administrator,
no more than once every six months, to all customers with
the relevant surname who have opted to receive such e-mails.
- Administrators can see project members' matches and can
e-mail them directly to invite them to join.
- A clan or surname organisation or
one-name-study is ideally positioned to run online and
offline recruitment drives.
Big Y-700 and cheaper shortcuts
- I recommend that those interested in the history of their surnames
and in finding male-line matches persuade a male bearer of the
surname to swab for https://www.familytreedna.com/ and to order the
current state-of-the-art Big Y-700 analysis.
- Unfortunately, the price of analysing one Y-chromosome has
stubbornly remained far higher than the price of analysing 22 pairs
of autosomal chromosomes, fluctuating between USD379 and USD449 plus
shipping for over three years now.
- In association with the recent RootsTech conference, the lower
price is available for the entire month of March by entering the
relevant promo code from
- There are further discounts for those who previously bought the
now outdated Y-STR analysis and wish to upgrade.
- I have some FTDNA swab kits here today which will avoid shipping
costs.
- As administrator or co-administrator of the Clancy, Durkan,
Marrinan, McNamara and O'Dea surname projects and the Clare Roots
project, I will organise a further USD50 discount from project
general funds for men with any of these surnames or variant
spellings who order or upgrade to Big Y-700, on a first-come
first-served basis.
While awaiting your results, start typing up your family tree,
using, for example:
Once you have your Big Y-700 results:
If USD379 (EUR351 at today's exchange rate) is beyond your
budget:
- invite family members to contribute to the cost:
- females, because they don't have a Y-chromosome; and
- other males with your surname, because their Y-chromosomes
should be virtually identical to yours;
- start with Y-37 for USD79 and upgrade later; or
- start saving and order on USA Father's Day (18 Jun 2023) or
Black Friday (24 Nov 2023) or Cyber Monday (27 Nov 2023) when
similar prices will be available; or
- start with autosomal DNA analysis.
Autosomal DNA and cousins on all sides
- I recommend that those interested in finding matches on all sides
of their family spit for:
- AncestryDNA
(USD99+shipping); and/or
- 23andMe
(EUR79.20+shipping for bulk orders)
(neither of which accepts raw DNA data uploads).
- Then download the raw DNA data files and upload to the other
DNA comparison websites.
Fishing in all the gene pools
If you want to identify your long-lost cousins, to help them to find
you, and to identify your and their long-forgotten ancestors, then
you must link your DNA data and your known ancestry in the form of a
pedigree chart and share them on all the major autosomal DNA
comparison websites ("fish in all the gene pools").
- To add your information to the online DNA databases:
- Export a DNA data file from the DNA websites for which you
have spat or swabbed and upload it to all the DNA websites
that accept uploads.
- Export a GEDCOM file from wherever you store your family
tree, containing at least the ancestors of each DNA subject
and upload it to all the DNA websites so that matches can
see a pedigree chart.
- Examples of pedigree charts: from Ancestral
Quest, AncestryDNA, ancestry.com,
old FamilyTreeDNA, new FamilyTreeDNA and GEDmatch.
- For FamilyTreeDNA.com, include in the GEDCOM file any
paternal or maternal relatives already matching you at FTDNA
and the shared ancestors; FTDNA will use these linked
relationships to assign other matches to the DNA subject's
paternal and maternal sides.
Further reading