Creating a Y-DNA Project
10:20 a.m. Saturday 4 May 2019
YouTube version: TBA
Introduction
Y chromosome or Y-DNA comparison is the best way of doing either or
both of the following:
- estimating the relationship between two men with the same or similar
surnames
- finding new male line relatives or "Y-DNA matches"
What is Y-DNA?
- DNA is:
- found in the chromosomes and in the mitochondria, and consists of
long chains of four nucleotides named adenine (A), cytosine (C),
guanine (G) and thymine (T)
- represented by strings of the letters A, C, G and T
- The Y chromosome, like the surname, is passed virtually
unchanged from father to son, with just occasional mutations.
- There are 59,373,566 letters on the Y chromosome alone.
- Over tens of thousands of years, these occasional mutations
add up to give a wide distribution of different Y-DNA signatures today.
- Only males have a Y chromosome for comparison.
- The Y chromosome comes down the patrilineal line - from father,
father's father, father's father's father, etc.
- This is the same inheritance path as followed by surnames, grants of
arms, peerages, etc.
- A woman does not have a Y chromosome, so should find a male relative
with the relevant surname to swab:
- a father, brother, nephew, cousin, etc., if her interest is in her
maiden surname; or
- a husband, son, brother-in-law, father-in-law, etc., if she is
married and her interest is in her married surname.
- A single mother who has given her maiden surname to her son
has performed a surname/DNA switch; so her son's Y-DNA reflects his father's surname,
and not his own surname.
Two types of mutation can be found on the Y chromosome:
- A single-nucleotide polymorphism, abbreviated SNP and pronounced
snip, is a single location where there is a relatively high degree of
variation between different people.
- For example, most people may have an A at one such location, with a
minority having a C.
- A short tandem repeat (STR) is a string of letters consisting of the
same short substring repeated several times, for example
CCTGCCTGCCTGCCTGCCTGCCTGCCTG is CCTG repeated seven times.
- The number of repeats may occasionally differ between
parent and child, due to mutations.
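To make the idea of an STR value concrete, here is a minimal Python sketch that counts how many times a motif is repeated back to back in a string of letters. The function name and the father/son example are illustrative assumptions only; real laboratory STR calling is considerably more involved.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Keep stepping forward one motif at a time while copies continue.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# The example from the text: CCTG repeated seven times.
father = "CCTG" * 7
# Hypothetical son whose repeat count has mutated up by one.
son = "CCTG" * 8

print(count_tandem_repeats(father, "CCTG"))  # 7
print(count_tandem_repeats(son, "CCTG"))     # 8
```

The reported STR value is simply this repeat count, which is why a mutation shows up as a number going up or down by one or more steps.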
Y-DNA analysis began many years ago by looking at patterns of STR
values from the Y chromosome. These values mutate relatively frequently
in both directions.
- By far the most comprehensive Y-DNA service is provided by
FamilyTreeDNA.com (FTDNA), but there are competitors like YFull.com
and YSEQ.
- FTDNA sells Y-DNA12, Y-DNA37, Y-DNA67, Y-DNA111, etc., which compare
the number of repeats in 12, 37, 67 or 111 STRs respectively.
- My Y-DNA111 matches as of 11 February 2019.
- My top Y-DNA67 matches as of 11 February 2019.
- These numbers are used to assign men to predicted haplogroups.
- The word haplogroup has long been used to describe any group of men
with similar Y-DNA; its meaning has evolved with the science of
analysing the Y chromosome.
- The numbers of STR repeats and predicted haplogroups can be viewed
in the Y-DNA results pages for various projects, such as the Clare
Roots project.
More recently, once-in-the-history-of-mankind SNPs began to be
identified:
- These mutations have occurred exactly once.
- Every man descended from the man in whom the mutation originally
occurred inherits the mutation.
- No other man has the mutation.
- Men with the same SNP mutation tend to also have similar patterns of
STR mutations, so STR mutations are used to predict SNP mutations.
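Because each of these SNPs occurred once and is inherited by every male line descendant, a man who is positive for one SNP is also positive for every SNP above it on the tree. The sketch below illustrates this with a heavily simplified fragment of the haplotree, using SNP names that appear elsewhere in this handout. The parent links omit many intermediate branches and are an illustrative assumption, not a copy of any published tree.

```python
# Simplified parent links (child -> parent); many intermediate SNPs omitted.
PARENT = {
    "R-M269": None,
    "R-U106": "R-M269",
    "R-P312": "R-M269",
    "R-L21": "R-P312",
    "R-M222": "R-L21",
    "R-L226": "R-L21",
    "R-DC269": "R-L226",
}


def lineage(snp):
    """Return every SNP a man must carry if he is positive for snp."""
    path = []
    while snp is not None:
        path.append(snp)
        snp = PARENT[snp]
    return path[::-1]  # oldest first


def most_recent_shared_snp(a, b):
    """Return the youngest SNP in this fragment shared by both lineages."""
    shared = None
    for x, y in zip(lineage(a), lineage(b)):
        if x != y:
            break
        shared = x
    return shared


print(lineage("R-DC269"))
# ['R-M269', 'R-P312', 'R-L21', 'R-L226', 'R-DC269']
print(most_recent_shared_snp("R-M222", "R-DC269"))  # R-L21
```

The youngest shared SNP marks the branch on which the two men's most recent common male line ancestor sits, within the limits of the tree fragment used.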
It was eventually realised that these SNPs are a much better way of
categorising men than the patterns of reversible STR mutations which
were originally used on their own.
The following questions may be of interest when comparing the Y
chromosomes of two men with similar surnames:
- did their most recent common male line ancestor use the
same surname, or a variant of it?
- did their most recent common male line ancestor live within
the surname era (post-1014), but use a different surname?
- approximately when did their most recent common male line
ancestor live?
- approximately when did a man in the relevant male line
first adopt the present surname?
- do they have a more recent common male line ancestor than
genetic Adam?
- The "biblical Adam" was the first and only male in the
world at the time of creation.
- The "genetic Adam" or "Y-Adam", the most recent common patrilineal
ancestor of all men alive today, was merely the only male in the world
in his day whose male line descendants have not yet died out.
- Y-Adam is estimated to have lived between 160,000 and 300,000 years
ago.
- are the men being compared from the same ancient
haplogroup?
- are they from the same recent haplogroup?
- has there been a surname/DNA switch since their ancestors
first adopted surnames?
The science or art of placing men on the human family tree or Tree of
Mankind, often called the Y haplotree, has evolved rapidly in recent
years, mostly thanks to Y-DNA projects and their volunteer
administrators.
Joining FTDNA
FamilyTreeDNA.com is the only effective option for Y-DNA projects:
- If you have autosomal DNA data with another company, then you should
join FTDNA via the free autosomal transfer facility.
- If you (or a deceased relative) have already sent cheek swabs to
FamilyTreeDNA for Family Finder or mitochondrial analysis, then they
are held in storage and will be re-used for Y-DNA analysis.
- If you are completely new to genetic genealogy, then swab kits are
available here today.
- If you want the best price, join one of the existing Surname &
Geographical Projects before placing your order.
- There may be no other man of your surname in the FTDNA
database (e.g. no Geheran in February 2019,
although there is a Palmer with Geheran DNA).
- There may be only a few people of your surname in the FTDNA
database (e.g. there were only 11
Dungans of either gender in February 2018, rising to 16 by
February 2019).
Joining projects
Y-DNA projects can be
Once you have your initial Y-DNA
results, you can join appropriate haplogroup projects.
Most geography-based projects use some combination of Y-DNA, autosomal
DNA and mitochondrial DNA, e.g. various Irish projects.
By joining projects:
- you may be introduced to relatives who are just outside the
FTDNA matching thresholds (4/37, 7/67, 10/111)
- you will contribute to the advancement of knowledge about
your own surname, about related surnames, and about Y-DNA
This is why FTDNA effectively pays customers to join projects via
discounted prices for Y-DNA analysis.
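The matching thresholds quoted above can be illustrated with a short Python sketch. It computes a genetic distance as the sum of the absolute differences in repeat counts across the markers two men share, and compares it with the thresholds of 4/37, 7/67 and 10/111. This simple stepwise sum is an assumption made for illustration; FTDNA's own calculation treats some markers differently, and the marker values below are hypothetical.

```python
# Matching thresholds quoted in the text: maximum distance per panel size.
MATCH_THRESHOLDS = {37: 4, 67: 7, 111: 10}


def genetic_distance(a, b):
    """Sum of absolute differences in repeat counts over shared markers."""
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


def is_listed_match(distance, panel):
    """True if the distance is within the matching threshold for the panel."""
    return distance <= MATCH_THRESHOLDS[panel]


# Hypothetical results for two men (only three markers shown for brevity).
man_a = {"DYS393": 13, "DYS390": 24, "DYS458": 17}
man_b = {"DYS393": 13, "DYS390": 25, "DYS458": 21}

distance = genetic_distance(man_a, man_b)
print(distance)                       # 5
print(is_listed_match(distance, 37))  # False
```

Two men at 5/37 would not appear in each other's match lists, which is exactly the kind of relationship a project administrator can still spot from the project results pages.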
Becoming a co-administrator
- Good succession planning recommends that every project
should have at least two administrators or co-administrators.
- If you offer to help with one of your existing projects,
then the existing administrator(s) may train you in.
- The first prerequisite (thanks to GDPR) is to have an e-mail address
which you are prepared to expose to spammers and to other non-FTDNA
customers:
- You may already receive more e-mails than you have time to read or
reply to.
- You may wish to establish a new e-mail address specifically for this
purpose.
- Wikipedia defines a data breach as "the intentional or unintentional
release of secure or private/confidential information to an untrusted
environment".
- My long-standing guidelines on e-mail etiquette demand that my
correspondents "please do not publish my e-mail address on any web
page, news group, chat room, etc."
- If you are an ordinary customer of FTDNA, only your matches can see
your e-mail address.
- If you are an FTDNA project administrator, everyone on the internet,
whether an FTDNA customer or not, sees your e-mail address.
- This is part of the FTDNA Terms & Policies.
Application procedure
If there is no existing project for your surname or geographical area
of interest, then start your own ...
If you are happy to deal with the spam risk, then you can apply to set
up your own project by following a simple five-step application
process (which actually consists of only four steps!).
Project objectives
Administrators will probably want to do some or all of the following:
- determine the branch (single genetic origin) or branches
(multiple genetic origins) of the
haplotree to which the surname belongs
- recruit project members:
- FTDNA will send an e-mail on behalf of an administrator, no more
than once every six months, to all customers with the relevant surname
who have opted to receive such e-mails.
- Administrators can see project members' matches and can e-mail those
matches directly to invite them to join.
- Other online and offline recruitment drives are possible.
- There may already be a relevant surname organisation.
- Organisations like Clans of Ireland or GOONS may be willing to help.
- predict SNP mutations and recommend haplogroup projects:
- FTDNA uses STR mutations to predict SNP mutations.
- Most Irish men are very confidently predicted by FTDNA to
be R-M269+.
- FTDNA does not predict more recent SNPs.
- Project administrators can often confidently predict more
recent European SNPs like R-U106 or R-P312 or more recent Irish SNPs
like R-M222 (Niall of the Nine Hostages) or R-L226 (Dalcassian).
- overlay trees on each other:
- surname trees from the ancient annals
- SNP haplotree
- mutation history trees, combining SNPs and STRs
- genealogical trees
- maintain the Y-DNA Colorized Chart:
- Subgroups are sorted alphabetically on the results
pages, so bear this in mind when choosing names
- Criteria for grouping can include:
- surname
- geography
- haplotree position, whether
- confirmed by FTDNA
- predicted by FTDNA
- predicted by project administrator
- desire to see STR differences highlighted
- Many of the colours available for distinguishing subgroups don't
work very well, or at least my eyesight isn't good enough to use them,
as the background colours are too close to the text colour.
- I use this colour scheme in most of my projects:
- Each top-level haplogroup represented in the project has its own
colour, so far comprising:
- E (Lime)
- G (Light Blue)
- I (Beige)
- J (Aqua Marine)
- Q (White)
- R (Light Grey)
- Within haplogroup R, there are also separate colours for
sub-branches:
- R1a (Pink)
- the four main Irish sub-branches:
- R-M222 (Green Yellow) is North West Irish/Irish Type I and subclades
- R-CTS4466 (Plum) is South Irish/Irish Type II and subclades
- R-L226 (Coral) is Dalcassian/Irish Type III and subclades
- R-L362 (Orchid) is Munster Type I and subclades
- the rest of R-L21 (Yellow)
- R-U106 and subclades (Light Cyan)
- act as moderator of the activity feed
- advise members on purchasing upgrades:
- single SNP tests (USD39)
- additional STRs
- SNP packs
- Big Y-700 (USD649)
- discover new surname-specific SNP mutations and get them
added to the haplotree
- advise members on using the FTDNA website and other databases, tools
and websites
Administration tools
The most useful of the many tools on the GAP 2.0 Home Page include:
- the public website editor, which can be used to publish information
under any or all of the following headings:
- Background
- Goals
- News
- Updates
- Bulletin
- Results
- Code of Conduct
- FAQ
- the Y-DNA genetic distance calculator:
- this has greater thresholds than the matching algorithm: 7/37
instead of 4/37, 25/67 instead of 7/67 and 40/111 instead of 10/111
- examples: R-M222 for a man with one Y-DNA37 match with no SNP test;
R-FGC29367 for a man with no Y-DNA111 match.
- the subgroup editor to arrange members on the Y-DNA results pages
- Subgroup Names (which are visible on the results pages) appear to be
truncated at 161 characters, without warning. So keep these names as
short as possible with no unnecessary spacing or punctuation.
- Subgroup Descriptions (which are visible to the project
administrator(s) only) appear to be truncated at 973 characters,
without warning, and despite the false assurance of scroll bars in the
editor.
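Since the truncation happens without warning, it can help to check subgroup names and descriptions before pasting them into the subgroup editor. The sketch below is a stand-alone helper, not part of any FTDNA tool. The limits are the observed values quoted above, and treating them as the maximum number of characters kept is an assumption.

```python
# Observed limits from the text; their exact behaviour is an assumption.
NAME_LIMIT = 161
DESCRIPTION_LIMIT = 973


def check_subgroup(name, description=""):
    """Return a list of warnings for text that would probably be truncated."""
    warnings = []
    if len(name) > NAME_LIMIT:
        warnings.append(
            f"name is {len(name)} characters, over the limit of {NAME_LIMIT}"
        )
    if len(description) > DESCRIPTION_LIMIT:
        warnings.append(
            f"description is {len(description)} characters, "
            f"over the limit of {DESCRIPTION_LIMIT}"
        )
    return warnings


def display_order(names):
    """Preview alphabetical ordering (assumed here to ignore case)."""
    return sorted(names, key=str.casefold)


print(check_subgroup("R-L226 Dalcassian/Irish Type III"))  # []
print(display_order(["R-M222", "E", "R-L226"]))
# ['E', 'R-L226', 'R-M222']
```

Running proposed names through a check like this before saving avoids discovering a truncated name only after it appears on the public results pages.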
Examples
- Mr Clancy
- 15 STR matches at genetic distance 3/37 and 4/37
- none of them is a Clancy
- only one has bought Big Y, and he is R-A933+/R-A934+
- Was the man who adopted the Clancy surname
R-A933+/R-A934+?
- three other members of the Clancy project are R-A933+
- they are all more than 7/37 from the new Mr Clancy
- I recommended the single SNP test for R-A933
- it came back positive
- Are there Clancy-specific SNPs below R-A933 on the
haplotree?
- two project members did Big Y and are R-BY81149+ and
R-BY120940+
- the other project member did two single SNP tests and
is R-BY81149- and R-BY120940-
- so these appear to be Clancy-specific SNPs, or
mutations which occurred after adoption of the surname
- Are these two separate branches of the tree, descended
from the first Clancy?
- Or could there be two independent genetic origins of the
surname, both under R-A933?
- There are certainly several independent genetic
and geographic origins of Clancy:
- Leitrim
- north-west Clare (Corcomroe)
- south-east Clare (Tradaree)
- Mr Lynch
- Ordered Y-DNA37 on 2 Dec 2011.
- Kit back 25 Jan 2012.
- Processing completed 6 Mar 2012.
- First Y-DNA37 match appeared 17 Apr 2019 - another Lynch,
with roots in Moveen, County Clare.
- Was it worth the seven-year wait?!
- The first Moveen Lynch that I recruited to the Clare Roots project
turned out to be the grandson of a (bigamous) Curry who did a
surname/DNA switch, from his father's surname to his mother's surname.
- I recently recruited his Lynch 3C1R to confirm this.
- This had a positive externality for Mr Lynch.
- A Google search enabled me to trace his lineage back to a
Lynch whose obituary confirmed that he migrated from County Clare to
Winona County, Minnesota.
- I e-mailed Mr Lynch and recruited him to the Clare Roots
project.
- He is 5/37 from a Lindsey in that project.
- Lindsey and Lynch were clearly both originally Ó Loinsigh in the
Irish language, but the ancestors of the DNA subjects anglicised the
name differently.
- Lindsey has already bought the top-of-the-range Big Y-700
product, which gives him a terminal SNP of DC269.
- Note that DC is short for Dalcassian, and all DCxxxx SNPs
are below R-L226 on the haplotree.
- I (or FamilyTreeDNA) should have already recognised the telltale
signs that the two Lynches are Dalcassians.
- These signs are four values in the first 37 markers, and
both Lynches have all of them:
- DYS 439 = 11 Yes
- DYS 459 = 8-9 Yes
- DYS 464 = 13-13-15-17 Yes
- DYS 456 = 15 Yes
- The DYS458 STR marker is highly volatile in the Lynch/Lindsey
family, accounting for their lack of matches and the large
genetic distance between them.
- For Lindsey its value is 17; for the two Lynches it is 20
and 22.
- I predict that the Lynches are also very likely to belong
to the R-DC269+ branch of the haplotree.
- An Ireland Reaching Out discussion
led me to another Lynch researcher with connections to Moveen and
Winona County who filled in more blanks in the tree.
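The check for the four Dalcassian signature values described in the Mr Lynch example can be written as a small Python sketch. The signature values are those listed above. The dictionary layout, function names and marker spellings are illustrative assumptions, and a full set of signature values is only a pointer towards R-L226 that still needs SNP testing to confirm.

```python
# The four signature values from the text, keyed by STR marker name.
DALCASSIAN_SIGNATURE = {
    "DYS439": "11",
    "DYS459": "8-9",
    "DYS464": "13-13-15-17",
    "DYS456": "15",
}


def signature_hits(results):
    """Map each signature marker to True/False for one man's results."""
    return {
        marker: results.get(marker) == value
        for marker, value in DALCASSIAN_SIGNATURE.items()
    }


def looks_dalcassian(results):
    """True only if all four signature values are present."""
    return all(signature_hits(results).values())


# One of the Lynches: all four signature values, with DYS458 = 20.
lynch = {
    "DYS439": "11",
    "DYS459": "8-9",
    "DYS464": "13-13-15-17",
    "DYS456": "15",
    "DYS458": "20",
}

print(looks_dalcassian(lynch))  # True
```

Note that the volatile DYS458 marker is not part of the signature, so its unusual values in this family do not affect the result.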