How I use the Shared Clustering Tool

The other day, in the Facebook user group for the Shared Clustering Tool created by Jonathan Brecher, I saw a post about how different folks use the tool.  I mentioned capturing MRCA information and aligning it to the clusters, but thought I would expound here in a blog post.

Before I begin, I’m making one basic assumption for this post — that you’ve already started playing with the Shared Clustering tool yourself. 

First of all, I only use it for Ancestry matches at this time, primarily because that’s where I have the most matches (ditto for my mom and my dad) and because Ancestry currently doesn’t provide segment information.

Secondly, although the tool offers the option of downloading match data directly from Ancestry, I do not use that feature.  Instead, I use the match and ICW (“In Common With”) files downloaded from Ancestry via DNAGedcom.com, which is, frankly, my go-to tool. 

DNAGedCom’s CSV files are my go-to files because I’m most comfortable using Excel – one of the reasons I like Shared Clustering, actually – and because that’s how I started, and I’ve kept on.   (Long story short, had I begun by using Ancestry’s Notes feature more effectively than I did, I could save myself some time, but I do it all in my DNAGedCom match file, and then update each subsequent download using VLOOKUP.)

An example of tracking on my mom’s Ancestry DNA match list (via DNAGedCom) is shown below:

Mom_DNAGedCom_MRCA

Color-coded by known MRCA.  If I’m not certain of the MRCA, based on the clustering, I add comments like “Copple kin” or “Hill?”

I upload the MRCA information to the completed Shared Clustering file via VLOOKUP since Jonathan has so nicely included the Test ID in the tool.  Usually, I will take the time to color-code the MRCA data in the Shared Clustering result file, simply so I can zoom out and easily see which cluster “belongs” to which possible MRCA.
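For readers who would rather script this step than use VLOOKUP, here is a minimal sketch of the same lookup in Python. The column names ("Test ID", "MRCA", "Cluster") and the sample rows are made up for illustration; the real DNAGedCom and Shared Clustering files may use different headers.

```python
import csv
import io

def build_lookup(rows, key="Test ID", value="MRCA"):
    """Map each test ID to its MRCA note, like the lookup range of a VLOOKUP."""
    return {row[key]: row[value] for row in rows if row.get(value)}

def add_mrca(cluster_rows, lookup, key="Test ID", value="MRCA"):
    """Return copies of the cluster rows with an MRCA column filled in."""
    return [{**row, value: lookup.get(row[key], "")} for row in cluster_rows]

# Tiny made-up samples standing in for the two CSV files (hypothetical headers).
notes_csv = (
    "Test ID,Name,MRCA\n"
    "A1,Match One,Copple kin\n"
    "B2,Match Two,Hill?\n"
    "C3,Match Three,\n"
)
cluster_csv = "Cluster,Test ID\n1,B2\n1,A1\n2,Z9\n"

notes = list(csv.DictReader(io.StringIO(notes_csv)))
clusters = list(csv.DictReader(io.StringIO(cluster_csv)))
merged = add_mrca(clusters, build_lookup(notes))

for row in merged:
    print(row["Cluster"], row["Test ID"], row["MRCA"] or "(no note yet)")
```

Matches with no note yet simply come through with an empty MRCA cell, which is roughly what a VLOOKUP wrapped in IFERROR would give you.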

Below you can see where I’ve zoomed out to see a fairly large clustering of my matches.  I’ve zoomed out to 10% and have highlighted 284 matches.  Per Jonathan Brecher’s Wiki, the red color indicates likely shared DNA.  The gray color indicates that, although the two matches (one in the row and one in the column) do not share DNA with each other, they likely share with a third person.  You can also see (barely) my color-coded MRCA notes on the left side of the image.

Cathy_SharedMatches_1
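To make the red/gray distinction concrete, here is a simplified sketch of the idea as described above. It is not the tool's actual algorithm, only my reading of the two colors. The names and shared-match lists are invented.

```python
def classify_pair(a, b, shared):
    """Return 'red', 'gray', or 'blank' for one cell of the grid."""
    if b in shared.get(a, set()) or a in shared.get(b, set()):
        return "red"    # a and b appear on each other's shared-match lists
    if shared.get(a, set()) & shared.get(b, set()):
        return "gray"   # no direct link, but both share with a third match
    return "blank"

# Hypothetical shared-match lists: each person maps to the matches they share with.
shared = {
    "Ann": {"Bob"},
    "Bob": {"Ann", "Cal"},
    "Cal": {"Bob"},
    "Dee": set(),
}

for pair in [("Ann", "Bob"), ("Ann", "Cal"), ("Ann", "Dee")]:
    print(pair, classify_pair(*pair, shared))
```

Here Ann and Cal are not on each other's lists, but both share with Bob, so their cell would be gray under this simplified reading.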

So, let’s zoom in a bit on this large cluster.

Below, notice that I have highlighted in green (as indicated by the yellow arrows) one of my closest matches (although she is a 4th cousin once removed).  She and I share a common ancestral couple:  Jacob Copple and Margaret Blalock, my 4th great-grandparents.  We also share 3 segments of DNA, and two of those segments are indicated here, in the cluster of red at the top left, and the cluster of red (circled in yellow).  Note the vertical line of red that merges into a vertical line of green; the red is showing me that she and I share DNA with the bulk of the two circled groups.

Cathy_SharedMatches_4

What does this tell us?  First it indicates two different segments of DNA, so if we go far enough back in time, it would be 2 different ancestors.   Second, she and I likely share those 2 segments of DNA.  Third, all the associated gray indicates a link between these 2 segments of DNA, so these matches are all most likely related to me via one ancestor and upstream of that ancestor.

Let’s zoom in even further and look more closely, now at my MRCA/clustering information I’ve imported from DNAGedCom.  The blue labels refer to matches who are Blalock/Blaylock descendants.  The gray labels reference a known match on Chromosome 9.

Cathy_SharedMatches_2

This would seem to point at the connection being on a segment of chromosome 9 and also relating to Margaret (Blalock) Copple.  This does not mean these matches share Margaret (Blalock) Copple as an ancestor with me, but rather one of Margaret’s own ancestors.

To put it another way, I have a clue!  These shared cluster results would seem to indicate that I need to do more research on Margaret (Blalock) Copple’s line, and connect with the matches who are Blalock descendants. And, at other DNA vendors, I should connect with matches who share the same segment on chromosome 9 to find out how or if they might be connected to a Blalock/Blaylock ancestor.

SharedClustering_Chr9

Let’s look at the second cluster, below.  This zoomed-in, partial view shows matches who potentially share a segment on chromosome 13 with me.  Based on their Ancestry tree information, there are some who share Jacob Copple and Margaret Blalock as common ancestors with me (just one shown here).

SharedClustering_Chr13

Other matches in this cluster have no Copple or Blalock at all in their tree.  Their trees could be incomplete or incorrect, of course (as could mine!).  Or their trees could be indicating a shared ancestor further “upstream” (meaning a possible ancestor of Margaret (Blalock) Copple).  To that end, I’ve noted where there are Hemphill and Hungate ancestors in my matches’ trees.

These Hemphill and Hungate families, according to the Ancestry trees of my matches, hailed from Kentucky (where Margaret Blalock was born ca. 1810) and a branch of the Hungate family ended up in Washington County, Indiana in the 1810’s – 1830’s.  This is the same county Margaret lived in during the same time frame.  Although not definitive, it’s worth noting as a potential clue.

In summary, because the two groups are related (as indicated by all the gray associated with them), both DNA segments the groupings indicate are more likely to have been inherited by me from Margaret (Blalock) Copple (and, ultimately, her ancestors) rather than from her husband Jacob Copple.

Here’s another example of a cluster on my Copple line, where you can quickly see, from the teal color on the left-hand side, that these matches share an MRCA.  In fact, I use the teal to indicate more than one generation of Copple ancestors (all also ancestors of Jacob Copple who married Margaret Blalock).

SharedClustering_COPPLE

The last example is a line from my dad’s side.  As with the Copple and Blalock lines from my mother’s side, this paternal line is rooted in the United States from at least 1800 if not decades before that. 

The bulk of these DNA matches share my third great-grandparents, Anderson Lamburth and Ermine Farley (or Farnham).  However, they are clearly grouped in two clusters, so that one set may share Lamburth DNA and another set Farley DNA, or “upstream” (as in Anderson’s mother and Anderson’s father, or Ermine’s two parents).

Most intriguing is the linking between the two clusters.  Not just the general gray, but the vertical red lines indicated by the blue arrow.  I need to look more closely at these two matches — their names will be in the column headers (not shown here for privacy reasons). 

One, they likely share 2 DNA segments with me.  Two, they clearly share DNA with the small cluster on the upper left, as well as the larger cluster on the lower right.  AND the folks in the middle who are only indirectly related (indicated by gray) to the two obvious clusters.

SharedClustering_LAMBURTH

One other item to note in this cluster.  Some of the MRCAs are not highlighted in yellow.  That’s legit; the MRCA referenced there is the granddaughter of Anderson & Ermine, Mary (Lamburth) Dempsey, who was my great-grandma, and her husband William.  Clearly, the segment shared here relates to Mary rather than William.

If you use the Shared Clustering tool to visualize your Ancestry DNA matches, do you use any visualization aids to assign clusters to ancestors?  Perhaps you make better use of the Notes field than I do?

 

Cite/link to this post: Cathy M. Dempsey, “How I use the Shared Clustering Tool,” Genes and Roots, posted 21 Oct 2019 (https://genesandroots.com : accessed (date)).

 

 

23andMe Ethnicity Update

If you’ve tested at 23andMe, have you checked out your ethnicity results lately? 

In a recent post[1], Judy Russell mentioned 23andMe’s latest ethnicity update, which somehow I missed completely!

Naturally, I had to go check it out, fearing a bit that my ethnicity percentages might be “messed up”.  Even though I know they are estimates, 23andMe has for some time had the percentages closest to what would be expected by my family narrative.  My dad is “all Irish”; my mom is “half Italian” due to her father being from Italy.  Et cetera, et cetera.

23andme_ethnicity

Very little has changed in my ethnicity percentages.   Here, I’ve noted in an Excel spreadsheet my former ethnicities per 23andMe (as of November 2018) and my current ones as of today when I reviewed the changes.

What is interesting, though, is that they seem to have taken a page from Ancestry’s “genetic communities” playbook, and zeroed in on specific areas in Ireland, Britain and Italy where my ancestors possibly lived in the past 200 years.

Let’s take a look.  We’ll start with Ireland.  On my paper trail, both my dad’s parents have Irish roots.  My paternal grandfather’s family left Ireland, depending on the branch of his tree, around the time of the Famine and shortly after, say, the 1850 to 1865 range.  My great-great grandfather, Patrick Dempsey, reportedly came from Kings County (now Co. Offaly), per his obituary.  I don’t have more details than that.  His wife Hanora Hurley (or is it Hanora Riordan?), whom he married in the U.S., may have come from anywhere in southern Ireland.  Best guess is Co. Cork or Co. Limerick.  On my grandfather’s maternal line, his mother’s Lamburth ancestors on her father’s side likely came from England, while her mother Eliza (Landrigan) Lamburth came from the town of Garryrickin, Windgap Parish, Co. Kilkenny.[2]

My paternal grandmother’s father came from Athea, Co. Limerick, as did his father, while his mother came from Cooraclare, Co. Clare.  My grandmother’s mother came from Athea, Co. Limerick, as did her father, with her mother coming from Beale, Co. Kerry.[3]

In sum, my Irish heritage on my Nana’s side is from the province of Munster, specifically the southwest of Ireland, around the River Shannon, while my Grandpa’s Irish heritage is from the province of Leinster, specifically Co. Kilkenny and Co. Offaly.

And 23andMe’s ethnicity determination – for the moment at least – largely agrees.[4]

23andme_irishethnicity

County Kerry, County Clare, County Limerick and County Kilkenny are all in the top 10.

As far as Great Britain/the U.K. is concerned, I have no idea where my ancestors came from.  My paternal grandfather’s Lamburth line, here in the U.S. since at least 1800, likely came from England but none of us researching this line have yet “crossed the pond”.  My mother’s maternal grandmother’s Wright line has been here in the U.S. since at least 1730 or so; researchers on this line have not yet crossed the pond either.  Here is what 23andMe estimates[5]:

23andme_ukethnicity

Perhaps these areas could be clues, but it would be silly to jump ahead of myself and start researching Wrights and Lamburth/Lamberts over in England without knowing more about the family here in the U.S. in the 18th century.  The references to Scotland surprise me a bit, but could be related to the Gaelic / Celtic heritage of my Irish side.

With respect to Italy, my grandfather’s parents came from the region of Marche.  My great-grandfather was from Fano, and my great-grandmother was from Sant’Elpidio a Mare[6].  Some of us in my family have even gone to Marche and met our living cousins; that’s a story for another blog post.

Here is what 23andMe estimates[7]:

23andme_marche_ancestry

Pretty wild, huh?  Marche!!  Still have to take it with a grain of salt (my brother’s estimated places of origin in Italy are completely different from mine), but still, right now, today, it “fits”.

 

 

[1] Judy G. Russell, “And still not soup…,” The Legal Genealogist, posted 27 Jan 2019 (https://www.legalgenealogist.com/blog : accessed 28 Jan 2019).

[2] For sources, see cathymd, “Dempsey Family Tree,” Ancestry.com (https://www.ancestry.com/family-tree/tree/17377380/family : accessed 26 Dec 2018).

[3] Ibid.

[4] 23andMe, Inc., “Cathy, your DNA suggests that 56.8% of your ancestry is British & Irish”, 23andMe.com (https://you.23andme.com/reports/ancestry_composition_hd/british_irish/ : accessed 29 Jan 2019).

[5] 23andMe, Inc., “Cathy, your DNA suggests that 56.8% of your ancestry is British & Irish”, 23andMe.com (https://you.23andme.com/reports/ancestry_composition_hd/british_irish/ : accessed 29 Jan 2019).

[6] For sources, see cathymd, “Serafini_Diamantini1” tree, Ancestry.com (https://www.ancestry.com/family-tree/tree/19505554/family : accessed 29 Jan 2019).

[7] 23andMe, Inc., “Cathy, your DNA suggests that 12.6% of your ancestry is Italian”, 23andMe.com (https://you.23andme.com/reports/ancestry_composition_hd/italian/ : accessed 29 Jan 2019).

Pictures Really ARE worth a thousand words (or more!)…

I’ve been struggling to make sense of — or, more accurately, wisely use — my dad’s matches at Ancestry to extend some of his lines.  Dad has one great-grandparent who was born in the U.S.; the others were all born in Ireland (where all but three remained throughout their lives.)  So, I’ve long thought most of my dad’s matches are not easily assignable to one of his great-grandparents because there is much I don’t know about the aunts/uncles/first cousins of those ancestors.

Now, that may still be the case to some degree, but I did have an eye-opener when I used the NodeXL template with Excel to cluster my matches.  NodeXL is a template for graphing your networks (often in reference to social media); see here.  I found out about the tool from reading Shelley Crawford’s blog Twigs of Yore; she has an entire step-by-step series on how to create visual networks of your Ancestry DNA matches using NodeXL and Excel.  (An indexed version is here.)

So, I downloaded my dad’s matches at year-end from Ancestry using DNAGedCom, and loaded the data into the NodeXL template.  I limited the number of matches to those who share at least 17 cM with my dad; I also did not include my brother or me as matches, nor my paternal 1st cousin.

The reason you want to exclude close matches is because they will match so many of the same people as you (or your target person) that there will be connections all over the graph, and you won’t be able to discern any useful information.

For this same reason, I also excluded children and grandchildren of matches, for those cases I know about.  (As a disclaimer, just to be clear, with Ancestry’s matches, I have no way of telling if match A and match B are, say, child/parent to each other — unless I personally know A and B, or unless I’ve “met” online regarding our shared matches, and they’ve shared that with me.)
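The filtering described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the field names ("id", "cm"), the sample values, and the exclusion list are hypothetical, and the real DNAGedCom match and ICW files are laid out differently.

```python
MIN_CM = 17.0  # threshold used for this graph

def filter_matches(matches, excluded_ids, min_cm=MIN_CM):
    """Keep IDs of matches at or above the threshold that are not excluded."""
    return {
        m["id"]
        for m in matches
        if m["cm"] >= min_cm and m["id"] not in excluded_ids
    }

def filter_icw(icw_pairs, kept_ids):
    """Keep only in-common-with pairs where both people survived the filter."""
    return [(a, b) for a, b in icw_pairs if a in kept_ids and b in kept_ids]

# Made-up sample data.
matches = [
    {"id": "brother", "cm": 2600.0},
    {"id": "m1", "cm": 45.2},
    {"id": "m2", "cm": 17.0},
    {"id": "m3", "cm": 12.4},
]
excluded = {"brother"}  # close relatives, plus known children/grandchildren of matches
icw = [("brother", "m1"), ("m1", "m2"), ("m2", "m3")]

kept = filter_matches(matches, excluded)
edges = filter_icw(icw, kept)
print(sorted(kept), edges)
```

The surviving pairs are what would go into the graphing template as edges; dropping the close relatives first keeps them from linking every group to every other group.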

That’s the context; here is the first picture of Dad’s top 1,000 (or so) matches clustered into the top groups.

dad_ancestrymatch_clustering_majorgroups

The bigger dots represent the closest genetic connections to my dad.   Big dots exist in the navy dot group (upper left), the turquoise group (lower left) and the kelly green group (upper right).

The grey lines denote connections, both within groups and between groups.  In one easy glance, one can determine that the group most tightly related to each other is the group on the top row with dark green dots.  It looks like a web.

As far as inter-group connections go, the turquoise dot group seems to have the most connections with other groups.
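If you want a number rather than an eyeball estimate, counting the connections that cross group boundaries is straightforward. The sketch below assumes you have each match's group label and the list of connections available outside the graph; the labels and edges here are invented, and this is not something NodeXL itself is being asked to do.

```python
from collections import Counter

def cross_group_edges(edges, group_of):
    """Count, per group, the connections that lead to a different group."""
    counts = Counter()
    for a, b in edges:
        ga, gb = group_of[a], group_of[b]
        if ga != gb:
            counts[ga] += 1
            counts[gb] += 1
    return counts

# Hypothetical group labels and connections.
group_of = {
    "m1": "turquoise", "m2": "turquoise",
    "m3": "navy",
    "m4": "green", "m5": "green",
    "m6": "lime",
}
edges = [("m1", "m2"), ("m1", "m4"), ("m2", "m5"), ("m4", "m5"), ("m2", "m6")]

counts = cross_group_edges(edges, group_of)
print(counts.most_common())
```

A group that never appears in the result, like navy in this toy data, has no connections outside itself.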

So, when I highlight the turquoise group, what do I find?  Connections to most every group of matches my dad has — except for the navy blue group.   Which is kinda cool — but so what?  Unless you know something about the matches within the group.

dad_ancestry_match_lamburthcluster_all her lines

So, the matches in the highlighted group above are all kin to my dad’s great-grandfather, Archibald Lamburth (born c. 1833 Tennessee – died 1909 San Francisco).  He has the distinction of being my dad’s only great-grandparent born in the United States.  Given that the bulk of Ancestry’s DNA customers are U.S.-born, and that many with colonial ancestry say they have many thousands of matches, I suspect most of these connections will tie back to 18th-century U.S. and the colonies should I ever break this “brick wall”.

My second surprise was looking at the navy blue group.  Other than the one outlier I have yet to explore, all the matches are intra-group matches.  This group includes known close relatives of my dad’s maternal side.

dad_ancestry_match_nanacluster_all her lines

Although my dad has matches to his maternal grandfather’s side (and his parents AND grandparents), as well as to his maternal grandmother’s side (and her parents), the clustering algorithm does not distinguish between the two lines, at least based on the current population of matches used.

I may need to do a separate analysis on these particular matches — perhaps bringing down the filter to 15 cM — to see if I can break out that group into Maternal Grandfather and Maternal Grandmother.

Right now, the only useful information is that my dad’s mother’s matches and my dad’s father’s matches are separate.  They weren’t related to each other, based on the information we currently have — the above graphs, plus the genealogy I’ve already done.

The next picture, below, shows how some close genetic relatives (> 275 cM shared, in this case 1st cousins once removed) share matches with other groups.  This cluster could be a Dempsey cluster, with ties to Lamburth kin.  Which makes sense in my family tree since a Dempsey married a Lamburth.

dad_ancestry_match_bartjones billydodge cluster_their daughters_dempseylamburthlandriganhurley

Notice also that the group is somewhat open, like a child’s scribble.  Not everyone within the group is closely connected to everyone else in the group.

An example of a tightly-connected group is below. This is the group with dots in chartreuse green. Right now, I have no idea how they fit into the family tree.  It’s pretty much a self-contained group, with minor ties to the Lamburth (dad’s paternal grandmother’s side) group, but nothing significant.   Yet.

dad_ancestrymatch_clustering_group8

That was a look at my dad’s clustered Ancestry matches; sometime in the near future, I’ll take a look at my mom’s clustered Ancestry matches using the NodeXL tool.

 

How Complete is my Tree?

Are you sure that the segment of DNA you share with your DNA match is due to your common 3rd great-grandparents Joe and Sally (Harper) Booth (that’s a fictional couple, by the way) — and not due to a common ancestor you may not yet have found?  How complete is your tree? 

Recently, Blaine Bettinger posted in Facebook’s Genetic Genealogy Tips and  Techniques group, about the completeness of your genealogical tree being critical to accuracy in ascertaining the correct common ancestor with your DNA matches.  He referenced a post by Amberly Beck (see here) in which she discusses the completeness (or lack thereof) of her maternal line.

Rather than looking at just my maternal line or just my paternal line, or even just looking at my whole tree at once, I decided to review my results by grandparent. 

how complete is my tree

I “found” 9 ancestors last year on my maternal side without using DNA at all!  Instead, I used DanishFamilySearch.com, a site which has been transcribing Danish census records and allowing registered users to post their family tree information, along with the newly online Danish census records (in Danish, of course!) at familysearch.org.

So, yay!, that was success for my grandmother’s line.  I now know 4 more of my maternal grandmother’s 16 2nd great-grandparents, and 5 more of my grandmother’s 32 3rd great-grandparents.
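The arithmetic behind a completeness chart is simple: each generation back doubles the number of ancestor slots, so a person has 2 parents, 4 grandparents, and so on up to 16 2nd great-grandparents and 32 3rd great-grandparents. Here is a small sketch of that calculation. The "known" counts are made up for illustration and are not my actual numbers.

```python
def expected_ancestors(generations_back):
    """Number of ancestor slots a given number of generations back."""
    return 2 ** generations_back

def completeness(known, generations_back):
    """Fraction of the slots in one generation that have been identified."""
    return known / expected_ancestors(generations_back)

# Generations counted from one grandparent: 4 = her 2nd great-grandparents,
# 5 = her 3rd great-grandparents. Known counts below are hypothetical.
known_by_generation = {1: 2, 2: 4, 3: 8, 4: 10, 5: 9}

for gen, known in known_by_generation.items():
    slots = expected_ancestors(gen)
    print(f"{gen} back: {known}/{slots} = {completeness(known, gen):.0%}")
```

Running the same tally separately for each grandparent gives the by-grandparent view, which makes it easy to see which quarter of the tree is thin.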

As for my 3 other grandparents, there was no change in the past year.  Not surprising, because I spent time on the BU Certificate course for 15 weeks (during which I spent little time on my own genealogy), and I also spent some time continuing to validate with DNA matches my Copple line (which is also on my maternal grandmother’s tree). 

Meaning, as I build out collateral relative trees for my Copple ancestors and find I have — more accurately, my mother has — DNA matches with descendants of those collateral relatives (siblings and 1st cousins of my own ancestors), that is slowly strengthening the case that the DNA shared belongs to the Copple line and not some other unknown line.  (Well, until I am able to build further back; the shared DNA may actually relate to, say, the wife of my most distant Copple ancestor, and not to him.)

I’ve done nothing really on my maternal grandfather’s line — I know the Italian town he came from and his grandparents’ names.  I also know I would likely find records on their parents via the local Catholic church.  As it would likely require assistance from a researcher over in Italy or a trip to Italy myself, it just has not been a priority for me.  Perhaps someday.

Like my maternal grandfather, my paternal grandmother was a first-generation American.  Her uncle was Con Colbert, who was executed for his role in the Easter Rising in 1916.  Consequently, he is somewhat famous in the Republic of Ireland; therefore, some of his family history was researched by a professional genealogist for the centenary in 2016.  So, I have a bit more on her kin than on my maternal grandfather’s kin.  I was also fortunate two years ago to find some of the baptismal and marriage records for her maternal line ancestors (also Irish) online; one such place is here.

I have a “brick wall” at my great-great grandfather Patrick Dempsey.  Per his obituary, he was “of King’s County”.  That’s now County Offaly, but that doesn’t mean he was born and baptized there.  It may just mean he was from there last before coming to America circa 1850 or so.  There are about a half-dozen potential Patrick Dempseys baptized in Co. Offaly around when he was thought to be born (ca. 1830), but I have no oral history as to his family.   Maybe his parents and siblings died in the Famine?

This year, I’d like to find out more about my paternal grandfather’s great-grandparents on his mother’s side: Anderson and Ermine (Farnham? Farley?) Lamburth, Grandpa Dempsey’s one line that has reportedly been in the U.S. since at least 1800.  Of course, I’d also like to break the brick walls of my 3rd great-grandmother (aka my maternal grandmother’s great-grandma) Phoebe Harvey, or her mother-in-law Margaret (Blalock) Copple.  We’ll see.

How about you?  Do you have a particular line you’re thinking of researching next?