
Discussion of Newly-Created Matrix of Genetic Distance
By US 312 Steve Mullinax, August 2010


In this article, I introduce a tool for highlighting significant patterns of relationships among participants in the Molyneux Surname DNA Project, which you can see on the new IMFA web site. The figure at right is a matrix of genetic distance between pairs of project participants. The purpose of the matrix is to give the whole community of Mx family researchers an at-a-glance summary of the DNA test results. It highlights relationships where traditional genealogical research for a common ancestor might pay off. These will typically be pairs of participants with a close genetic distance. On the other hand, pairs of participants with a large genetic distance between them are unlikely to share a recent common ancestor. See the Family Tree DNA web page “Understanding Genetic Distance” for an explanation of how they do distance computations. For now, let it suffice that small values of genetic distance, zero through five, indicate a “close” genetic relationship, and the smaller the distance, the closer the relationship. A close relationship means that there is a high probability that a common ancestor exists within a certain number of generations. Whether you actually find and document a common ancestor depends on the diligence and quality of your conventional genealogical research.

The matrix is an experiment on my part. I need feedback on whether the idea is useful to you, and how it could be made more useful. I would prefer to someday create a “cladogram” from our results, but have so far not been successful. (The Fitzpatrick DNA Study has used this genetic mapping technique to create a very informative ancestry diagram. See Example.) Anyone with insight on how we could create a cladogram using our DNA results, please contact me at steve.mullinax@comcast.net.

To create the genetic distance matrix, I copied DNA test results from our project site’s results matrix dated April 15, 2010, the latest compilation available at this writing. I removed all participants with fewer than 37 markers tested. This enables an apples-to-apples comparison of high-resolution results. (It should also be possible to create a matrix of 25-marker-plus results.) I created a macro in Excel that compares each possible pair of participants and determines the genetic distance between them. Genetic distance is defined as the sum, across the 37 markers, of the differences between the two participants’ STR numbers at each marker.
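
As a rough illustration of the pairwise comparison described above, here is a minimal Python sketch. It is not the Excel macro itself, which is not reproduced in this article. It assumes that the difference at each marker is taken as an absolute value before summing. The kit labels and marker values are made up, the function names are hypothetical, and only five markers are used per kit rather than 37 to keep the example short.

```python
from itertools import combinations


def genetic_distance(a, b):
    """Sum of absolute per-marker differences between two STR result lists."""
    if len(a) != len(b):
        raise ValueError("both participants must have the same number of markers")
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_matrix(results):
    """Return {(kit_a, kit_b): distance} for every unordered pair of kits."""
    return {
        (kit_a, kit_b): genetic_distance(results[kit_a], results[kit_b])
        for kit_a, kit_b in combinations(results, 2)
    }


# Hypothetical kits with made-up values for 5 markers (the article uses 37).
sample = {
    "A001": [13, 24, 14, 11, 11],
    "A002": [13, 24, 14, 11, 11],
    "A003": [13, 25, 14, 10, 11],
}

matrix = distance_matrix(sample)
for pair, distance in matrix.items():
    print(pair, distance)
```

Family Tree DNA's own computation may treat some markers differently, so a sketch like this should be read as an approximation of the idea rather than a replacement for their figures.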

Description of the genetic distance matrix: First, be aware that the participants are in the same order as on the web site’s results matrix. They are grouped by haplogroup. (Y-chromosome haplogroups are the major divisions of male lines of descent, with ties to human migration and settlement patterns originating several thousand years ago. The haplogroups identified among our project participants so far include E, I and R1b. For more information, see the article “Human Y-chromosome DNA haplogroup.”) The kit numbers for participants are in the leftmost column. They are repeated, in the same order, across the top row. The second column gives the name and residence of the participant’s earliest documented ancestor. The names are repeated in the second row. Note that the row for the first participant (Kit 152327) is intentionally dropped, as is the column for the last participant (Kit 102563). This removes a row and a column which would show nothing other than each of these participants’ relationship with himself. All relationships for 102563 are shown in the last row of the matrix, and all those for 152327 are shown in the first column.

There is a stair-step pattern descending from the upper left to the lower right corner of the matrix. Cells on the lower left of that stair-step indicate a close genetic distance if they contain a number zero through five. If they contain no number, they indicate a more distant genetic relationship, and thus are not genealogically significant. Cells to the upper right of the stair-step are not significant.

The cell at the intersection of one participant’s row and another participant’s column gives the genetic distance between that pair of participants. (For example, the intersection of the row for kit #63130 and the column for kit #152327 shows the number 2, indicating a distance of 2 between these two participants.)
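
The layout described above (first participant's row dropped, last participant's column dropped, numbers shown only below the stair-step and only for distances of zero through five) can be sketched in Python as follows. This is an illustration under stated assumptions, not the spreadsheet used for the figure: the kit labels `K1` to `K4`, the distances, and the function name `stair_step_rows` are all hypothetical.

```python
def stair_step_rows(kits, distances, max_close=5):
    """Build rows for kits[1:] against columns kits[:-1].

    A cell holds the distance as text only when it lies below the
    stair-step and the distance is max_close or less; otherwise it is blank.
    """
    rows = []
    for i in range(1, len(kits)):
        cells = []
        for j in range(len(kits) - 1):
            if j < i:
                d = distances[frozenset((kits[i], kits[j]))]
                cells.append(str(d) if d <= max_close else "")
            else:
                cells.append("")  # on or above the diagonal: left blank here
        rows.append((kits[i], cells))
    return rows


# Hypothetical kits and made-up pairwise distances.
kits = ["K1", "K2", "K3", "K4"]
distances = {
    frozenset(("K1", "K2")): 2,
    frozenset(("K1", "K3")): 0,
    frozenset(("K2", "K3")): 2,
    frozenset(("K1", "K4")): 19,
    frozenset(("K2", "K4")): 21,
    frozenset(("K3", "K4")): 4,
}

rows = stair_step_rows(kits, distances)

# Print a header of column kits, then one line per row kit.
print("      " + " ".join(f"{k:>3}" for k in kits[:-1]))
for kit, cells in rows:
    print(f"{kit:<6}" + " ".join(f"{c:>3}" for c in cells))
```

To look up a pair, find one kit in the left column and the other in the header; in this made-up data the row for `K3` and the column for `K1` hold 0.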

This matrix pinpoints the close relationships among participants for the surname project as a whole. It also allows us to focus on close genetic relationships to see if they are supported by traditional documentation, i.e., the paper trail shared with IMFA.

Let’s look at a few of the zero-distance pairs in the table below.

Kit# | Pedigree | Kit# | Pedigree | Common ancestor identified
46203 | Adam Molyneux (UK014) | 45732 | Greenbury W. Mullinix | None
58548 | Jonothan Mullinix Sr (US285s) | 45732 | Greenbury W. Mullinix | Wayne Straight (US332) has preliminary research strongly suggesting a connection. (See below.)
58548 | Jonothan Mullinix Sr (US285s) | 46203 | Adam Molyneux (UK014) | None

After seeing a draft of this article, Wayne Straight researched whether kits 58548 and 45732 might be linked. Kit 58548’s pedigree traces back through Greenberry Mullinix, b. 1771, to Jonothan Mullinix Sr, b. 1705 (England). Wayne wrote, “It appears that Jim Mx (US301) is one of what Marilyn Blanck calls the ‘Greenberry Mx's’, about whom there's a lot of documentation. So it shouldn't be an insurmountable task to do a genealogical comparison. This family is descended from Jonathan Mx1 of Elkridge, MD, the same family as Don Mulinix (US236).” Wayne and Don collaborated on an article documenting this family’s history and genealogy, which is posted on the IMFA Wiki web site. Wayne also found that “according to several family trees on Ancestry … this Greenberry [kit 45732’s ancestor] is the grandson of the first Greenberry Mx, via Greenberry Mx1's son Elisha Mx.” Thus, from Wayne’s research, there is strong preliminary evidence that kits 58548 and 45732 are linked. Additional research might solidify this link.

It would also be interesting to research whether these two families might be connected to the descendants of Adam Molyneux, b ??; d 1726, St. Ebbs, Oxford (kit 46203), with whom they share a zero genetic distance relationship.

These observations suggest possible actions for research:

  • Research to extend the pedigrees.
  • Addition of descendant trees to pedigrees.

You can also see that the matrix contains sixteen pairs with a genetic distance of 1. I could create a table for those pairs, similar to the one above, which would suggest more avenues for research.
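
A table of pairs at a given distance could be pulled from the computed distances with something like the Python sketch below. It is only an illustration: the kit labels and distances are made up, and `pairs_by_distance` is a hypothetical helper name, not part of any tool used for this article.

```python
from collections import defaultdict


def pairs_by_distance(distances, max_distance=5):
    """Group kit pairs by genetic distance, keeping only close pairs."""
    groups = defaultdict(list)
    for pair, d in distances.items():
        if d <= max_distance:
            groups[d].append(tuple(sorted(pair)))
    # Sort both the distances and the pairs so the listing is stable.
    return {d: sorted(pairs) for d, pairs in sorted(groups.items())}


# Hypothetical kits and made-up distances.
distances = {
    ("K1", "K2"): 0,
    ("K1", "K3"): 1,
    ("K2", "K3"): 1,
    ("K1", "K4"): 12,
}

close = pairs_by_distance(distances)
for d, pairs in close.items():
    print(f"distance {d}: {len(pairs)} pair(s) {pairs}")
```

Each listed pair would then be a candidate for comparing pedigrees, as in the zero-distance table above.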

When we identify pairs which have close genetic distances and for which documentation already exists, the genetic results add weight to the documentary proof. For example, kits #48424 and N33141 have a common documented ancestor in “Levi Mullinax, b c 1775, TN; d c Sep 1818, Wilson Co., TN”. The participants’ genetic distance of 3 confirms a close connection, buttressing the documentation. Note that these close genetic distances do not prove that the documented common ancestor is correct. Only rigorous analysis of the whole of the genealogical evidence can do that.

Three triangular groupings are apparent in the matrix, consistent with our ordering by haplogroup: E, I and R1b. Note that there are no close relationships crossing between haplogroups, which is as we would expect, and which gives some validation to the DNA results and to the matrix.
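
The cross-haplogroup check mentioned above can also be done mechanically. The Python sketch below is one way to do it, assuming a lookup of haplogroup by kit and the same zero-through-five threshold for "close". The kit labels, haplogroup assignments, distances, and the function name are all made up for illustration.

```python
def cross_haplogroup_close_pairs(distances, haplogroup, max_distance=5):
    """Return close pairs whose two kits belong to different haplogroups."""
    return sorted(
        pair
        for pair, d in distances.items()
        if d <= max_distance and haplogroup[pair[0]] != haplogroup[pair[1]]
    )


# Hypothetical kits, haplogroup assignments, and distances.
haplogroup = {"K1": "E", "K2": "E", "K3": "I", "K4": "R1b", "K5": "R1b"}
distances = {
    ("K1", "K2"): 1,
    ("K1", "K3"): 30,
    ("K2", "K3"): 29,
    ("K3", "K4"): 41,
    ("K4", "K5"): 3,
}

violations = cross_haplogroup_close_pairs(distances, haplogroup)
print("close pairs crossing haplogroups:", violations)
```

An empty list is the expected outcome; any pair that did appear would be worth checking for a data-entry or haplogroup-assignment error.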

Again, I would appreciate any comments on the usefulness of this approach and how it could be improved. Steve Mullinax, 503.768.9065.

Thanks to Wayne Straight and Marie Spearman for their comments and contributions to this article.

Latest Genetic Distance Charts

April 2011 - Haplogroups E & I
April 2011 - Haplogroup R
September 2010
August 2010