Discussion of Newly-Created Matrix of Genetic Distance
By US 312 Steve Mullinax, August 2010
In this article, I introduce a tool for highlighting significant patterns of relationships
among participants in the Molyneux Surname DNA Project, which you can see on the
new IMFA web site. The figure at right is a matrix of genetic distance between
pairs of project participants. The purpose of the matrix is to make available to
the whole community of Mx family researchers an at-a-glance summary of the DNA test
results. It highlights relationships where traditional genealogical research for
a common ancestor might pay off. These will typically be pairs of participants with
a small genetic distance. On the other hand, pairs of participants with a large genetic
distance between them are unlikely to share a common ancestor within a genealogical time frame. See the Family Tree
DNA web page “Understanding Genetic Distance” for an explanation of
how they do distance computations. For now, let it suffice that small values of
genetic distance, zero through five, indicate a “close” genetic relationship,
and the smaller the distance, the closer the relationship. A close relationship
means that there is a high probability that a common ancestor exists within a certain
number of generations. Whether you actually find and document a common ancestor
depends on the diligence and quality of your conventional genealogical research.
The matrix is an experiment on my part. I need feedback on whether the idea is useful
to you, and how it could be made more useful. I would prefer to someday create a
“cladogram” from our results, but have so far not been successful. (The
Fitzpatrick DNA Study has used this genetic mapping technique to create a very informative
ancestry diagram. See Example.) If you have insight into how we could create
a cladogram using our DNA results, please contact me at email@example.com.
To create the genetic distance matrix, I copied DNA test results from our
project site’s results matrix dated April 15, 2010, the latest compilation available
at this writing. I removed all participants with results for fewer than 37 markers. This
enables an apples-to-apples comparison of high-resolution results. (It should also
be possible to create a matrix of 25-marker-plus results.) I created a macro in
Excel which compares each possible pair of participants and determines the genetic
distance between them. Genetic distance is defined here as the sum, over the 37 markers,
of the (unsigned) differences between the two participants' STR values.
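The computation described above can be sketched in a few lines of Python. This is an illustration only, not the Excel macro used for the matrix. It assumes that "difference" means the absolute difference at each marker, and the kit names and marker values are hypothetical (five markers rather than 37, for brevity). Family Tree DNA's own calculation may treat some markers differently, so their figures should be taken as authoritative.

```python
from itertools import combinations

def genetic_distance(a, b):
    """Sum of absolute differences across paired marker values."""
    if len(a) != len(b):
        raise ValueError("both participants need the same number of markers")
    return sum(abs(x - y) for x, y in zip(a, b))

def distance_table(results, markers=37):
    """Return {(kit_a, kit_b): distance} for every pair with enough markers."""
    # Drop participants with fewer than `markers` values, so that every
    # comparison uses the same set of markers.
    kept = {kit: vals[:markers] for kit, vals in results.items()
            if len(vals) >= markers}
    return {(a, b): genetic_distance(kept[a], kept[b])
            for a, b in combinations(kept, 2)}

# Hypothetical kits with made-up marker values.
sample = {
    "KIT-A": [13, 24, 14, 11, 11],
    "KIT-B": [13, 24, 15, 11, 12],
    "KIT-C": [13, 22, 14, 10, 11],
    "KIT-D": [13, 24, 14],  # too few markers, so it is excluded
}
table = distance_table(sample, markers=5)
for pair, distance in table.items():
    print(pair, distance)
```

Each pair is computed once, since the distance from A to B is the same as the distance from B to A.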
Description of the genetic distance matrix: First, be aware that the participants
are in the same order as on the web site’s results matrix. They are grouped by haplogroup.
(Y-chromosome haplogroups are the major divisions of male lines of descent, with
ties to human migration and settlement patterns originating several thousand years
ago. The haplogroups identified among our project participants so far include E,
I and R1b. For more information, see the article “Human Y-chromosome DNA haplogroup.”) The kit
numbers for participants are in the left-most column. They are repeated, in the
same order, across the top row. The second column gives the name and residence of
the participant’s earliest documented ancestor. The names are repeated in the second
row. Note that the line for the first participant (Kit 152327) is intentionally
dropped, as is the column for the last participant (Kit 102563). This removes a
row and a column which would show nothing other than each of these participants’
relationship with himself. All relationships for 102563 are shown in the last row
of the matrix, and all those for 152327 are shown in the first column.
There is a stair-step pattern descending from the upper left to the lower right
corner of the matrix. Cells on the lower left of that stair-step indicate a close
genetic distance if they contain a number zero through five. If they contain no
number, they indicate a more distant genetic relationship, and thus are not genealogically
significant. Cells to the upper right of the stair-step are not significant.
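For readers who want to reproduce the layout, the stair-step arrangement can be sketched as follows. This is a minimal illustration under stated assumptions, not the spreadsheet itself: the kit names and distances are hypothetical, the function name `stair_step_rows` is made up for this example, and the cutoff of five is taken from the description above. The first kit gets no row and the last kit gets no column, as in the matrix.

```python
def stair_step_rows(kits, distances, threshold=5):
    """Build the lower-left triangle of the matrix as rows of text cells.

    kits      -- kit identifiers in display order
    distances -- {(kit_a, kit_b): distance}, each pair stored once
    """
    def lookup(a, b):
        # Pairs are stored once, so try both orders.
        return distances[(a, b)] if (a, b) in distances else distances[(b, a)]

    columns = kits[:-1]  # the last kit gets no column
    rows = []
    for i, row_kit in enumerate(kits[1:], start=1):  # the first kit gets no row
        cells = []
        for j, col_kit in enumerate(columns):
            if j < i:
                d = lookup(row_kit, col_kit)
                # Show only close distances; larger ones are left blank.
                cells.append(str(d) if d <= threshold else "")
            else:
                cells.append("")  # upper right of the stair-step: unused
        rows.append((row_kit, cells))
    return columns, rows

# Hypothetical kits and distances.
kits = ["KIT-A", "KIT-B", "KIT-C", "KIT-D"]
distances = {
    ("KIT-A", "KIT-B"): 2, ("KIT-A", "KIT-C"): 9, ("KIT-A", "KIT-D"): 0,
    ("KIT-B", "KIT-C"): 4, ("KIT-B", "KIT-D"): 12, ("KIT-C", "KIT-D"): 1,
}
columns, rows = stair_step_rows(kits, distances)
print(" " * 6 + " ".join(f"{c:>5}" for c in columns))
for kit, cells in rows:
    print(f"{kit:<6}" + " ".join(f"{c:>5}" for c in cells))
```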
The cell at the intersection of one participant's row (kit #) and another
participant's column shows the genetic distance between that pair
of participants. (E.g., the intersection of the row for kit #63130 and the column
for kit #152327 shows the number 2, indicating a distance of 2 between these two participants.)
This matrix pinpoints the close relationships among participants for the surname
project as a whole. It also allows us to focus on close genetic relationships to
see if they are supported by traditional documentation, i.e., a shared paper trail.
Let’s look at a few of the zero-distance pairs in the table below.
Earliest ancestor of first participant | Earliest ancestor of second participant | Common Ancestor identified
Adam Molyneux (UK014) | Greenbury W. Mullinix |
Jonothan Mullinix Sr (US285s) | Greenbury W. Mullinix | Wayne Straight (US332) has preliminary research strongly suggesting a connection.
Jonothan Mullinix Sr (US285s) | Adam Molyneux (UK014) |
After seeing a draft of this article, Wayne Straight researched whether kits 58548
and 45732 might be linked. Kit 58548’s pedigree traces back through Greenberry Mullinix
b. 1771, to Jonothan Mullinix Sr, b. 1705 (England). Wayne wrote, “It appears
that Jim Mx (US301) is one of what Marilyn Blanck calls the ‘Greenberry Mx's’, about
whom there's a lot of documentation. So it shouldn't be an insurmountable task to
do a genealogical comparison. This family is descended from Jonathan Mx1 of Elkridge,
MD, the same family as Don Mulinix (US236).” Wayne and Don collaborated on
an article documenting this family’s history and genealogy, which is posted on the
IMFA Wiki web site. Wayne also found that “according to several family
trees on Ancestry … this Greenberry [kit 45732’s ancestor] is the grandson
of the first Greenberry Mx, via Greenberry Mx1's son Elisha Mx.” Thus, from
Wayne’s research, there is strong preliminary evidence that kits 58548 and 45732
are linked. Additional research might solidify this link.
It would also be interesting to research whether these two families might be connected
to the descendants of Adam Molyneux, b ??; d 1726, St. Ebbs, Oxford (kit 46203),
with whom they share a zero genetic distance relationship.
These observations suggest possible actions for research:
- Research to extend the pedigrees.
- Addition of descendant trees to pedigrees.
You can also see that the matrix contains sixteen pairs with a genetic distance
of 1. I could create a table similar to the one above, which would suggest more avenues for research.
When we identify pairs which have close genetic distances and for which documentation
already exists, the genetic results add weight to the documentary proof. For example,
kits #48424 and N33141 have a common documented ancestor in “Levi Mullinax,
b c 1775, TN; d c Sep 1818, Wilson Co., TN”. The participants’ genetic
distance of 3 confirms a close connection, buttressing the documentation. Note that
these close genetic distances do not prove that the documented common ancestor is
correct. Only rigorous analysis of the whole of the genealogical evidence can do that.
Three triangular groupings are apparent in the matrix, consistent with our ordering
by haplogroup: E, I and R1b. Note that there are no close relationships crossing between
haplogroups, which is as we would expect, and gives some validation to the DNA results
and to the matrix.
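The same sanity check can be automated. The sketch below lists any close pair whose two kits fall in different haplogroups; an empty result matches what we would expect. The kit names, haplogroup assignments and distances are hypothetical, and the function name is invented for this example.

```python
def close_pairs_across_haplogroups(distances, haplogroup, threshold=5):
    """Return (kit_a, kit_b, distance) for close pairs in different haplogroups."""
    return [
        (a, b, d)
        for (a, b), d in distances.items()
        if d <= threshold and haplogroup[a] != haplogroup[b]
    ]

# Hypothetical haplogroup assignments and pairwise distances.
haplogroup = {"KIT-A": "E", "KIT-B": "E", "KIT-C": "R1b", "KIT-D": "R1b"}
distances = {
    ("KIT-A", "KIT-B"): 1, ("KIT-A", "KIT-C"): 30, ("KIT-A", "KIT-D"): 28,
    ("KIT-B", "KIT-C"): 31, ("KIT-B", "KIT-D"): 29, ("KIT-C", "KIT-D"): 0,
}
suspect = close_pairs_across_haplogroups(distances, haplogroup)
print(suspect)
```

Any pair that did appear in the result would be worth a second look, since it could point to a transcription error in the marker values or in the haplogroup assignment.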
Again, I would appreciate any comments on the usefulness of this approach and how
it could be improved. Steve Mullinax,
Thanks to Wayne Straight and Marie Spearman for their comments and contributions
to this article.