Over the last decade there has been significant growth in the commercialization of genealogical mapping. Dozens of new businesses now enable consumers to trace their family history online by searching official documents, such as immigration and military records, birth and death certificates, and census data. These companies harness the power of virtual social networks and crowdsourcing to connect individuals with close and distant relatives, who collaborate on building interconnected family trees with embedded historical data. Myheritage.com, an Israeli start-up and one of the leading genealogical websites, has over 75 million registered members, with 1.5 billion people included in over 27 million family trees hosted on its site.5
Users build their family trees with information known about their relatives and ancestors, and the website then automatically finds matching historical records, providing further information and embedding evidence into the family tree. The website also finds matches between family trees, allowing individuals to extend their own tree effortlessly by merging it with existing trees. The more relatives an individual enters into his or her family tree, the higher the probability that he or she will match with existing trees. Users should not be surprised to start a family tree on a genealogical website and find that someone has already included them or some of their relatives in an existing tree.6
Although several websites and applications create family trees, there is a standard file type, known as GEDCOM (Genealogical Data Communication), for saving the genealogical data that comprises a family tree.7 This standardization enables the sharing and distribution of digital family trees without compatibility problems. In practice, this has contributed to the spread of digital family trees across various platforms, websites, and companies.
As more and more individuals input their family trees on genealogical and ancestry websites, it becomes increasingly possible to weave all the trees together into a mega tree, with each family tree serving as a building block or cornerstone of the larger tree.
The ability to build a family tree, however, has been limited by the amount of knowledge a family possesses about itself. Families that have remained in one geographic location for several generations, and have maintained the same culture and spoken the same language, are far more likely to have a deeper knowledge of their family history than those that have migrated across continents, as was typical of Jewish families during the past century.
Even with the sharing of family trees described above, without knowing the names of specific family members or whether, for example, a great grandfather had any siblings, it is almost impossible to find evidence of their existence. This is especially the case when researching ancestors who were first generation immigrants. The changing of family names to fit into a new country and the disconnection from siblings and cousins in the home country (before the advent of modern communications) were historically commonplace for first generation immigrants, and they pose a challenge for biological descendants researching their family as it existed in the old world.
However, over the last decade, advances in genetic research and computing technology have been closing the gap in what individuals know about their family history and relatives. In 1990 the international scientific community, with funding from the U.S. Department of Energy’s Office of Health and Environmental Research, began a 15-year project to map the human genome. Every year since, the power and speed with which computers can sequence DNA have increased while the cost has decreased.8 In 2002 the cost of sequencing one million bases of DNA (the human genome contains 3 billion base pairs) was around $5000; today the cost is around $0.06.9 This dramatic decrease in cost has made DNA sequencing accessible beyond mega-funded laboratories.
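To put the per-million-base figures in perspective, they can be scaled to the size of a whole genome. The short sketch below does only that multiplication, using the two prices and the 3 billion base pair figure quoted above. The function name is illustrative, and the result is a rough lower bound rather than an actual price, since it assumes every base is read exactly once.

```python
# Rough scaling of the quoted per-million-base prices to a whole genome.
# Assumption: each of the 3 billion base pairs is sequenced exactly once.
GENOME_SIZE_BASES = 3_000_000_000


def naive_genome_cost(cost_per_million_bases: float) -> float:
    """Return the cost of reading GENOME_SIZE_BASES bases once, in dollars."""
    return cost_per_million_bases * GENOME_SIZE_BASES / 1_000_000


cost_2002 = naive_genome_cost(5000)   # price quoted for 2002
cost_today = naive_genome_cost(0.06)  # price quoted for "today"

print(f"2002:  about ${cost_2002:,.0f}")
print(f"today: about ${cost_today:,.0f}")
print(f"reduction factor: about {cost_2002 / cost_today:,.0f}x")
```

On these figures, a single pass over the genome falls from roughly fifteen million dollars to under two hundred.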
Although the main impetus and funding justification for DNA research was medical, new areas of scientific research are being explored using DNA sequencing. One area of research that has benefited tremendously from these developments is molecular anthropology, which uses DNA analysis to study evolution, human migration patterns, and genealogical relationships between human populations. By collecting and analyzing the portion of DNA that is inherited from only one parent, scientists have been able to classify maternal and paternal lineages into haplogroups. Haplogroups are often described as ancient clans or tribes that may have lived within close geographical proximity at one point in history. It is more accurate, however, to think of haplogroups as groups of people who share one common direct maternal or paternal ancestor who lived sometime in the last tens of thousands of years.
The many academic and medical studies that used DNA analysis for health, demographic, and population research quickly generated commercial applications. As DNA testing became less expensive over the last decade, several new companies were established that offer direct-to-consumer DNA analysis for genealogical and medical purposes.10 Today, for less than $100, a consumer can have their DNA analyzed in order to “discover your lineage, find relatives and more.”11
Consumers are able to find near and distant relatives through DNA matching, expanding their family tree and ancestral knowledge in ways unimaginable only 25 years ago. As a result of these products, individuals and families are learning more about their ancestors and origins. Genealogical DNA analysis is especially useful for individuals who have only a limited knowledge of their family history or are interested in their deep historical roots.
It is important to understand that a DNA test does not analyze a genome (the totality of DNA found in one cell) or genes (the functional sections of DNA) and then determine whether that DNA is Jewish or not. There is no specific gene or genetic marker that is proof positive that one is Jewish. Genealogical DNA tests compare the DNA of an individual with an existing database in order to find matching or very similar DNA sequences and then determine genealogical relationships.
A DNA test can be explained through an analogy with the game of telephone. In the game of telephone, an individual starts with a phrase and privately passes it on to one person, who in turn passes it on to another. Once the phrase has passed through everyone in the game, the starting phrase and the resultant phrase are compared for differences. Imagine a giant game of telephone with 63 people in which each person transmits the phrase to two people instead of only one. At the end of the game, instead of one resulting phrase, there would be 32 phrases, and each phrase would have been passed 5 times. The resulting phrases would presumably share similarities, and perhaps some would be identical. The closer the phrases are on the chain, the more similarities they would share.
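The numbers in the branching game follow from simple doubling, which the short sketch below checks. The function name and its parameters are illustrative only. It assumes the game starts with a single player and that every player passes the phrase to exactly two new players.

```python
def telephone_tree(generations: int, fanout: int = 2) -> dict:
    """Count players and final phrases in a branching game of telephone.

    Assumes one starting player and that every player passes the phrase
    to `fanout` new players, for `generations` rounds of passing.
    """
    # Level 0 is the starter; each later level holds fanout times more players.
    players_per_level = [fanout ** level for level in range(generations + 1)]
    return {
        "total_players": sum(players_per_level),   # everyone in the game
        "final_phrases": players_per_level[-1],    # players who end the chain
        "times_passed": generations,               # hops from starter to the end
    }


game = telephone_tree(generations=5)
print(game)
```

With five rounds of passing, the levels hold 1, 2, 4, 8, 16, and 32 players, which gives the 63 players and 32 final phrases described above.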
A DNA sequence is like the message transmitted in the game of telephone, except that a DNA sequence is fantastically more complex. The human genome contains more than 3 billion base pairs (those horizontal bars bridging the double strands of a DNA molecule). The human genome is 99.9% similar among all humans; the 0.1% that differs can be thought of as a genetic code or fingerprint. An individual’s genetic code is composed of a combination of half of each parent’s genetic code. For various reasons, DNA gets slightly mutated when it is transmitted, and those mutations get hardwired into the DNA. The next time the DNA is transmitted from one generation to another, the mutation might remain intact and continue to be passed down. As long as that portion of the DNA sequence does not get altered or mutated again, it serves as a unique genealogical stamp, or “genetic marker,” that one individual passes on to their descendants.
As in the game of telephone, as the DNA sequence is passed from generation to generation, it gets slightly changed and altered. The more times it is transmitted, the more it varies from its original form. Conversely, the more similarities two DNA sequences share, the closer they are likely to be along the chain of transmission.
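The telephone analogy can be illustrated with a toy simulation. The sketch below is not how a commercial test works. It assumes a short random sequence, a single line of descent with no mixing of two parents, and a deliberately exaggerated mutation rate so that the effect is visible. All names and values in it are hypothetical.

```python
import random

BASES = "ACGT"


def transmit(sequence: str, mutation_rate: float, rng: random.Random) -> str:
    """Copy a sequence once; each base is swapped for a different base
    with probability mutation_rate."""
    copied = []
    for base in sequence:
        if rng.random() < mutation_rate:
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def descend(sequence: str, generations: int, mutation_rate: float,
            rng: random.Random) -> str:
    """Pass a sequence down the given number of generations."""
    for _ in range(generations):
        sequence = transmit(sequence, mutation_rate, rng)
    return sequence


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)  # fixed seed so the run is repeatable
original = "".join(rng.choice(BASES) for _ in range(10_000))

child = descend(original, 1, 0.01, rng)     # one transmission
distant = descend(original, 10, 0.01, rng)  # ten transmissions

print("after 1 generation:  ", identity(original, child))
print("after 10 generations:", identity(original, distant))
```

The sequence that has been transmitted ten times agrees with the original at fewer positions than the one transmitted once, which is the pattern a matching algorithm reads in reverse to estimate how far apart two samples are.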
It should be noted, however, that the American Society of Human Genetics warns consumers that everyone has thousands of ancestors, that segments of DNA get transferred in a “non-deterministic” manner, and that only a “fraction” of one’s ancestors can be traced through DNA testing.12 That is to say, the length and portion of DNA that are transferred from parent to child are seemingly random. Not all our ancestors contribute equally to the make-up of our DNA; inevitably traces of some ancestors will be more dominant than others, and some ancestors may not be traceable at all. Without taking DNA samples directly from all of one’s ancestors, it is impossible to truly map one’s genealogy. The Society states that “the genomic segments contributed by a particular ancestor are far from all being uniquely identifiable, so even if one’s genome has those specific genome contributions, identification of particular ancestry is always uncertain and statistical.”13
Through complex statistical analysis, computers are able to predict how closely two samples of DNA are related. Without comparing a DNA sample to an existing dataset of DNA there is little information that can be generated for genealogical exploration. The larger the dataset of DNA samples, and the more biographical details known about the individuals whose DNA is in the database, such as where their ancestors lived, the more the test can reveal. A DNA test will only tell a test-taker that he/she shares some DNA with another person or a group of people alive today and previously sampled. Therefore, individuals seeking genealogical information about themselves through DNA testing will only be able to see how their DNA compares with others.14
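The dependence on a reference database can be made concrete with a toy example. The sketch below assumes each sample is reduced to a short string of marker values and ranks database entries by the fraction of markers they share with the test-taker. The `Sample` class, the sample names, regions, and marker strings are all hypothetical, and real services rely on far more elaborate statistical methods than this simple comparison.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    name: str              # hypothetical identifier of a previously tested person
    ancestral_region: str  # biographical detail stored alongside the DNA
    markers: str           # toy stand-in for genotyped marker values


def shared_fraction(a: str, b: str) -> float:
    """Fraction of marker positions at which two samples agree."""
    if len(a) != len(b):
        raise ValueError("marker strings must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def rank_matches(query: str, database: List[Sample],
                 top: int = 3) -> List[Tuple[float, Sample]]:
    """Return the closest database entries, best match first."""
    scored = [(shared_fraction(query, s.markers), s) for s in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top]


database = [
    Sample("sample_A", "region_1", "ACGTACGTAC"),
    Sample("sample_B", "region_2", "ACGTTCGAAA"),
    Sample("sample_C", "region_3", "TTGTACCTGC"),
]
query = "ACGTACGTAA"  # the test-taker's markers

for score, sample in rank_matches(query, database):
    print(f"{sample.name} ({sample.ancestral_region}): {score:.0%} shared")
```

The test-taker learns nothing from an empty database, and the regions attached to the closest matches are only as informative as the biographical details that were recorded with them.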
For this reason, companies may include in their DNA database not only their own customers’ DNA but also DNA samples taken during scientific studies. For example, Family Tree DNA includes the dataset collected by Doron Behar et al. for the article “Genome Wide Structure of the Jewish People,” published in Nature in 2010.15 The dataset includes DNA samples from 14 Jewish Diaspora communities and 69 non-Jewish “old world” population areas.16
As DNA tests inevitably become cheaper and more DNA samples are added to large databases and analyzed for demographic and genealogical purposes, the overall picture of how people around the world are related will become clearer.