Dr. Miguel Gallach, 21 September 2022
Inbreeding is one of the most important concepts in genetics. It is so important that even people unfamiliar with the subject have a reasonable idea of what inbreeding is about. An intuitive definition of inbreeding would be ‘something connected to the idea of mating closely related organisms’. More precisely, inbreeding refers to the mating of parents who share one or more ancestors or, seen from the offspring's side, to descending from related parents.
Geneticists use an objective and measurable definition when talking about inbreeding: the inbreeding coefficient. The inbreeding coefficient, symbolized as F, is the probability that the two alleles an individual carries at a locus are identical by descent (i.e., they are copies of a single allele in an ancestor). Individuals that are homozygous for alleles identical by descent are called identical homozygotes or autozygous. For instance, for a polymorphic nucleotide site segregating A and G alleles in a population (i.e., a genetic marker), AA and GG individuals would be identified as homozygous and AG as heterozygous at this particular site. Only those homozygotes whose two copies trace back to the same ancestral allele are autozygous.
The F coefficient can be calculated for an individual, a pair of individuals or a population. For instance, the F coefficient of the offspring of first-cousin parents is 0.0625. This means that, on average, 6.25% of the genetic material in offspring of first cousins will be identical by descent. However, there is a high degree of stochastic variance about this average, because meiosis is a random process and the proportion of DNA inherited from grandparents (or more distant ancestors) varies among grandchildren. The standard deviation of this F coefficient is 0.0243. In practice, this means that about 95% of the offspring of first cousins will have a realized (actual) inbreeding coefficient between 0.0139 and 0.1111 (F ± 2 × SD). It is therefore perfectly possible for the offspring of second cousins (expected F = 0.015625) to be more autozygous (e.g., realized F = 0.03125) than some offspring of first cousins.
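The expected values quoted above can be reproduced with a short sketch. It assumes Wright's path-counting formula, in which each common ancestor contributes (1/2)^(n1 + n2 + 1) × (1 + F_A), with n1 and n2 the number of generations from each parent to that ancestor. The function name `path_inbreeding` is hypothetical, and the standard deviation is simply the value given in the text, not something the code derives.

```python
def path_inbreeding(paths, ancestor_f=0.0):
    """Expected pedigree inbreeding coefficient from Wright's path formula.

    paths: one (n1, n2) tuple per common ancestor, giving the number of
           generations from each parent back to that ancestor.
    ancestor_f: inbreeding coefficient of the common ancestors
                (assumed equal for all of them; 0.0 means non-inbred).
    """
    return sum(0.5 ** (n1 + n2 + 1) * (1.0 + ancestor_f) for n1, n2 in paths)


# First cousins share two grandparents, each two generations above the parents.
f_first_cousins = path_inbreeding([(2, 2), (2, 2)])

# Second cousins share two great-grandparents, three generations above.
f_second_cousins = path_inbreeding([(3, 3), (3, 3)])

# Standard deviation of realized F for offspring of first cousins,
# taken directly from the text (not computed here).
sd_first_cousins = 0.0243

# Approximate 95% range of realized F: expected F plus or minus 2 SD.
low = f_first_cousins - 2 * sd_first_cousins
high = f_first_cousins + 2 * sd_first_cousins

print(f"Expected F, offspring of first cousins:  {f_first_cousins}")
print(f"Expected F, offspring of second cousins: {f_second_cousins}")
print(f"Approximate 95% range of realized F:     {low:.4f} to {high:.4f}")
```

Because the expected F for second cousins' offspring falls inside the range computed for first cousins' offspring, the two distributions of realized autozygosity overlap.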
The stochastic variance increases with each meiosis and therefore the difference between expected and realized autozygosity increases with each new generation. This phenomenon is inevitable and may have a significant impact on breeding programs, the management of endangered or local livestock populations, conservation biology and human genetic studies. This is because the estimation of breeding values, effective population size, population relationships and demographic history, as well as the identification of recessive disease variants, depends on an accurate estimate of the inbreeding coefficient.
Runs of Homozygosity (ROH) are long DNA stretches of homozygous markers. Since the first population genomic studies characterizing ROH, and thanks to inexpensive genotyping with SNP chips, ROH have been applied in studies of demography, inbreeding depression, conservation biology, human health and evolution. Quite possibly, their most significant application in breeding and conservation is computing realized autozygosity. Since ROH are (most likely) a consequence of autozygosity, we can estimate a ROH-derived inbreeding coefficient as $F_{ROH} = L_{ROH} / L_{AUTO}$, where $L_{ROH}$ is the total length of the identified ROH and $L_{AUTO}$ is the length of the autosomal genome (normally, the portion of the genome covered by the SNPs on the chip).
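As a rough illustration of the ratio described above (total ROH length divided by the length of the genome covered by the markers), the sketch below scans one toy chromosome for runs of consecutive homozygous SNPs. The genotype coding (0 and 2 homozygous, 1 heterozygous), the function names and the thresholds `min_snps` and `min_length_bp` are all assumptions made for this example. Dedicated ROH callers apply more elaborate criteria, such as tolerating occasional heterozygous or missing calls.

```python
def find_roh(positions, genotypes, min_snps=3, min_length_bp=1_000_000):
    """Return (start_bp, end_bp) for each run of consecutive homozygous SNPs.

    positions: SNP coordinates in base pairs, sorted ascending.
    genotypes: 0 or 2 for homozygous calls, 1 for heterozygous calls.
    min_snps, min_length_bp: illustrative thresholds, not recommended values.
    """
    runs = []      # (first_index, last_index) of every homozygous run
    start = None   # index where the current run began, if any
    for i, genotype in enumerate(genotypes):
        if genotype != 1:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(genotypes) - 1))

    segments = []
    for first, last in runs:
        length = positions[last] - positions[first]
        if last - first + 1 >= min_snps and length >= min_length_bp:
            segments.append((positions[first], positions[last]))
    return segments


def f_roh(segments, covered_length_bp):
    """ROH-derived inbreeding coefficient: total ROH length / covered length."""
    return sum(end - start for start, end in segments) / covered_length_bp


# Toy data: ten SNPs spaced 0.5 Mb apart on a single chromosome.
positions = [i * 500_000 for i in range(10)]
genotypes = [0, 0, 2, 0, 1, 2, 1, 0, 0, 0]

segments = find_roh(positions, genotypes)
covered = positions[-1] - positions[0]
print("ROH segments (bp):", segments)
print(f"F_ROH = {f_roh(segments, covered):.3f}")
```

In a real analysis the same sum would run over all autosomes, and the denominator would be the total autosomal length covered by the chip.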
Although genotyping errors may have some influence when computing the ROH-derived coefficient $F_{ROH}$, errors are also very common in pedigree information, and there are studies strongly suggesting that $F_{ROH}$ is a better estimator of individual autozygosity than $F_{PED}$ (i.e., the F coefficient derived from pedigree information). In other words, ROH-based estimates are good estimators of inbreeding, and even better than $F_{PED}$ if the genomic data is good enough. Unsurprisingly, the use of genomic markers has gained a lot of attention in animal and plant breeding in recent years.
$F_{ROH}$ has other interesting applications in population genetics and breeding. For instance, determining the shared ancestry of a reference population is difficult when genealogy information is missing, complex or incomplete. Since the length of a ROH decreases with the number of recombination events since the common ancestor, we can date that ancestor in number of generations. Hence, with a rule of thumb of 1 cM/Mb for livestock species and assuming a constant population size in the past, you would expect ROH spanning 16 Mb, 10 Mb and 5 Mb to come from a common ancestor three, five and ten generations back, respectively. Other applications are the estimation of the effective population size from the change in $F_{ROH}$ per generation ($\Delta F$) and the detection of genomic regions that underwent artificial selection (ROH hotspots) or regions that contain critical genes (ROH cold spots). For a review of general applications of ROH, you can refer to https://doi.org/10.1016/j.livsci.2014.05.034 and, for specific applications in small endangered populations, see https://doi.org/10.3389/fgene.2015.00173 and the citations therein.
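The dating rule of thumb and the effective population size estimate can be sketched as follows. The text does not spell out the formulas, so this example assumes two standard population-genetics relationships: the expected length of an autozygous segment is 100 / (2g) cM when the common ancestor lived g generations ago, and Ne = 1 / (2ΔF), with ΔF = (F_t − F_{t−1}) / (1 − F_{t−1}). The function names and the F values used at the end are hypothetical.

```python
def generations_to_ancestor(roh_length_mb, cm_per_mb=1.0):
    """Approximate generations back to the common ancestor of a ROH.

    Assumes the expected ROH length is 100 / (2 * g) cM and a uniform
    recombination rate (1 cM/Mb is the rule of thumb used in the text).
    """
    roh_length_cm = roh_length_mb * cm_per_mb
    return 100.0 / (2.0 * roh_length_cm)


def delta_f(f_previous, f_current):
    """Rate of inbreeding between two consecutive generations."""
    return (f_current - f_previous) / (1.0 - f_previous)


def effective_population_size(rate_of_inbreeding):
    """Effective population size from the per-generation rate of inbreeding."""
    return 1.0 / (2.0 * rate_of_inbreeding)


# ROH lengths mentioned in the text.
for length_mb in (16, 10, 5):
    g = generations_to_ancestor(length_mb)
    print(f"{length_mb} Mb ROH: about {g:.1f} generations back")

# Hypothetical mean F_ROH in two consecutive generations (made-up values).
rate = delta_f(f_previous=0.050, f_current=0.0595)
print(f"Delta F = {rate:.4f}, Ne = {effective_population_size(rate):.0f}")
```

Under these assumptions the 10 Mb and 5 Mb runs map to five and ten generations, and the 16 Mb run to roughly three, which matches the figures quoted in the text.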
Since the advent of next-generation sequencing and derived genomic technologies, the impact of genomic sciences on animal and plant breeding (agrogenomics), medicine (personalized medicine, pharmacogenomics), biofertilizers (metagenomics), wildlife management (ecological genomics), etc., is indisputable. Here, $F_{ROH}$ is just one simple example of how genomics can provide a ‘possible solution to an old problem’ (https://doi.org/10.1016/j.livsci.2014.05.034). The proper integration of genomic data into your standard analytical toolkit will certainly help you reach your goals at a faster pace and leverage your profits.
Dr. Miguel Gallach is a geneticist with an M.Sc. in molecular and evolutionary genetics and Ph.D. in biology from the University of Valencia, Spain. Dr. Gallach specializes in the application of genomics and has over 15 years’ experience in academia (research, teaching, and mentoring). He is the former associate editor of BMC Evolutionary Biology and a former consultant for the IAEA/UN in Vienna, Austria. Currently (2022) he works as the CEO and Chief Scientific Officer of GC Genomics. https://gcgenomics.com/