Unreplicated trials: What can they really do? Part 1

Dr. Salvador A. Gezan

16 January 2023

Unreplicated (UR) trials are field experiments in which only a small group of genotypes (often controls) are replicated several times, with the rest of the genotypes replicated only once. For example, consider a trial in which we have 264 genotypes where four of them have five replications (identified as check genotypes), and the other 260 have only a single replication (identified as test genotypes). 

Unreplicated designs, sometimes also called Augmented designs, are often used in early-generation trials, where a limited quantity of seed per genotype is available for field testing. The UR trial allows for an early evaluation of these test genotypes in order to select a subset of them to be subsequently assessed in a replicated trial. In the following paragraphs, I will describe some aspects that are critical to the use of these trials.

Making informed decisions with check genotypes

Given the lack of replication of most genotypes in UR trials, the great majority (if not all) of the information originates from the few replicated check genotypes. These are the genotypes that contribute to the estimation of background noise. Furthermore, as it is the check genotypes that enable the estimation of all other design effects (like block or row and column effects), they are also the genotypes that separate the genetic signal from the residual signal. Hence, it is the check genotypes that are being described by our heritability calculation. So, if you see a high (or low) heritability, then it is mostly due to these checks. You can imagine what happens if you have a single check genotype in your trial: all information on genetic variation will come from this genotype! 

For this reason, it is critical that UR trials have a reasonable number of checks (I suggest at least 5). These should be selected to be representative of the population of genotypes being tested; for example, using parents or other relatives. In addition, they need to have a reasonable level of replication so that they can describe the field heterogeneity. It is not uncommon to see 10-20% of the experimental units (plots) assigned to check genotypes.
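To make the check allocation concrete, here is a minimal Python sketch that counts the plots in a UR layout and the share of them assigned to checks. The function name and its arguments are my own, chosen only for illustration; the numbers are those of the example trial described at the start (4 checks with 5 replications each, and 260 test genotypes).

```python
def check_plot_share(n_checks, reps_per_check, n_tests):
    """Return (total plots, fraction of plots assigned to checks) for a UR layout.

    Assumes every check has the same replication and every test genotype
    occupies exactly one plot.
    """
    check_plots = n_checks * reps_per_check
    total_plots = check_plots + n_tests
    return total_plots, check_plots / total_plots


total, share = check_plot_share(n_checks=4, reps_per_check=5, n_tests=260)
print(total, round(share, 3))  # 280 0.071
```

For that example the checks occupy 20 of 280 plots, about 7%, which is below the 10-20% range mentioned above.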

The power of spatial analysis in UR trials

One way the test genotypes can get some additional ‘replication’ is by ‘borrowing’ bits of information from their neighbours. This is why spatial analysis is critical for a UR trial. A simple approach is to fit random block (or row and column) effects, which model a correlation between units belonging to the same block (or the same row or column). Better still is to model this with a proper spatial analysis, which incorporates correlations between neighbouring plots across the rows and columns of the field trial. For example, a check plot that has a large positive residual is likely to be surrounded by positive residuals (if there is a strong positive spatial correlation). This allows for better separation of the genetic signal from the noise for the unreplicated test genotypes, and therefore increases (moderately) their precision. In turn, this also helps to improve the estimation of the genetic variation. But do not forget that spatial analyses are an add-on to proper design of experiments.
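As an illustration of what ‘correlations across the rows and columns’ can look like, the sketch below builds a separable first-order autoregressive (AR1 x AR1) residual correlation matrix with NumPy. This particular structure is my assumption of one commonly used choice; the text above does not prescribe a specific spatial model. The function names, the tiny 3 x 4 grid and the correlation values are all hypothetical.

```python
import numpy as np


def ar1_corr(n, rho):
    """AR1 correlation matrix: corr(i, j) = rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def separable_ar1(n_rows, n_cols, rho_row, rho_col):
    """Correlation between all plots of a rectangular grid.

    Plots are assumed ordered row by row (index = row * n_cols + col),
    so the Kronecker product takes the row matrix first.
    """
    return np.kron(ar1_corr(n_rows, rho_row), ar1_corr(n_cols, rho_col))


# Hypothetical 3-row by 4-column field with illustrative correlations.
R = separable_ar1(n_rows=3, n_cols=4, rho_row=0.5, rho_col=0.6)

# Plot (row 0, col 0) against its neighbours:
print(R[0, 1])  # same row, next column
print(R[0, 4])  # next row, same column
print(R[0, 5])  # diagonal neighbour
```

Under this structure the correlation decays with distance in each direction, which is how a plot ‘borrows’ information mostly from its closest neighbours.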

Role of pedigree and genomic relationships in a UR trial analysis

A pedigree-based or genomic-based relationship matrix incorporated into the model for UR trial data better connects the genotypes (if these are fitted as random effects). For example, for a given test genotype, any relative (full-sib, half-sib, cousin, etc.) will contribute to the estimation of its breeding value, and therefore effectively increase its replication. Consequently, any UR trial analysis is improved whenever these genetic relationships are incorporated, as they increase the precision of the estimated genetic effects.
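To show how relatives become ‘connected’, here is a minimal sketch of a pedigree-based additive relationship matrix built with the standard tabular (recursive) method. The function name and the toy pedigree are hypothetical: three unrelated founder parents (which could serve as checks) and three test genotypes derived from them.

```python
import numpy as np


def additive_relationship(pedigree):
    """Additive relationship matrix from a list of (id, parent1, parent2).

    Assumes parents appear before their offspring; unknown parents are None.
    """
    pos = {rec[0]: i for i, rec in enumerate(pedigree)}
    n = len(pedigree)
    A = np.zeros((n, n))
    for i, (_, p1, p2) in enumerate(pedigree):
        i1, i2 = pos.get(p1), pos.get(p2)
        # Diagonal: 1 + inbreeding, i.e. half the relationship between parents.
        both_known = i1 is not None and i2 is not None
        A[i, i] = 1.0 + (0.5 * A[i1, i2] if both_known else 0.0)
        # Off-diagonal: average relationship of individual j with i's parents.
        for j in range(i):
            a = 0.0
            if i1 is not None:
                a += 0.5 * A[j, i1]
            if i2 is not None:
                a += 0.5 * A[j, i2]
            A[i, j] = A[j, i] = a
    return A


# Hypothetical pedigree: P1-P3 are founders, T1-T3 are test genotypes.
ped = [
    ("P1", None, None),
    ("P2", None, None),
    ("P3", None, None),
    ("T1", "P1", "P2"),
    ("T2", "P1", "P2"),  # full-sib of T1
    ("T3", "P1", "P3"),  # half-sib of T1
]
A = additive_relationship(ped)
print(A[3, 4], A[3, 5], A[3, 0])  # 0.5 0.25 0.5
```

Every non-zero off-diagonal entry is a link through which one genotype's plot carries information about another, including from the replicated parents to their unreplicated offspring.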

Leveraging multi-environmental trials for robust analysis

Many of the aspects mentioned above apply to the statistical analysis of a single trial. However, breeding programs often establish a multi-environmental trial (MET) with the same genotypes over several sites or environments. Statistically, what is interesting about these METs is that each environment is unreplicated for the test genotypes; hence, these environments (say, S of them) are effectively blocks! In this case, as expected, each ‘block’ has a copy of each genotype. In addition, as we have some check genotypes that have many replications, these will assist in assessing the background noise at each site.

Following on from this thinking, we should analyse our MET as a randomized complete block design. However, in practice more complex models with genotype-by-environment interactions (GxE) are used. Statistically, this is not ideal. The reason is that the estimation of the GxE effects originates mostly from the check lines and, to a lesser extent, from the ‘borrowed’ information from the underlying spatial correlation. Therefore, any inference from the GxE effects, although not bad, is likely very limited and lacking the required statistical power.

Nevertheless, as each genotype will sample each environment, the main effect of genotype is estimated with reasonable precision (specifically, from S replications). This allows us to rank and select the genotypes across all the environments with reasonable confidence. But note, this is for the genetic main effect; with a single plot per test genotype in each environment, the GxE interaction is just noise.
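The gain from treating the S environments as replications can be sketched with the usual heritability of a genotype mean, H² = σ²g / (σ²g + σ²e / S). This formula is a standard textbook expression that I am bringing in for illustration, not something derived above; the function name and the variance values are hypothetical. Here the residual variance lumps together GxE and plot error, since the two cannot be separated for genotypes with one plot per environment.

```python
def mean_basis_heritability(var_g, var_resid, n_env):
    """Heritability of a genotype mean over n_env environments.

    var_resid is assumed to contain both GxE and plot-level error,
    with one plot per genotype per environment.
    """
    return var_g / (var_g + var_resid / n_env)


# Hypothetical variance components: genetic = 1, residual = 4.
for s in (1, 2, 4, 8):
    print(s, round(mean_basis_heritability(1.0, 4.0, s), 3))
```

With these illustrative values, the heritability of the genotype mean rises from 0.2 with a single environment to 0.5 with four, which is why ranking on the main effect across environments is much more reliable than ranking within any one of them.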

Maximizing potential while understanding limitations

UR trials are widely used and present important benefits for testing material early. If possible, UR trials should always be analysed by incorporating spatial components and genetic relationships, as doing so allows more information to be captured about the test genotypes despite their limited replication. However, UR trials are not the experimental design of choice for all of our testing needs. Their limitations have to be considered carefully. Ideally, a UR trial should be followed by a properly replicated trial on the few selected genotypes from the UR data before making our final decisions.

About the author

Dr. Salvador Gezan is a statistician/quantitative geneticist with more than 20 years’ experience in breeding, statistical analysis and genetic improvement consulting. He currently works as a Statistical Consultant at VSN International, UK. Dr. Gezan started his career at Rothamsted Research as a biometrician, where he worked with Genstat and ASReml statistical software. Over the last 15 years he has taught ASReml workshops for companies and university researchers around the world. 

Dr. Gezan has worked in agronomy, aquaculture, forestry, entomology, medicine and biological modelling, and with many commercial breeding programs, applying traditional and molecular statistical tools. His research has led to more than 100 peer-reviewed publications, and he is one of the co-authors of the textbook Statistical Methods in Biology: Design and Analysis of Experiments and Regression.