With the release of ASReml 4.2, has the best just got better?

Amanda Avelar de Oliveira

21 July 2021

Linear mixed models are broadly used to analyse different types of data produced by a wide variety of sectors, e.g., animal, plant and aqua breeding, agriculture, environmental and medical sciences. These models provide a rich and flexible tool to model the complex variance-covariance structure in the data, allowing for different sources of variation.

ASReml is a comprehensive and powerful statistical software package specially designed for linear mixed model analysis that is used across all fields of scientific research around the world. It has a strong theoretical and statistical background, providing reliable estimation and inference. ASReml fits linear mixed models using Residual Maximum Likelihood (REML). Its REML routine produces parameter estimates that are consistent and efficient. ASReml uses the Average Information (AI) algorithm and sparse matrix methods, which enable it to rapidly solve large numbers of mixed model equations.
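For readers less familiar with the term, the "mixed model equations" mentioned above can be written down in standard textbook notation. This is a general sketch and not a description of ASReml's internals; the symbols (y for the data, X and Z for the design matrices of the fixed effects β and random effects u, and G and R for the variance matrices of the random effects and residuals) are the conventional ones, assumed here for illustration.

```latex
% Linear mixed model (standard notation, assumed for illustration)
y = X\beta + Zu + e, \qquad u \sim N(0, G), \qquad e \sim N(0, R)

% Henderson's mixed model equations for the estimates of beta and u
\begin{bmatrix}
  X^{\top} R^{-1} X & X^{\top} R^{-1} Z \\
  Z^{\top} R^{-1} X & Z^{\top} R^{-1} Z + G^{-1}
\end{bmatrix}
\begin{bmatrix} \hat{\beta} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} X^{\top} R^{-1} y \\ Z^{\top} R^{-1} y \end{bmatrix}
```

The coefficient matrix on the left is often large but mostly zeros, which is why sparse matrix methods pay off when solving these equations.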

The scientists’ choice for analysis of linear mixed models

ASReml has become the default software for analysis of linear mixed models by scientists like me. It has been cited thousands of times in scientific publications and is widely used by many industries for their commercial operations. Data analysts choose ASReml because it is faster and computationally more efficient than other software packages for solving mixed model equations. Its flexible syntax, and the wide range of variance model structures offered, enable ASReml users to analyse complex data, such as multi-environment and multi-trait data, and to accommodate pedigree and molecular information easily. It also allows for the analysis of large and messy data sets and makes fitting complex linear mixed models possible and easy. ASReml provides theoretically advanced approaches for the mixed model analysis of Normal and discrete response variables, as well as analysis of correlated data, repeated measures, multivariate analyses, and complex experimental designs with balanced or unbalanced datasets. Additionally, ASReml’s user-friendly interface makes it easy to run analyses and obtain results quickly.

More data requires more processing capability

The amount of data generated by scientists nowadays is much larger than it was decades ago, and statistical software needs to evolve to keep up with this increase. New technologies have substantially changed the way data is generated. Data is the fuel that drives many important decisions in our society. Our lives can be transformed by having the right tools and software to analyse data and extract its full potential. The new ASReml 4.2 release brings much more power to the analysis of large datasets, along with other improved features to help its users. The most significant changes are an increase in the available workspace memory, from 32 to 96 gigabytes, and a reorganization of some core routines that enables ASReml 4.2 to run substantially faster. The memory increase allows users to analyse larger problems in less time. On multi-user systems, memory efficiency is maximised by allowing each user to specify the amount of memory needed for their current session (the maximum allocated depends on the machine availability). Regarding the reorganization of core routines, there are 10 areas where ASReml 4.2 is faster than the previous 4.1 version released in 2015. For instance, in jobs with relatively dense G matrices, computation times are often reduced by more than 40%.

Multi-thread parallel processing for faster results

ASReml 4.2 has been optimized in several areas for multi-threaded processing, making it possible to use the maximum processing power available. Multi-thread processing achieves higher performance by partitioning the data and processing it in multiple threads simultaneously. Users can specify the maximum number of threads to be used; by default, ASReml uses all threads available, up to 16. This parallel processing delivers significant gains in speed, and on occasion analyses can run up to 70x faster. These gains depend upon a variety of factors, including the microprocessor, machine power, data set size and type of analysis run.

New analytical procedures

Beyond the optimization in terms of memory and speed, some new procedures have been implemented in ASReml 4.2. Pedigree qualifiers have been extended: pedigree trimming has been implemented, along with the option of absorbing parents without data. This pedigree pre-processing removes unnecessary individuals from a pedigree while maintaining proper relationships among the core members, which speeds up likelihood evaluation. Another new procedure is the extension of the bivariate analysis of generalized linear models. Previously, in version 4.1, ASReml allowed a bivariate analysis of a binomially distributed variate and a normally distributed variate with an identity link. ASReml 4.2 has extended this bivariate analysis, and both variates can now follow a Normal, Binomial, Poisson, Gamma or Negative Binomial distribution.
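To give a feel for what pedigree trimming means, here is a minimal Python sketch of one simple trimming rule: keep every individual that has data, plus all of its known ancestors, and drop everyone else. This is an illustration under stated assumptions and not ASReml's actual procedure or syntax; the function name `trim_pedigree`, the dictionary layout and the toy pedigree are all hypothetical.

```python
def trim_pedigree(pedigree, with_data):
    """Keep individuals that have data, plus all of their known ancestors.

    pedigree  : dict mapping individual -> (sire, dam); None marks an unknown parent
    with_data : iterable of individuals that have records

    Hypothetical helper for illustration only, not ASReml's implementation.
    """
    keep = set()
    # Start from the individuals with data and walk up the pedigree.
    stack = [ind for ind in with_data if ind in pedigree]
    while stack:
        ind = stack.pop()
        if ind in keep:
            continue  # already visited through another descendant
        keep.add(ind)
        for parent in pedigree[ind]:
            if parent is not None and parent in pedigree:
                stack.append(parent)
    # Preserve the original ordering of the retained individuals.
    return {ind: parents for ind, parents in pedigree.items() if ind in keep}


# Toy pedigree (made up): only E has data.
pedigree = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),   # no data and no descendants with data
    "E": ("C", None),
    "F": ("D", None),  # no data
}

trimmed = trim_pedigree(pedigree, with_data={"E"})
print(sorted(trimmed))  # ['A', 'B', 'C', 'E']
```

In this toy case D and F contribute nothing to the analysis of E, so dropping them shrinks the relationship matrix without changing the relationships among A, B, C and E.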

Yes – the best is now better

So, in answer to my question, yes, I believe the best did just get better! You can decide for yourself. For further details of the new features, improvements and updates in this release, users can check the VSNi ASReml Knowledge Base.

About the author

Amanda Avelar de Oliveira is an Agronomist with an M.Sc. and a Ph.D. in Genetics and Plant Breeding from the University of São Paulo (ESALQ/USP). She has experience in quantitative genetics, genomic prediction, field trial analysis and genotyping pipelines. Currently, she works as a consultant at VSN International, UK.

“I believe in the power of knowledge sharing and multidisciplinary efforts to increase genetic gains in plant breeding while ensuring sustainability in agriculture.”