Dr. Giovanni Galli, 12 January 2024
Automating analytical workflows is a common practice that lets the computer do the hard work. When we automate, the system must be robust enough to anticipate and handle errors that would otherwise break the pipeline, causing delays or, worse, incorrect results that go undetected. This is particularly important when dealing with large or numerous datasets, complex and unbalanced statistical procedures, or many repeated tasks (e.g., many statistical models to be tested).
Let’s dig into this topic in R with a simple example showing how to handle errors and report why things fail. For this example we are going to use ASReml-R, but the principle is transferable to other code, packages, and even programming languages.
We will start by loading the ASReml-R library:
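As a minimal sketch, the load step can itself be guarded so that an automated script reports a missing package clearly instead of stopping with a less obvious error later. The flag name asreml_available is an assumption introduced here for illustration.

```r
# Check that ASReml-R is installed before attaching it.
# `asreml_available` is an illustrative flag, not part of any package.
if (requireNamespace("asreml", quietly = TRUE)) {
  library(asreml)
  asreml_available <- TRUE
} else {
  message("Package 'asreml' is not installed; ASReml-R calls will fail.")
  asreml_available <- FALSE
}
```

Later steps of a pipeline could check this flag and skip the model fitting when the package is unavailable.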
Let’s store some ASReml-R calls as character strings that we will evaluate later.
The first call is:
And the second call is:
Next, we will fit a model that is somewhat complex for the data at hand (a spatial analysis with a nugget effect):
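The three calls described above might look roughly like the following. This is a hypothetical reconstruction, not the original code: the model terms are assumptions, the first two calls assume the oats example data (which has a Wplots column), and trial_data, Row and Column in the third call are placeholder names.

```r
# Hypothetical calls stored as text. Model terms and data names are
# assumptions made for illustration only.
calls <- list(
  # call1: refers to a factor named Plots, which the data does not contain
  call1 = "asreml(fixed = yield ~ Variety, random = ~ Blocks/Plots, data = oats)",
  # call2: the same model using the factor name Wplots
  call2 = "asreml(fixed = yield ~ Variety, random = ~ Blocks/Wplots, data = oats)",
  # call3: spatial model (AR1 x AR1 residual) plus a nugget term (units);
  # trial_data, Row and Column are placeholder names
  call3 = "asreml(fixed = yield ~ Variety, random = ~ units, residual = ~ ar1(Row):ar1(Column), data = trial_data)"
)

# The strings are only text at this point; nothing is fitted until they are
# parsed and evaluated, e.g. with eval(parse(text = calls$call1)).
```

Keeping the calls as text makes it easy to loop over them and to report exactly which call failed.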
Now let’s create our own function (called error_handling_asreml) that will: execute an ASReml call, check whether the model has converged (updating it if not, and stopping after a set number of tries), and finally return the fitted model or an error message. Note that error handling in R does not require a wrapper function, but it is a good way to keep things organized. The purpose of each part of the function is explained in the comments.
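A minimal sketch of such a function is shown below. It is not the original implementation: the argument names (call_text, max_updates, envir) and the returned list structure are assumptions, and it relies on the fitted object having a logical converge element, as ASReml-R fits do. Objects without that element are treated as converged, which lets the usage lines run with lm() as a stand-in when ASReml-R is not available.

```r
# Run a model call stored as text; return the fit or the captured error.
# Argument names and the returned list layout are illustrative assumptions.
error_handling_asreml <- function(call_text, max_updates = 5L, envir = globalenv()) {
  tryCatch({
    # Parse and evaluate the call; any failure jumps to the error handler
    model <- eval(parse(text = call_text), envir = envir)

    # Refit until the model reports convergence or the limit is reached
    n_updates <- 0L
    while (isFALSE(model$converge) && n_updates < max_updates) {
      model <- update(model)
      n_updates <- n_updates + 1L
    }

    # Still not converged: raise an error so it is recorded like any other
    if (isFALSE(model$converge)) {
      stop("model did not converge after ", max_updates, " updates")
    }
    list(model = model, error = NULL, n_updates = n_updates)
  },
  error = function(e) {
    # Keep the message instead of letting the error stop the pipeline
    list(model = NULL, error = conditionMessage(e), n_updates = NA_integer_)
  })
}

# Stand-in usage with lm() and the built-in mtcars data
ok  <- error_handling_asreml("lm(mpg ~ wt, data = mtcars)")
bad <- error_handling_asreml("lm(mpg ~ weight, data = mtcars)")  # no such column
```

Returning a list with the same elements on success and on failure keeps the downstream code simple, since it only needs to check whether the error element is NULL.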
Let’s execute the error_handling_asreml function in batch mode with lapply and see what happens.
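The batch pattern can be sketched in a self-contained way as follows. To keep it runnable without ASReml-R, this sketch uses lm() calls on the built-in mtcars data as stand-ins and a reduced handler called run_call; both are assumptions made for illustration.

```r
# Stand-in calls; the first one refers to a column that does not exist
calls <- list(
  call1 = "lm(mpg ~ weight, data = mtcars)",
  call2 = "lm(mpg ~ wt, data = mtcars)"
)

# Reduced handler (illustrative): return the fit, or the error message as text
run_call <- function(call_text) {
  tryCatch(eval(parse(text = call_text)),
           error = function(e) conditionMessage(e))
}

# Every call is attempted even if an earlier one fails
results <- lapply(calls, run_call)

# TRUE where a call failed, i.e. the result is a message rather than a model
failed <- vapply(results, is.character, logical(1))
failed_messages <- results[failed]
```

Because lapply keeps the names of the list, each stored message stays attached to the call that produced it.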
We can see that call2 succeeded, and call3 succeeded but needed additional updates to converge. Since we collected the error message, we can look at what went wrong with call1. The model did not run because the factor Plots does not exist; the actual name is Wplots. Having the error message stored helps us fix the issue, and the message is not lost if other messages are printed in the console.
We hope this gives you an idea of, and a basic template for, how to handle errors in R. There are many other ways of doing this, and you should adapt the code to your needs. Download the full R file with the code here.
About the author
Dr. Giovanni Galli is an Agronomist with an M.Sc. and Ph.D. in Genetics and Plant Breeding from the University of São Paulo (USP)/Luiz de Queiroz College of Agriculture (ESALQ). He currently works as a Statistical Consultant at VSN International, United Kingdom. Dr. Galli has experience in field trials, quantitative genetics, conventional and molecular breeding (genomic prediction and GWAS), machine learning, and high-throughput phenotyping.