Introduction: Why code testing?

Testing is an important part of the coding process. If you have ever written a function in R, then you already have some experience in testing code; checking to see whether or not the function ‘works’ is a type of informal test and part of the normal process of coding. But just because a function worked initially, does not necessarily mean that it will continue to work when something changes. An error might arise if you try (perhaps unknowingly) to use your function in a different way than originally intended (e.g., with a slightly different input type). And if you have written a lot of code for your model or data analysis, you will likely have several functions working together, making it difficult to pinpoint the cause of the error. By writing unit tests – which independently check small parts of your code to see that they work as intended – you can often quickly identify (or rule out) where the problem lies.

Writing tests as you code can also make coding more efficient and more fun. They encourage you code in manageable chunks and verify that each function you write is working as intended. This helps both with the planning and organisation of large data analyses and modelling projects. In R, it is also very satisfying and reassuring watching the code that you have written pass a long list of well-constructed unit tests. Here we will demonstrate how to use the testthat R package in R, which is an easy way to get started with code testing. After reading through these notes, you should be able to write unit tests for your own code.

The testthat R package

What is the testthat package?

The testthat R package is a helpful way to get started writing tests for the code that you write. Testing code is often boring and tedious, and it is usually very tempting to bypass this stage of the coding process entirely. Once you have code that works (often obtained only after a lot of hard work), you want to use it, not think about all of the ways that it could eventually fail. The testthat R package tries to make testing as easy and interesting as possible, so that you will actually want to do it after you have finished coding something. It does this by minimising the amount of effort that goes into writing code tests and maximising how cool it is to watch all of your tests being passed. For example, here is what testing looks like for a reasonably sized R package.

All of the 143 tests click through one by one to make sure that all of the functions in the package are working as intended. How does testthat determine that the functions are working as intended? Each individual test runs a function with a known input and output. The input is read into the argument’s functions, and the output is examined by the test against some criteria, which could be some particular value, object type, or object size. Seeing this in practice below will make it clearer, but the general idea is that after writing the code, you know what the output should be for a particular input. As you continue to code more and regularly test, if that output changes or returns an error, then the automated process of testing with testthat will catch the problem and tell you what needs to be fixed.

How do I get started with testthat?

The first thing that you need to do is install the testthat R package. This can be done from CRAN using the code below.

install.packages("testthat")

Alternatively, you can download testthat from GitHub if you have the devtools R package installed.

devtools::install_github("r-lib/testthat");

Once testthat is installed, you can load it into Rstudio just like any other R package.

library(testthat);

You are now ready to start writing R scripts to test the functions that you have written. First, however, we need to add some functions. I will use two simple temperature conversion functions written in a previous coding club session on writing R packages.

Two simple functions that will be tested

When you code, it is often a good idea to try to write short functions. There is no hard rule for determining when a function is too long, but whenever possible, it is usually best to avoid writing functions with too many lines to view all at once on your monitor. Small functions that do one job, which can be called by other functions as necessary, make coding easier later on because this allows you to compartmentalise specific tasks and better see how everything fits together at once (here is a good example of this in R code; here is a bad example; and here is a very bad example; all of these work, but the first is much nicer to read and test). Small functions are also easier to test because it is clearer what output you are expecting for a particular function input. Many small functions (rather than a few large ones) also allow you to run more tests and better pinpoint the problem when something goes wrong.

Take the a simple R script with two custom R functions. The first function converts a temperature from Fahrenheit to Celsius.

F_to_C <- function(F_temp){
    C_temp <- (F_temp - 32) * 5/9;
    return(C_temp);
}

The second function converts from Celsius to Fahrenheit.

C_to_F <- function(C_temp){
    F_temp <- (C_temp * 9/5) + 32;
    return(F_temp);
}

It might not seem like there is much reason for testing these functions. They are both tiny, really only a single line of code doing the work. The maths can be checked by eye, and the function can likewise be checked easily by running a known temperature through each function and looking at the output (which you would naturally do after first writing the functions to double-check that they work). Nevertheless, if these functions were embedded within a much larger R script with other functions to process a large data set and run one or more analyses, an automated test to ensure that each is working as intended is probably worth writing. A test could of course check to make sure that the converted value returned is correct for some arbitrary temperature input, but you also might want to make sure that the output is being returned in a particular format; e.g., as a numeric – or specifically as a value of type ‘double’ rather than an integer. For more complex functions, you might want to ensure that the dimensions of the output are correct (e.g., a matrix with a specific number of rows and columns, or a list with 3 elements of different types).

The test_that function

We can use the test_that function of the testthat package to write sime simple tests for the F_to_C and C_to_F functions written above. The test_that function follows a simple structure with two arguments: desc provides a description (in quotes) of whatever is being tested, and code provides whatever code (within curly brackets) to test.

Within the bracketed code to be tested, one or more expect_that functions are run; each call to expect_that is an independent test. The expect_that function takes two arguments: object is the object that is being examined by the function and condition is the condition that the object is meant to satisfy.

Below is a test of the F_to_C function. Note the description in quotes and the bracketed code. Within the test, the F_to_C function is run for the temperature of 50 degrees Fahrenheit and stored as temp_C. Since we know that the correct conversion from 50 degrees Fahrenheit to Celsius is 10, we can run an expect_that to check that temp_C equals 10.

test_that(desc = "Fahrenheit to Celsius", code = {
  
  temp_C <- F_to_C(50); # Runs the function
  
  # Test that the result is the correct value
  expect_that( object = temp_C, condition = equals(10) );
  
  # Test that the result is numeric
  expect_that( object = is.numeric(temp_C), condition = equals(TRUE) );
})

We can do the same below to test the the C_to_F function in the opposite direction. We now that 10 degrees Celsius is equal to 50 degrees Fahrenheit, so we can set the test accordingly.

test_that(desc = "Celsius to Fahrenheit", code = {
  
  temp_F <- C_to_F(10);
  
  # Test that the result is the correct value
  expect_that( object = temp_F, condition = equals(50) );
  
  # Test that the result is numeric
  expect_that( object = is.numeric(temp_F), condition = equals(TRUE) );
})

Setting up the tests above was the boring part. The fun part is running them all at once. You could also run them one by one, but this is not nearly as satisfying. If you were run the above tests for F_to_C or C_to_F in the R console, nothing would print off on your screen; this means that the test is passed. The test will only print something off if it fails, such as with the test below, which deliberately tests the incorrect conversion of Fahrenheit to Celsius.

test_that(desc = "Fahrenheit to Celsius wrong", code = {
  temp_F <- F_to_C(50);
  expect_that( object = temp_F, condition = equals(2) );
})

To run a full list of tests and return a report, we need to save the above tests in an R script.

An R script for running testthat functions

Test functions can be saved like any other R script file (with a .R extension), but with one caveat. A test R script should start with the prefix ‘test-’. A good way of doing this is to add the prefix to the name of file that stores the functions to be tested. In other words, if you are testing the functions in the file temp_conversion.R, then you can save the tests in the file test-temp_conversion.R. Here is what the test-temp_conversion.R file looks like.

library(testthat);
context("Temperature function testing");
source("temp_conversion.R");

test_that("Fahrenheit to Celsius", {
  
  temp_C <- F_to_C(50);
  
  # Test that the result is numeric
  expect_that( is.numeric(temp_C), equals(TRUE) );
  
  # Test that the result is the correct value
  expect_that( temp_C, equals(10) );
})

test_that("Celsius to Fahrenheit", {
  
  temp_F <- C_to_F(10);
  
  # Test that the result is numeric
  expect_that( is.numeric(temp_F), equals(TRUE) );
  
  # Test that the result is the correct value
  expect_that( temp_F, equals(50) );
})

# This test will fail
test_that(desc = "Fahrenheit to Celsius wrong", code = {
    temp_F <- F_to_C(50);
    expect_that( object = temp_F, condition = equals(2) );
})

Note the first few lines of the file. The first line reads in the testthat R package. The second line provides some sort of title for the type of tests being run (in this case test of temperature functions). The context is important to specify due to an unusual bug that still appears unresolved in testthat; hence the context is mandatory. The source function reads in the functions to be tested; make sure that the correct directory is specified here and in R studio. For these example files, the working directory of Rstudio should be set to wherever the ‘code_testing’ folder is on your computer. Since both test-temp_conversion.R and temp_conversion.R are located immediately inside this directory (and not within other folders within it), running source("temp_conversion.R") should run all of the code in the temp_conversion.R file (if, e.g., the file was located in a folder called ‘R’, ). In other words, after source("temp_conversion.R"), the F_to_C and C_to_F functions should be read into R and ready to use by testthat.

After the testthat library is read in, the context established, and the functions to be tested sourced from the correct file, the tests themselves are added to the script. Any number of tests is permitted in a test script, and any number of scripts can be saved to the same directory. The automated test procedure will simply look for anything with the prefix ‘test-’ and extension ‘.R’ in a directory and run all of the tests.

Running all of the tests

To run all of the tests, first make sure that R is in the correct directory (i.e., the same folder location as the test scripts). You can always check the working directory with the command below.

getwd();

If the directory needs to be changed, you can specify with setwd("path/to/directory"), or use the Rstudio tool bar, going to Session > Set Working Directory. Once you are in the correct location, you can run all of the tests in the directory with one command.

test_dir(".");

Recall that in the test-temp_conversion.R file, there were five total tests in three test_that functions. The first four tests within the first two functions should pass, but the last test “Fahrenheit to Celsius wrong” was deliberately set to fail. Here is what the output of test_dir looks like.

As expected, the first four tests passed, while the last test failed. The failed test is tallied in the Results, and an explanation of the fail is flagged between two horizontal rules.

Putting it all together

With tests in place, the test_dir function can be run at any time during the coding process. As you continue to write new R scripts and use and re-use R functions, testing frequently is an excellent way to spot bugs early and pinpoint what needs to be fixed. Doing this will make it much less likely that changing your code will unintentionally cause an error, forcing you to backtrack to figure out what went wrong. Hadley Wickham also points out that having multiple unit tests will also improve your coding by forcing you to write smaller, more manageable functions. He also suggests, as a strategy, starting to code by writing tests that you know will fail, and working until they pass. For example, when you find a bug with your code, write a test that you know will fail until the bug is fixed. For more information, see Hadley Wickham’s chapter on testing.

Additional resources

Example of unit testing R code with testthat (John D. Cook)
Testing R packages (Hadley Wickam)

Introduction to testing R code

Brad Duthie

22 MAR 2023

Contents

Introduction: Why code testing?

The testthat R package

Additional resources