Enumerate possible options

What’s the pattern?

If the possible values of an argument are a small set of strings, set the default argument to the set of possible values, and then use match.arg() or rlang::arg_match() in the function body. This convention advertises the possible values to the knowledgeable user¹ and makes it easy to generate an informative error message for inappropriate inputs. This interface is often coupled with an implementation that uses switch().

Because the advertisement happens in the function specification, you see the possible values in tooltips and autocomplete without having to look at the documentation.
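
As a minimal sketch of the pattern, consider a hypothetical helper, center() (not taken from any package), that summarises a numeric vector using one of a small set of named methods. The default advertises the choices, match.arg() validates the input, and switch() dispatches on the result:

```r
# Hypothetical helper, for illustration only
center <- function(x, method = c("mean", "median", "midrange")) {
  # Picks "mean" if `method` is not supplied; errors on unknown values
  method <- match.arg(method)

  switch(method,
    mean = mean(x),
    median = stats::median(x),
    midrange = mean(range(x))
  )
}

x <- c(1, 2, 3, 10)
center(x)
#> [1] 4
center(x, method = "median")
#> [1] 2.5
center(x, method = "midrange")
#> [1] 5.5
```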

What are some examples?

In difftime(), units can be any one of “auto”, “secs”, “mins”, “hours”, “days”, or “weeks”.
In format(), justify can be “left”, “right”, “centre”, or “none”.
In trimws(), you can choose which side to remove whitespace from: “both”, “left”, or “right”.
In rank(), you can select the ties.method from one of six strategies for handling ties: “average”, “first”, “last”, “random”, “max”, or “min”.
quantile() exposes nine different approaches to computing a quantile through the type argument, although here the options are the integers 1 to 9 rather than strings.
p.adjust() exposes eight strategies for adjusting p-values to account for multiple comparisons through the method argument, whose default is the exported vector p.adjust.methods.
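
Because the options live in the function specification, you can also inspect them programmatically. The sketch below uses base R's formals() to pull out the advertised values for some of the functions above. It assumes the defaults are written as literal c(...) calls or stored in an exported vector, which holds for these functions in current versions of R:

```r
# A default is stored as an unevaluated call; eval() turns it into a vector
eval(formals(trimws)$which)
#> [1] "both"  "left"  "right"
eval(formals(rank)$ties.method)
#> [1] "average" "first"   "last"    "random"  "max"     "min"

# p.adjust() keeps its options in an exported character vector
length(stats::p.adjust.methods)
#> [1] 8
```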

How do I use this pattern?

To use this technique, set the default value to a character vector, where the first value is the default. Inside the function, use match.arg() or rlang::arg_match() to check that the value comes from the known good set, and pick the default if none is supplied.

Take rank(), for example. The heart of its implementation looks like this:

rank <- function(
    x,
    ties.method = c("average", "first", "last", "random", "max", "min")
) {
  
  ties.method <- match.arg(ties.method)
  
  switch(ties.method, 
    average = , 
    min = , 
    max = .Internal(rank(x, length(x), ties.method)), 
    first = sort.list(sort.list(x)),
    last = sort.list(rev.default(sort.list(x, decreasing = TRUE))), 
    random = sort.list(order(x, stats::runif(length(x))))
  )
}

x <- c(1, 2, 2, 3, 3, 3)

rank(x)
#> [1] 1.0 2.5 2.5 5.0 5.0 5.0
rank(x, ties.method = "first")
#> [1] 1 2 3 4 5 6
rank(x, ties.method = "min")
#> [1] 1 2 2 4 4 4

Note that match.arg() will automatically throw an error if the value is not in the set:

rank(x, ties.method = "middle")
#> Error in match.arg(ties.method): 'arg' should be one of "average", "first", "last", "random", "max", "min"

It also supports partial matching so that the following code is shorthand for ties.method = "random":

rank(x, ties.method = "r")
#> [1] 1 2 3 4 6 5
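
To see partial matching in isolation, you can call match.arg() directly with an explicit set of choices. This small sketch reuses the ties.method values from above. Note that an ambiguous prefix such as "m", which could mean either "max" or "min", is rejected rather than guessed:

```r
choices <- c("average", "first", "last", "random", "max", "min")

# A unique prefix is expanded to the full value
match.arg("r", choices)
#> [1] "random"
match.arg("mi", choices)
#> [1] "min"

# An ambiguous prefix is an error; tryCatch() captures it for inspection
res <- tryCatch(match.arg("m", choices), error = function(e) "ambiguous")
res
#> [1] "ambiguous"
```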

We prefer to avoid partial matching because, while it saves a little time when writing code, it makes the code harder to read. rlang::arg_match() is an alternative to match.arg() that doesn’t support partial matching. It instead provides a helpful error message:

rank2 <- function(
    x,
    ties.method = c("average", "first", "last", "random", "max", "min")
) {
  ties.method <- rlang::arg_match(ties.method)
  rank(x, ties.method = ties.method)
}

rank2(x, ties.method = "r")
#> Error in `rank2()`:
#> ! `ties.method` must be one of "average", "first", "last", "random",
#>   "max", or "min", not "r".
#> ℹ Did you mean "random"?

# It also provides a suggestion if you misspell the argument
rank2(x, ties.method = "avarage")
#> Error in `rank2()`:
#> ! `ties.method` must be one of "average", "first", "last", "random",
#>   "max", or "min", not "avarage".
#> ℹ Did you mean "average"?
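
If you can't take a dependency on rlang, one way to get exact matching in base R is to check the value yourself. The helper below, match_exact(), is hypothetical and only a rough sketch: it assumes a single string is expected and makes no attempt at the "Did you mean" suggestion that arg_match() provides. The names ties_choices and rank_exact() are likewise made up for this example.

```r
# Hypothetical helper: exact (not partial) matching against a set of choices
match_exact <- function(arg, choices, arg_name = "arg") {
  # An untouched default is the full vector of choices: pick the first
  if (identical(arg, choices)) {
    return(choices[[1]])
  }
  if (!is.character(arg) || length(arg) != 1 || !arg %in% choices) {
    stop(
      "`", arg_name, "` must be one of ",
      paste0('"', choices, '"', collapse = ", "),
      call. = FALSE
    )
  }
  arg
}

ties_choices <- c("average", "first", "last", "random", "max", "min")

rank_exact <- function(x, ties.method = ties_choices) {
  ties.method <- match_exact(ties.method, ties_choices, "ties.method")
  rank(x, ties.method = ties.method)
}

x <- c(1, 2, 2, 3, 3, 3)
rank_exact(x, ties.method = "min")
#> [1] 1 2 2 4 4 4

# "r" is no longer accepted as shorthand for "random"
is_error <- inherits(
  tryCatch(rank_exact(x, ties.method = "r"), error = function(e) e),
  "error"
)
is_error
#> [1] TRUE
```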

Escape hatch

It’s sometimes useful to build in an escape hatch from canned strategies. This allows users to access alternative strategies, and allows for experimentation that can later turn into official strategies. One example of such an escape hatch is in name repair, which occurs in many places throughout the tidyverse. One place you might encounter it is in tibble():

tibble::tibble(a = 1, a = 2)
#> Error in `tibble::tibble()`:
#> ! Column name `a` must not be duplicated.
#> Use `.name_repair` to specify repair.
#> Caused by error in `repaired_names()`:
#> ! Names must be unique.
#> ✖ These names are duplicated:
#>   * "a" at locations 1 and 2.

Beneath the surface, all tidyverse functions that expose some sort of name repair eventually end up calling vctrs::vec_as_names():

vctrs::vec_as_names(c("a", "a"), repair = "check_unique")
#> Error:
#> ! Names must be unique.
#> ✖ These names are duplicated:
#>   * "a" at locations 1 and 2.
vctrs::vec_as_names(c("a", "a"), repair = "unique")
#> New names:
#> • `a` -> `a...1`
#> • `a` -> `a...2`
#> [1] "a...1" "a...2"
vctrs::vec_as_names(c("a", "a"), repair = "unique_quiet")
#> [1] "a...1" "a...2"

vec_as_names() exposes six strategies, but it also allows you to supply a function:

vctrs::vec_as_names(c("a", "a"), repair = toupper)
#> [1] "A" "A"
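
One way to build a similar escape hatch into your own function is to check for a function before matching against the canned strings. The following is a minimal base R sketch with a hypothetical helper, recase(); it is not how vctrs implements name repair, just an illustration of the interface:

```r
# Hypothetical helper: `style` is one of the canned strings or a function
recase <- function(x, style = c("lower", "upper", "none")) {
  # Escape hatch: a user-supplied function bypasses the canned strategies
  if (is.function(style)) {
    return(style(x))
  }

  style <- match.arg(style)
  switch(style,
    lower = tolower(x),
    upper = toupper(x),
    none = x
  )
}

recase(c("Alpha", "Beta"))
#> [1] "alpha" "beta"
recase(c("Alpha", "Beta"), style = "upper")
#> [1] "ALPHA" "BETA"
recase(c("Alpha", "Beta"), style = function(x) substr(x, 1, 1))
#> [1] "A" "B"
```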

How do I keep defaults short?

This technique is best used when the set of possible values is short, as otherwise you run the risk of dominating the function spec with this one argument (Chapter 9). If you have a long list of possibilities, there are three possible solutions:

Set a single default and supply the possible values to match.arg()/arg_match():

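rank2 <- function(x, ties.method = "average") {
  # Supply the full set of values explicitly so the default can stay short
  ties.method <- rlang::arg_match(
    ties.method, 
    c("average", "first", "last", "random", "max", "min")
  )
  rank(x, ties.method = ties.method)
}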

If the values are used by many functions, you can store the options in an exported vector:

```
ties.methods <- c("average", "first", "last", "random", "max", "min")

rank2 <- function(x, ties.method = ties.methods) {
  ties.method <- rlang::arg_match(ties.method)
  rank(x, ties.method = ties.method)
}
```

For example, stats::p.adjust() uses method = p.adjust.methods, and stats::pairwise.prop.test(), stats::pairwise.t.test(), and stats::pairwise.wilcox.test() all use p.adjust.method = p.adjust.methods.

You can store the options in an exported named list². This has the advantage that you can advertise both the source of the values and the default, and the user gets a nice autocomplete of the possible values.

library(rlang)
ties <- as.list(set_names(c("average", "first", "last", "random", "max", "min")))

rank2 <- function(x, ties.method = ties$average) {
  ties.method <- arg_match(ties.method, names(ties))
  rank(x, ties.method = ties.method)
}
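
The named-list approach can also be sketched in base R alone. The names ties_list and rank_list() below are hypothetical, and the sketch swaps in stats::setNames() and match.arg() for the rlang helpers, so unlike arg_match() it still allows partial matching:

```r
# Build a list whose names and values are both the option strings
ties_list <- as.list(stats::setNames(
  nm = c("average", "first", "last", "random", "max", "min")
))

rank_list <- function(x, ties.method = ties_list$average) {
  # The choices are supplied explicitly because the default is a single value
  ties.method <- match.arg(ties.method, names(ties_list))
  rank(x, ties.method = ties.method)
}

x <- c(1, 2, 2, 3, 3, 3)
rank_list(x)
#> [1] 1.0 2.5 2.5 5.0 5.0 5.0

# Typing `ties_list$` in an IDE offers the possible values
rank_list(x, ties.method = ties_list$min)
#> [1] 1 2 2 4 4 4
```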

¹ The main downside of this technique is that many users aren’t aware of this convention and that the first value of the vector will be used as a default.
² Thanks to Brandon Loudermilk.