rank <- function(
  x,
  ties.method = c("average", "first", "last", "random", "max", "min")
) {
  ties.method <- match.arg(ties.method)
  switch(ties.method,
    average = ,
    min = ,
    max = .Internal(rank(x, length(x), ties.method)),
    first = sort.list(sort.list(x)),
    last = sort.list(rev.default(sort.list(x, decreasing = TRUE))),
    random = sort.list(order(x, stats::runif(length(x))))
  )
}
x <- c(1, 2, 2, 3, 3, 3)
rank(x)
#> [1] 1.0 2.5 2.5 5.0 5.0 5.0
rank(x, ties.method = "first")
#> [1] 1 2 3 4 5 6
rank(x, ties.method = "min")
#> [1] 1 2 2 4 4 4
Enumerate possible options
What’s the pattern?
If the possible values of an argument are a small set of strings, set the default argument to the set of possible values, and then use match.arg() or rlang::arg_match() in the function body. This convention advertises to the knowledgeable user^{1} what the possible values are, and makes it easy to generate an informative error message for inappropriate inputs. This interface is often coupled with an implementation that uses switch().
Because the advertisement happens in the function specification, you see the possible values in tooltips and autocomplete without having to look at the documentation.
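As a minimal sketch of the pattern, the function below enumerates its options in the default and dispatches with switch(). The function centre() and its three methods are hypothetical, invented here for illustration; only match.arg(), switch(), mean(), and stats::median() are real base R functions.

```r
# Hypothetical helper (not from any package): summarise the centre of a
# numeric vector using one of a small, enumerated set of methods.
centre <- function(x, method = c("mean", "median", "midrange")) {
  # Validates `method` and picks the first value when none is supplied.
  method <- match.arg(method)
  switch(method,
    mean = mean(x),
    median = stats::median(x),
    midrange = (min(x) + max(x)) / 2
  )
}

centre(c(1, 2, 10))                     # falls back to "mean"
centre(c(1, 2, 10), method = "median")
```

Calling centre(c(1, 2, 10), method = "mode") fails with an error listing the three permitted values, because "mode" is not in the enumerated set.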
What are some examples?
In difftime(), units can be any one of “auto”, “secs”, “mins”, “hours”, “days”, or “weeks”.
In format(), justify can be “left”, “right”, “centre”, or “none”.
In trimws(), you can choose which side to remove whitespace from: “both”, “left”, or “right”.
In rank(), ties.method selects one of six methods for handling ties: “average”, “first”, “last”, “random”, “max”, or “min”.
quantile() exposes nine different approaches to computing a quantile through the type argument.
p.adjust() exposes eight strategies for adjusting P values to account for multiple comparisons through the method argument, whose default is the exported vector p.adjust.methods.
How do I use this pattern?
To use this technique, set the default value to a character vector, where the first value is the default. Inside the function, use match.arg() or rlang::arg_match() to check that the value comes from the known good set, and pick the default if none is supplied.
Take rank(), for example. The heart of its implementation is the function definition listed at the start of this page, together with a few example calls on x.
Note that match.arg() will automatically throw an error if the value is not in the set:
rank(x, ties.method = "middle")
#> Error in match.arg(ties.method): 'arg' should be one of "average", "first", "last", "random", "max", "min"
It also supports partial matching, so the following code is shorthand for ties.method = "random":
rank(x, ties.method = "r")
#> [1] 1 2 3 4 6 5
We prefer to avoid partial matching: it saves a little time when writing the code, but it makes the code harder to read. rlang::arg_match() is an alternative to match.arg() that doesn’t support partial matching. It instead provides a helpful error message:
rank2 <- function(
  x,
  ties.method = c("average", "first", "last", "random", "max", "min")
) {
  ties.method <- rlang::arg_match(ties.method)
  rank(x, ties.method = ties.method)
}
rank2(x, ties.method = "r")
#> Error in `rank2()`:
#> ! `ties.method` must be one of "average", "first", "last", "random",
#> "max", or "min", not "r".
#> ℹ Did you mean "random"?
# It also provides a suggestion if you misspell the argument
rank2(x, ties.method = "avarage")
#> Error in `rank2()`:
#> ! `ties.method` must be one of "average", "first", "last", "random",
#> "max", or "min", not "avarage".
#> ℹ Did you mean "average"?
Escape hatch
It’s sometimes useful to build in an escape hatch from canned strategies. This gives users access to alternative strategies, and allows for experimentation that can later turn into official strategies. One example of such an escape hatch is name repair, which occurs in many places throughout the tidyverse. One place you might encounter it is in tibble():
tibble::tibble(a = 1, a = 2)
#> Error in `tibble::tibble()`:
#> ! Column name `a` must not be duplicated.
#> Use `.name_repair` to specify repair.
#> Caused by error in `repaired_names()`:
#> ! Names must be unique.
#> ✖ These names are duplicated:
#> * "a" at locations 1 and 2.
Beneath the surface, all tidyverse functions that expose some sort of name repair eventually end up calling vctrs::vec_as_names():
vctrs::vec_as_names(c("a", "a"), repair = "check_unique")
#> Error:
#> ! Names must be unique.
#> ✖ These names are duplicated:
#> * "a" at locations 1 and 2.
vctrs::vec_as_names(c("a", "a"), repair = "unique")
#> New names:
#> • `a` -> `a...1`
#> • `a` -> `a...2`
#> [1] "a...1" "a...2"
vctrs::vec_as_names(c("a", "a"), repair = "unique_quiet")
#> [1] "a...1" "a...2"
vec_as_names() exposes six strategies, but it also allows you to supply a function:
vctrs::vec_as_names(c("a", "a"), repair = toupper)
#> [1] "A" "A"
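To offer a similar escape hatch in your own function, one option is to accept either an enumerated string or a function. The sketch below assumes a hypothetical function centre2() whose method argument works this way; it is an illustration of the idea, not how vec_as_names() is implemented.

```r
# Hypothetical function: `method` is either one of the enumerated strings
# or a user-supplied function that takes a numeric vector.
centre2 <- function(x, method = c("mean", "median")) {
  # Escape hatch: a function bypasses the canned strategies entirely.
  if (is.function(method)) {
    return(method(x))
  }
  method <- match.arg(method)
  switch(method,
    mean = mean(x),
    median = stats::median(x)
  )
}

centre2(c(1, 2, 10))                # canned strategy, default "mean"
centre2(c(1, 2, 10), method = max)  # user-supplied strategy
```

The function check comes first because match.arg() only accepts NULL or a character vector, so a function would otherwise trigger an error.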
How do I keep defaults short?
This technique is best used when the set of possible values is short, as otherwise you run the risk of dominating the function spec with this one argument (Chapter 9). If you have a long list of possibilities, there are three possible solutions:

Set a single default and supply the possible values to match.arg()/arg_match():
rank2 <- function(x, ties.method = "average") {
  ties.method <- rlang::arg_match(
    ties.method,
    c("average", "first", "last", "random", "max", "min")
  )
}

If the values are used by many functions, you can store the options in an exported vector:
ties.methods <- c("average", "first", "last", "random", "max", "min")
rank2 <- function(x, ties.method = ties.methods) {
  ties.method <- rlang::arg_match(ties.method)
}
For example, stats::p.adjust() uses method = p.adjust.methods, and stats::pairwise.prop.test(), stats::pairwise.t.test(), and stats::pairwise.wilcox.test() all use p.adjust.method = p.adjust.methods.
You can store the options in an exported named list^{2}. This has the advantage that you can advertise both the source of the values and the default, and the user gets a nice autocomplete of the possible values.
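The named-list option can be sketched as follows. The names ties_methods and rank3() are hypothetical and exist only for this illustration; the sketch uses base R's match.arg() with an explicit choices argument to validate the input.

```r
# Hypothetical exported object: a named list whose names and values coincide,
# so that typing `ties_methods$` autocompletes to the available options.
ties_methods <- list(
  average = "average", first = "first", last = "last",
  random = "random", max = "max", min = "min"
)

# Hypothetical wrapper: the default shows both where the values live
# and which one is used when the argument is not supplied.
rank3 <- function(x, ties.method = ties_methods$average) {
  ties.method <- match.arg(
    ties.method,
    choices = unlist(ties_methods, use.names = FALSE)
  )
  base::rank(x, ties.method = ties.method)
}

rank3(c(1, 2, 2))
rank3(c(1, 2, 2), ties.method = ties_methods$min)
```

Plain strings such as ties.method = "min" still work, because the list elements are ordinary character values.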