my_fun_opts <- function(opt1 = 1, opt2 = 2) {
  list(
    opt1 = opt1,
    opt2 = opt2
  )
}
Reduce argument clutter with an options object
What’s the problem?
If you have a large number of optional arguments that control the fine details of the operation of a function, it might be worth lumping them all together into a separate “options” object created by a helper function.
Having a large number of less important arguments makes it harder to see the most important ones. By moving rarely used and less important arguments to a secondary function, you can more easily draw attention to what matters most.
What are some examples?
- Many base R modelling functions, like loess(), glm(), and nls(), have a control argument that is paired with a helper function: loess.control(), glm.control(), and nls.control() respectively. These helpers let you modify rarely used defaults, including the number of iterations, the stopping criteria, and some debugging options.

  optim() uses a less formal version of this structure: while it has a control argument, it doesn’t have a matching optim.control() helper. Instead, you supply a named list with components described in ?optim. A helper function is more convenient than a named list because it checks the argument names for free and gives nicer autocomplete to the user.

  This pattern is common in other modelling packages, e.g. tune::fit_resamples() + tune::control_resamples(), tune::control_bayes(), tune::control_grid(), and caret::train() + caret::trainControl().

- readr::read_delim() and friends take a locale argument which is paired with the readr::locale() helper. This object bundles together a bunch of options related to parsing numbers, dates, and times that vary from country to country. readr::locale() itself has a date_names argument that’s paired with the readr::date_names() and readr::date_names_lang() helpers. You typically use the argument by supplying a two letter locale (which date_names_lang() uses to look up common languages), but if your language isn’t supported you can use readr::date_names() to individually supply full and abbreviated month and day of week names.
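To make the contrast between the two base R styles concrete, here is a small sketch using only base R. The data set (mtcars), the model formula, and the rosenbrock() test function are arbitrary choices for illustration, not part of the original discussion.

```r
# Helper-function style: glm() pairs its control argument with glm.control().
ctrl <- glm.control(maxit = 50)
fit <- glm(mpg ~ wt, data = mtcars, control = ctrl)

# Because glm.control() is an ordinary function, a misspelled option name
# fails immediately with an error instead of being silently accepted.
err <- tryCatch(glm.control(maxiter = 50), error = function(e) e)

# Named-list style: optim() has no helper, so options travel as a plain list
# whose valid names are documented in ?optim.
rosenbrock <- function(p) 100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2
res <- optim(c(-1.2, 1), rosenbrock, control = list(maxit = 1000))
```

With the helper, the set of valid options is visible in the function signature; with the list, the user has to consult the documentation.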
On the other hand, some functions with many arguments that would benefit from this technique include:
- readr::read_delim() has a lot of options that control rarely needed details of file parsing (e.g. escape_backslash, escape_double, quoted_na, comment, trim_ws). These make the function specification very long and might well be better in a details object.
- ggplot2::geom_smooth() fits a smooth line to your data. Most of the time you only want to pick the method and formula used, but geom_smooth() (via ggplot2::stat_smooth()) also provides n, fullrange, span, level, and method.args to control details of the fit. I think these would be better in their own details object.
How do I use this pattern?
The simplest implementation is just to write a helper function that returns a list:
This alone is nice because you can document the individual arguments, you get name checking for free, and auto-complete will remind the user what these less important options include.
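As a rough illustration of how such a helper might be used, the sketch below pairs my_fun_opts() with a main function. The body of my_fun() is a hypothetical placeholder, chosen only so that the effect of the options is visible.

```r
# Options helper: a plain function that returns a list.
my_fun_opts <- function(opt1 = 1, opt2 = 2) {
  list(
    opt1 = opt1,
    opt2 = opt2
  )
}

# Hypothetical main function: the important arguments come first, and the
# fine details are reached through a single opts argument.
my_fun <- function(x, y, opts = my_fun_opts()) {
  (x + y) * opts$opt1 + opts$opt2
}

default_result <- my_fun(1, 2)
custom_result <- my_fun(1, 2, opts = my_fun_opts(opt2 = 10))

# Name checking for free: an unknown option is an error, not a silent no-op.
err <- tryCatch(my_fun_opts(opt3 = 1), error = function(e) e)
```

Most callers never touch opts, and those who do only override the options they care about.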
Better error messages
An optional extra is to add a unique class to the list:
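A minimal sketch of a classed options object could look like the following. The class name mypackage_my_fun_opts is an assumption made for illustration; prefixing it with the package name is one way to reduce the chance of clashes with other packages.

```r
my_fun_opts <- function(opt1 = 1, opt2 = 2) {
  # structure() attaches the class attribute to the underlying list.
  structure(
    list(
      opt1 = opt1,
      opt2 = opt2
    ),
    class = "mypackage_my_fun_opts"
  )
}

opts <- my_fun_opts(opt1 = 5)
is_opts <- inherits(opts, "mypackage_my_fun_opts")
```

The object still behaves like a list, so opts$opt1 works as before.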
This then allows you to create more informative error messages:
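One possible shape for such a check is sketched below, using base R's stop() to avoid extra dependencies; a package might prefer a richer condition helper. The class name and the wording of the message are assumptions.

```r
my_fun_opts <- function(opt1 = 1, opt2 = 2) {
  structure(
    list(opt1 = opt1, opt2 = opt2),
    class = "mypackage_my_fun_opts"
  )
}

my_fun <- function(x, y, opts = my_fun_opts()) {
  # Reject anything not built by the helper, e.g. a hand-written list.
  if (!inherits(opts, "mypackage_my_fun_opts")) {
    stop("`opts` must be created by `my_fun_opts()`.", call. = FALSE)
  }
  invisible(opts)
}

# A bare list now produces a targeted error instead of failing later on.
err <- tryCatch(my_fun(1, 2, opts = list(opt1 = 5)), error = function(e) e)
msg <- conditionMessage(err)
```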
If you use this options object in many places, consider pulling the repeated check out into a check_my_fun_opts() function.
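A shared checker might be sketched like this. The arg parameter and the second function, my_other_fun(), are hypothetical additions used to show the check being reused.

```r
my_fun_opts <- function(opt1 = 1, opt2 = 2) {
  structure(
    list(opt1 = opt1, opt2 = opt2),
    class = "mypackage_my_fun_opts"
  )
}

# Single place that knows how to validate the options object.
check_my_fun_opts <- function(opts, arg = "opts") {
  if (!inherits(opts, "mypackage_my_fun_opts")) {
    stop(sprintf("`%s` must be created by `my_fun_opts()`.", arg), call. = FALSE)
  }
  invisible(opts)
}

my_fun <- function(x, y, opts = my_fun_opts()) {
  check_my_fun_opts(opts)
  opts$opt1
}

my_other_fun <- function(x, opts = my_fun_opts()) {
  check_my_fun_opts(opts)
  opts$opt2
}

err <- tryCatch(my_other_fun(1, opts = list()), error = function(e) e)
```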
How do I remediate past mistakes?
Typically you notice this problem only after you have created too many options, so you’ll need to carefully remediate by introducing a new options argument and a paired helper function. For example, suppose your existing function looks like this:
my_fun <- function(x, y, opt1 = 1, opt2 = 2) {
}
If you want to keep the existing function specification, you could add a new opts argument that uses the values of opt1 and opt2:
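my_fun <- function(x, y, opts = NULL, opt1 = 1, opt2 = 2) {
  # %||% returns its right-hand side when the left is NULL
  # (available from rlang, or base R >= 4.4.0)
  opts <- opts %||% my_fun_opts(opt1 = opt1, opt2 = opt2)
}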
However, that introduces a dependency between the arguments: if you specify both opts and opt1/opt2, opts will win. You could certainly add extra code to pick up on this problem and warn the user, but I think it’s just cleaner to deprecate the old arguments so that you can eventually remove them:
my_fun <- function(x, y,
                   opts = my_fun_opts(),
                   opt1 = lifecycle::deprecated(),
                   opt2 = lifecycle::deprecated()) {
  # If a deprecated argument was supplied, warn and forward its value to opts
  if (lifecycle::is_present(opt1)) {
    lifecycle::deprecate_warn("1.0.0", "my_fun(opt1)", "my_fun_opts(opt1)")
    opts$opt1 <- opt1
  }
  if (lifecycle::is_present(opt2)) {
    lifecycle::deprecate_warn("1.0.0", "my_fun(opt2)", "my_fun_opts(opt2)")
    opts$opt2 <- opt2
  }
}
Then you can remove the old arguments in a future release.
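For comparison, the alternative of keeping the old arguments and warning about conflicts might be sketched as follows, using only base R. The warning text is an assumption, and missing() is just one way to detect whether the caller supplied opt1 or opt2 explicitly.

```r
my_fun_opts <- function(opt1 = 1, opt2 = 2) {
  list(opt1 = opt1, opt2 = opt2)
}

my_fun <- function(x, y, opts = NULL, opt1 = 1, opt2 = 2) {
  # Warn when the caller mixes the new opts argument with the old arguments,
  # because opts takes precedence and opt1/opt2 would be silently dropped.
  if (!is.null(opts) && (!missing(opt1) || !missing(opt2))) {
    warning("`opt1` and `opt2` are ignored when `opts` is supplied.", call. = FALSE)
  }
  if (is.null(opts)) {
    opts <- my_fun_opts(opt1 = opt1, opt2 = opt2)
  }
  opts
}

# Old-style call still works; mixing both styles triggers the warning.
old_style <- my_fun(1, 2, opt1 = 3)
w <- tryCatch(
  my_fun(1, 2, opts = my_fun_opts(opt1 = 5), opt1 = 3),
  warning = function(w) w
)
```

This keeps both interfaces alive indefinitely, which is why deprecating the old arguments tends to be the cleaner long-term choice.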
See also
- Chapter 14 describes a similar pattern for when you have multiple options functions that each encapsulate a different strategy.