Name all but the most important arguments
What’s the pattern?
When calling a function, you should name all but the most important arguments. For example:
Never use partial matching, like below. Partial matching was useful in the early days of R because when you were doing a quick and dirty interactive analysis you could save a little time by shortening argument names. However, today, most R editing environments support autocomplete so partial matching only saves you a single keystroke, and it makes code substantially harder to read.
mean(y, n = TRUE)
#> [1] 5.5
Avoid relying on position matching with empty arguments:
mean(y, , TRUE)
#> [1] 5.5
And don’t name arguments that can you expect users to be familiar with:
mean(x = y)
#> [1] NA
You can make R give you are warning that you’re using a partially named argument with a special option. Call usethis::use_partial_warnings()
to make this the default for all R sessions.
Why is this useful?
I think it’s reasonable to assume that the reader knows what a function does then they know what the one or two most important arguments are, and repeating their names just takes up space without aiding communication. For example, it’s reasonable to assume that people can remember that the first argument to log()
is x
and the first two arguments to dplyr::left_join()
are x
and y
.
However, I don’t think that most people will remember more than the one or two most important arguments, so you should name the rest. For example, I don’t think that most people know that the second argument to mean()
is trim
or that the second argument to median()
is na.rm
even though I expect most people to know what the first arguments are. Spelling out the names makes it easier to understand when others (including future you) are reading the code.
What are the exceptions?
There are two main exceptions to this principle: when teaching functions and when one argument is particularly long.
When teaching a function for the first time, you can’t expect people to know what the arguments are, so it make sense to supply all names to help people understand exactly what’s going on. For example, in R for Data Science when we introduce ggplot2 we write code like:
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point()
At the end of the chapter, we assume that the reader is familiar with the basic structure and so the rest of the book uses the style recommended here:
ggplot(mpg, aes(`displ, hwy)) +
geom_point()
There are also the occasional case when the first argument might be quite long, and there’s a couple of short options that you also want to set. If the long argument comes first, you may have to re-interpret what the function is doing when you finally hit the options. I think this comes up most often when an argument usually receives code inside of {}
but it can crop up when manually generating data too.
writeLines(con = "test.txt", c(
"line1",
"line2",
"line3"
))
expect_snapshot(error = TRUE, {
line1
line2
line3
})