Argument meaning should be independent

What’s the problem?

Avoid having one argument change the interpretation of another argument. This makes it harder to understand a call, because as you read a call, you might need to go back and to re-interpret an earlier argument. This sort of call can lead to code that reads like a garden path sentence, a human language problem where you need to re-parse a sentence when you get to the end of it. For example, in “the horse raced past the barn fell”, your initial understanding of”raced” needs to be modified when you to get to the end of the sentence in order for it to make sense.

Another way you will come across this problem is when only certain combinations of arguments are allowed or if one argument is ignored if another argument has a certain value.

What are some examples?

There aren’t too many examples of one argument changing the meaning of another argument but here are few I dug up:

  • In library() the character.only argument changes how the package argument is interpreted:

    package <- "dplyr"
    
    # Loads a package called "package"
    library(package)
    
    # Loads dplyr
    library(package, character.only = TRUE)
  • In ggplot2::geom_text() setting parse = TRUE causes the contents of the label aesthetic to be interpreted as mathematical equations, rather simple text.

  • In install.packages() setting repos = NULL changes the interpretation of pkgs from being a vector of package names to a vector of file paths.

  • In findInterval() if you set left.open = TRUE then the rightmost.closed argument actually controls the whether or not the leftmost interval is closed.

A subtler example of this problem also arises in grepl() and friends where you can’t fully interpret the pattern until you have seen if the fixed argument is set. This is one of the patterns that heavily influenced the design of stringr, and is discussed more in Chapter 14.

There are quite a few functions that only allow certain combinations of arguments:

  • read.table() allows you to supply data with either a path to a file, or with in line text. If you supply both, path wins.

How do I remediate past mistakes?

There isn’t a single solution to this problem and remediating the problem will require a situation dependent technique. For example, each of the cases above requires a different technique:

  • library() could use the same mechanism as help() where help((topic))1 will always look for the topic recorded in the topic variable, rather than the topic literally called “topic”.

  • Instead of geom_text(parse = TRUE), maybe it would be better to have geom_equation(). However, this change would be challenging because ggplot2 has another functions with the parse argument: geom_label(), which is like geom_text() but draws a rectangle behind the text, often making it easier to read. Maybe it would be better to make this an argument (background = TRUE) but that would leave three arguments (label.r, label.padding, label.size) that only make sense when background = TRUE. So maybe we could do something like background = label(), where the new label function would have r, padding, and size arguments. This would also make it possible to specify different types of backgrounds.

  • install.packages() feels like a function that has grown organically over time: it started out simple, but gained more and more features over time. I suspect improving the design would involve recognising Chapter 15 and breaking it apart into multiple functions, possibly using Chapter 11 for the common, less important arguments.

  • findInterval() could be fixed by using an argument name that isn’t direction specific. One possible option would be extemum.closed. Extremum is technical term that most people probably aren’t familiar with, but it’s also for a fairly uncommon argument in a rarely function, so is probably fine.

Cases where arguments have complex dependencies often require techniques from the “Stategies” part of the book.


  1. If library() were a tidyverse function it would use tidyeval, and so you’d write library(!!package) if you wanted to refer to the package name stored in the package variable.↩︎