5 Code Documentation

5.1 Introduction

Documentation of code is essential, even if the only person using your code is future-you. Use roxygen2 with markdown support enabled to keep your documentation close to the code.

5.2 Title and description

For the title, describe concisely what the function does in the very first line of your function documentation. Titles should use sentence case but not end with a full stop.

#' Combine values into a vector or list
#'
#' This is a generic function which combines its arguments.

There is no need to use the explicit @title or @description tags, except for a description that spans multiple paragraphs or includes more complex formatting such as a bulleted list.

#' Apply a function to each element of a vector
#'
#' @description
#' The map functions transform the input, returning a vector the same length
#' as the input.
#'
#' * `map()` returns a list or a data frame.
#' * `map_lgl()`, `map_int()`, `map_dbl()` and `map_chr()` return vectors of
#'   the corresponding type (or die trying).
#' * `map_dfr()` and `map_dfc()` return data frames created by row-binding
#'   and column-binding respectively. They require dplyr to be installed.

5.3 Indentation

Always indent with one space after #'. If the description of a roxygen tag spans multiple lines, indent the continuation lines by another two spaces.

#' @param key The bare (unquoted) name of the column whose values will be used 
#'   as column headings. 

Alternatively, for tags whose content spans multiple lines (like @examples and @section), put the tag on its own line; subsequent lines then need no extra indentation.

#' @examples
#' 1 + 1
#' sin(pi)

The section ‘Description’ does not need extra indentation either.
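As a sketch of how both indentation styles look in one documentation block: the function `repeat_string()` and its parameters are hypothetical and only serve as illustration. The multi-line @param descriptions get two extra spaces, while the lines under @examples do not.

```r
#' Repeat a string
#'
#' Repeats each element of a character vector a fixed number of times.
#'
#' @param string A character vector whose elements are repeated and pasted
#'   together without a separator.
#' @param times A single non-negative integer giving the number of
#'   repetitions.
#'
#' @return A character vector the same length as `string`.
#'
#' @examples
#' repeat_string("ab", 2)
#' repeat_string(c("x", "y"), 3)
repeat_string <- function(string, times) {
  # Hypothetical helper; strrep() is the base R function doing the work.
  strrep(string, times)
}
```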

5.4 Documenting parameters

For most tags, like @param, @seealso and @return, the text should be a sentence, starting with a capital letter and ending with a full stop.

#' @param key The bare (unquoted) name of the column whose values will be used 
#'   as column headings. 

If some functions share parameters, you can use @inheritParams to avoid duplication of content in multiple places.

#' @inheritParams function_to_inherit_from
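A minimal sketch of how inherited parameters might look across two functions. Both `rescale_unit()` and `rescale_percent()` are hypothetical names made up for this example; the second function documents neither `x` nor `na.rm` itself and relies on @inheritParams to pull in the descriptions from the first.

```r
#' Rescale a numeric vector to the unit interval
#'
#' @param x A numeric vector to rescale.
#' @param na.rm Should missing values be ignored when computing the range?
#'
#' @return A numeric vector the same length as `x`.
rescale_unit <- function(x, na.rm = TRUE) {
  rng <- range(x, na.rm = na.rm)
  (x - rng[1]) / (rng[2] - rng[1])
}

#' Rescale a numeric vector to percentages
#'
#' @inheritParams rescale_unit
#'
#' @return A numeric vector of percentages the same length as `x`.
rescale_percent <- function(x, na.rm = TRUE) {
  # Reuses rescale_unit(), so the shared arguments mean the same thing.
  100 * rescale_unit(x, na.rm = na.rm)
}
```

Only parameters that the inheriting function does not document itself are filled in, so a function can still override individual descriptions with its own @param lines.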

5.5 Capitalization and full stops

For all bullets, enumerations, argument descriptions and the like, use sentence case and put a full stop at the end of each text element, even if it is only a few words. However, do not capitalize function or package names, since R is case sensitive. Use a colon before enumerations or bulleted lists.

#' @details In the following, we present the bullets of the list:
#' * Four cats are few animals.
#' * forcats is a package.

5.6 Reference

Cross-referencing is encouraged, both within R’s help system and to external resources. Include parentheses after function names when referencing function calls.

#' @seealso [fct_lump()] 

If you have a family of related functions, you can use the @family tag to automatically add appropriate lists and interlinks to the @seealso section. Family names are plural. In dplyr, the verbs arrange(), filter(), mutate(), slice() and summarize() form the family of single table verbs.

#' @family single table verbs
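To illustrate, here is a sketch with two hypothetical functions that share a made-up family name, "temperature converters". Tagging both with the same @family value should lead roxygen2 to link each help page to the other in its "See also" section.

```r
#' Convert Celsius to Fahrenheit
#'
#' @param celsius A numeric vector of temperatures in degrees Celsius.
#'
#' @return A numeric vector of temperatures in degrees Fahrenheit.
#'
#' @family temperature converters
celsius_to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32
}

#' Convert Fahrenheit to Celsius
#'
#' @param fahrenheit A numeric vector of temperatures in degrees Fahrenheit.
#'
#' @return A numeric vector of temperatures in degrees Celsius.
#'
#' @family temperature converters
fahrenheit_to_celsius <- function(fahrenheit) {
  (fahrenheit - 32) * 5 / 9
}
```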

5.7 Line break

Leave one blank line before and after each description.

#' @section Tidy data:
#' When applied to a data frame, row names are silently dropped. To preserve,
#' convert to an explicit variable with [tibble::rownames_to_column()].
#'
#' @section Scoped filtering:
#' The three [scoped] variants ([filter_all()], [filter_if()] and
#' [filter_at()]) make it easy to apply a filtering condition to a
#' selection of variables.

If you want to insert a line break within a section, also leave a blank line in the roxygen comments.

#' @param ... Data frames to combine.
#'
#'   Each argument can either be a data frame, a list that could be a data
#'   frame, or a list of data frames.
#'
#'   When row-binding, columns are matched by name, and any missing
#'   columns will be filled with `NA`.
#'
#'   When column-binding, rows are matched by position, so all data
#'   frames must have the same number of rows. To match by value, not
#'   position, see [join()].

5.8 Code font

Text that contains valid R code should be marked as such.

  • Function names, which should be followed by (), e.g. tibble().
  • Function arguments, e.g. na.rm.
  • Values, e.g. TRUE, FALSE, NA, NaN, ..., NULL.
  • Literal R code, e.g. mean(x, na.rm = TRUE).

Do not use code font for package names. If a package name is ambiguous in the context, disambiguate with words, e.g. “the foo package”.
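With markdown support enabled, code font is written with backticks. The following sketch uses a hypothetical function, `count_missing()`, to show function names, arguments, values and literal code in code font, while the package name tibble stays in plain text.

```r
#' Count missing values
#'
#' Counts the `NA` entries in `x`, which is equivalent to `sum(is.na(x))`.
#' Unlike `mean()`, this function has no `na.rm` argument. Columns of a data
#' frame created with the tibble package can be passed directly.
#'
#' @param x A vector.
#'
#' @return An integer count of the missing values in `x`.
count_missing <- function(x) {
  sum(is.na(x))
}
```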

5.9 Internal functions

Internal functions should be documented with #' comments as usual. Use the @noRd tag to prevent .Rd files from being generated.

#' Drop last
#'
#' Drops the last element from a vector.
#'
#' @param x A vector object to be trimmed.
#'
#' @noRd
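For completeness, a sketch of the same documentation block attached to a function body. The source shows only the comments, so the implementation of `drop_last()` below is an assumption chosen for illustration.

```r
#' Drop last
#'
#' Drops the last element from a vector.
#'
#' @param x A vector object to be trimmed.
#'
#' @noRd
drop_last <- function(x) {
  # Assumed implementation: head() with a negative n drops trailing elements.
  head(x, -1)
}
```

Because of @noRd, no help page is generated for `drop_last()`, but the comments remain available to anyone reading the source.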