The (ongoing) Maddening State of R Documentation

I used to use R a lot, and I (slowly) started to like it. There are many things to like about the R/tidyverse ecosystem. At the lowest level, the S3 object model is really elegant and powerful: it’s essentially classes as tags. Further up, dplyr makes it easy to do complex data transformations in ways that still make sense six months later. And finally, there is no plotting library as nice as ggplot.

But R documentation seems to be designed to be bad. Somewhere, someone found a checklist for “Things not to do in software documentation” and decided to use it as a best-practices guide instead. This shows up most powerfully in the documentation for ggplot’s discrete color scales [1]. There are two functions documented on this page, scale_colour_discrete and scale_fill_discrete.

The page also has a number of examples of how you can use these functions. It even calls out that a good reason to use them is to get a color-blind safe palette. The examples are twice as long as the actual function documentation (although they do include plots, which pads the length a bit).

[Figure: Documentation of a function where the examples don't actually use the function. It sort of covers the function, but you need to know about options.]

What the examples do not have is an actual use of scale_colour_discrete (or its fill counterpart). They also use a relatively advanced technique to make the changes temporary if you were to run the example. All of this combines to make the examples really unhelpful. They made sense to me, but only because I’ve spent years using ggplot and I know how it is put together. There are no pathways to any additional documentation that might help scaffold a user’s education.
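For readers who have not seen it, the "temporary change" technique looks roughly like the sketch below. This is my reconstruction of the pattern, not a copy of the page: it assumes ggplot2 around version 3.5.1, the withr package being installed, and the mpg dataset that ships with ggplot2. The names okabe and p are mine.

```r
library(ggplot2)

# Okabe-Ito palette (colour-blind safe)
okabe <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442",
           "#0072B2", "#D55E00", "#CC79A7")

# An ordinary plot with a discrete colour aesthetic and no explicit scale
p <- ggplot(mpg, aes(displ, hwy, colour = drv)) + geom_point()

# with_options() sets the option, evaluates the code, then restores the
# previous value. print() forces the plot to be drawn while the option
# is still in effect.
withr::with_options(
  list(ggplot2.discrete.colour = okabe),
  print(p)
)

# Back outside the call, the option has its earlier value again
getOption("ggplot2.discrete.colour")
```

Note that nothing here calls scale_colour_discrete by name. The default scale simply reads the option when the plot is built, which is the step a newcomer has no way of knowing about.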

It’s particularly disappointing because this is a teachable moment. All it would take is an extra paragraph that says something like “ggplot uses ‘options’ to customize a graph, and you can use them to make small or large changes to a graph. These functions are mostly just convenience functions that make those changes easier. In the examples we’ll show both these functions and touching the underlying options directly.”
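To make that concrete, here is a minimal sketch of the side-by-side example I have in mind. It is not taken from the ggplot2 documentation. It assumes ggplot2 around version 3.5.1, where scale_colour_discrete takes a type argument that defaults to the ggplot2.discrete.colour option, and it uses the bundled mpg dataset. The names pal, p, and p_direct are mine.

```r
library(ggplot2)

# Three colour-blind safe colours, one per level of mpg$drv
pal <- c("#E69F00", "#56B4E9", "#009E73")

p <- ggplot(mpg, aes(displ, hwy, colour = drv)) + geom_point()

# 1. Use the documented function directly: changes this plot only
p_direct <- p + scale_colour_discrete(type = pal)
cols_direct <- unique(layer_data(p_direct)$colour)

# 2. Touch the underlying option: changes the default for every plot
#    built while the option is set
old <- options(ggplot2.discrete.colour = pal)
cols_option <- unique(layer_data(p)$colour)
options(old)  # put the previous setting back

# Under the assumptions above, both routes should pick the same colours
identical(cols_direct, cols_option)
```

The first route is what most people want for a single plot. The second is handy for a whole script or session, as long as you remember to restore the old value afterwards.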

I haven’t been using R as much over the past few months. I only fired it up because I needed to plot the positive values of the ellipse equation against some observed values. I didn’t know enough Python to deal with the graphics part, and I don’t have Jupyter on my personal laptop. But while I am slowly re-skilling myself on Python, I have never once wanted to throw the documentation across the room.

  1. I was looking at V3.5.1 
