In this series of three posts, we’ll look at colours in R graphics produced with
ggplot2: which colour schemes are available, and how do we choose the palette most suitable for a particular graphic?
In kindergarten, choosing a colour was easy: palettes were limited to a few classics. As cool kids grow older and use R, the spectrum expands to present us with an overwhelming choice of millions of colours, most of them with poorly defined labels such as
"lavenderblush3". Inasmuch as scientific graphics resemble a paint-by-numbers game, R can help us design more elegant palettes with pertinent colour choices based on the data to display.
Base graphics rely mostly on the
grDevices package for the selection of colours, with a few palettes to choose from:
(some palettes can have many more colours; this image is only an illustration of their structure)
The package also provides a number of basic operations to convert colours (
adjustcolor, col2rgb, make.rgb, rgb2hsv, convertColor), to specify them (rgb, hsv, hcl, gray), and to create interpolating palettes (
colorRamp, colorRampPalette, densCols, gray.colors).
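As a minimal sketch of how a few of these grDevices functions fit together, the snippet below converts a named colour, makes a colour semi-transparent, and builds an interpolating palette. The colours and variable names are chosen purely for illustration.

```r
# grDevices ships with base R and is attached by default.

# Convert a named colour to its red/green/blue components (a 3 x 1 matrix).
rgb_values <- col2rgb("lavenderblush3")

# Make a colour semi-transparent; the result is a hex string with an alpha channel.
faded_red <- adjustcolor("red", alpha.f = 0.5)

# colorRampPalette() returns a function; calling it with n yields n interpolated colours.
ramp <- colorRampPalette(c("white", "steelblue"))
ramp_cols <- ramp(5)

print(rgb_values)
print(faded_red)
print(ramp_cols)
```

Note that colorRampPalette() returns a palette function rather than a vector of colours, a pattern that the scales package also follows.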
Beyond that, a good resource is the
colorspace package, which provides further utilities to convert from one colour space to another (
HLS, HSV, LAB, LUV, RGB, sRGB, XYZ) and perform various operations on colours.
Of special note are a few palette functions, “diverge_hcl”, “diverge_hsv”, “heat_hcl”, “rainbow_hcl”, “sequential_hcl”, “terrain_hcl”, which provide an easy way to produce colour palettes following a particular path in the colour space (varying hue with constant luminance and chroma, for example).
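The following sketch shows what these colorspace utilities look like in use, assuming the colorspace package is installed. The numbers of colours and the example colour are arbitrary choices for illustration.

```r
library(colorspace)

# Convert a colour from sRGB to the LUV colour space and inspect its coordinates.
red_srgb <- sRGB(1, 0, 0)
red_luv <- as(red_srgb, "LUV")
luv_coords <- coords(red_luv)

# Palettes that follow a path through HCL space; each call returns hex colour strings.
seq_cols <- sequential_hcl(5)   # single hue, varying luminance
div_cols <- diverge_hcl(7)      # two hues diverging from a neutral centre
rain_cols <- rainbow_hcl(4)     # varying hue at constant chroma and luminance

print(luv_coords)
print(seq_cols)
print(div_cols)
print(rain_cols)
```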
While these tools are quite flexible in combination, their user interfaces are somewhat inconsistent. More recently, the
scales package has wrapped these functions to bring some consistency to the naming schemes and to organise the different categories of palettes in a structured way:
Utility functions, such as
col2hcl, fullseq, muted, rescale, rescale_mid, rescale_none, rescale_pal, seq_gradient_pal, show_col
Palettes with a consistent interface, such as
brewer_pal, dichromat_pal, gradient_n_pal, div_gradient_pal, hue_pal, grey_pal, identity_pal, manual_pal.
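To illustrate the consistent interface, here is a short sketch assuming the scales package is installed. Each `*_pal()` function returns a palette function, which is then called with the number of colours wanted. The palette name "Blues" and the counts are arbitrary examples.

```r
library(scales)

# Discrete palettes: build the palette function, then ask for n colours.
hue_cols <- hue_pal()(4)
blue_cols <- brewer_pal(palette = "Blues")(5)
grey_cols <- grey_pal()(3)

# Utility functions: rescale values to [0, 1] and tone down a colour.
rescaled <- rescale(c(1, 5, 10))
soft_red <- muted("red")

print(hue_cols)
print(blue_cols)
print(grey_cols)
print(rescaled)
print(soft_red)

# show_col(hue_cols)  # uncomment to draw the colours as swatches
```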
The ggplot2 package uses
scales internally, and mirrors this structure. In this first part, we’ll review the basic commands to assign colours in ggplot2.
Let’s consider three plots for illustration:
p1 maps the colour of points to a continuous variable,
p2 maps the fill of bars to a discrete variable, and
p3 maps the fill of tiles to a continuous variable.
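One possible way to build three such plots is sketched below. The datasets (mtcars from base R and faithfuld from ggplot2) and the chosen variables are assumptions made for illustration, not necessarily the data behind the original figures.

```r
library(ggplot2)

# p1: colour of points mapped to a continuous variable (horsepower)
p1 <- ggplot(mtcars, aes(wt, mpg, colour = hp)) +
  geom_point()

# p2: fill of bars mapped to a discrete variable (number of cylinders)
p2 <- ggplot(mtcars, aes(factor(cyl), fill = factor(cyl))) +
  geom_bar()

# p3: fill of tiles mapped to a continuous variable (estimated density)
p3 <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_tile()
```

Printing p1, p2 or p3 at the console draws the plot with the default colour scales discussed in this post.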
Fill and colour scales in ggplot2 can use the same palettes. Some shapes such as lines only accept the colour aesthetic, while others, such as polygons, accept both colour and fill aesthetics. In the latter case, the colour refers to the border of the shape, and the fill to the interior.
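A small sketch of the difference between the two aesthetics, using a made-up data frame (the names d_bars, p_line and p_bar are hypothetical):

```r
library(ggplot2)

# Toy data, invented for this example
d_bars <- data.frame(g = c("a", "b", "c"), n = c(3, 5, 2))

# Lines only have a colour
p_line <- ggplot(d_bars, aes(g, n, group = 1)) +
  geom_line(colour = "steelblue")

# Bars are polygons: colour draws the border, fill paints the interior
p_bar <- ggplot(d_bars, aes(g, n)) +
  geom_col(colour = "black", fill = "grey80")
```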
Another common source of confusion, general to
ggplot2, is the distinction between set values and mapped values in a layer. Consider the following example,
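    library(ggplot2)
    library(gridExtra)  # provides grid.arrange()

    d <- data.frame(x = 1:10, y = rnorm(10), z = gl(5, 2))
    a <- ggplot(d, aes(x, y, group = z))
    grid.arrange(a + geom_path(colour = "red"),   # set: a constant, outside aes()
                 a + geom_path(aes(colour = z)),  # mapped: driven by z, inside aes()
                 nrow = 1)

In the first panel, the colour is set to a fixed value outside aes(), so every path is red and no legend is drawn. In the second, the colour is mapped to the variable z inside aes(), so ggplot2 assigns the colours through a scale and adds a legend.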
The default continuous scale in
ggplot2 is a blue gradient, from
low = "#132B43" to
high = "#56B1F7", which can be reproduced as
scales::seq_gradient_pal(low = "#132B43", high = "#56B1F7", space = "Lab")
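The call above returns a palette function rather than colours. As a minimal sketch, assuming the scales package is installed, the function can be evaluated at values between 0 and 1 to obtain the corresponding colours along the gradient (the choice of five evenly spaced values is arbitrary):

```r
# Build the palette function for the default continuous gradient
blue_pal <- scales::seq_gradient_pal(low = "#132B43", high = "#56B1F7", space = "Lab")

# Evaluate it at five evenly spaced positions between 0 (low) and 1 (high)
blue_cols <- blue_pal(seq(0, 1, length.out = 5))
print(blue_cols)
```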
The default discrete scale in
ggplot2 is a range of evenly spaced hues in the HCL colour space,
scales::hue_pal(h = c(0, 360) + 15, c = 100, l = 65, h.start = 0, direction = 1)
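As with the gradient, this call returns a palette function. A short sketch, assuming the scales package is installed, of how to obtain the colours a discrete scale with three levels would use (three is an arbitrary choice):

```r
# Build the palette function with the default settings quoted above
default_hues <- scales::hue_pal(h = c(0, 360) + 15, c = 100, l = 65,
                                h.start = 0, direction = 1)

# Ask for as many colours as there are levels in the discrete variable
hue_cols3 <- default_hues(3)
print(hue_cols3)
```

Requesting a different number of colours spreads the hues evenly around the colour wheel again, which is why the colours of a plot change when a level is added to the variable.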
In the next post of this series, we’ll describe how to fine-tune these default colours or change them altogether and, perhaps more importantly, give some pointers on choosing an appropriate colour scheme for a particular graphic.