We can visualize data in R using the graphics package. The graphics package is part of the standard distribution, so there is no need to install it or load it into the session.
Contents here are based on “R cookbook: Proven recipes for data analysis, statistics, and graphics” by Paul Teetor (2011).
There are many other packages we can use. One example is ggplot2, which implements the so-called “Grammar of Graphics”. It is said to make plots easier to construct and customize, and its graphics are generally more attractive. For more information on how to use ggplot2, see https://rc2e.com/graphics.
There are two levels of graphics functions.
High-level graphics functions. A high-level function starts a new graph, sets the scale, draws some adornments, and renders the graphic. Examples include plot(), boxplot(), hist(), qqnorm(), and curve().
Low-level graphics functions. A low-level function adds something to an existing graph, e.g. points, lines, text, or adornments. Examples include points(), lines(), abline(), segments(), polygon(), and text().
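As a minimal sketch of how the two levels work together, the following uses the built-in cars data set: plot() opens a new graph, and the low-level calls then add to it. The threshold of 80, the color, and the text position are arbitrary choices made for illustration only.

```r
# High-level call: starts a new graph, sets the scale, draws axes and points
plot(cars$speed, cars$dist)

# Low-level calls: each one adds to the graph that plot() created
mean_dist <- mean(cars$dist)
abline(h = mean_dist, lty = 2)                 # dashed horizontal line at the mean distance
far <- cars$dist > 80                          # arbitrary threshold, for illustration
points(cars$speed[far], cars$dist[far],
       pch = 19, col = "red")                  # re-draw the largest distances in red
text(5, mean_dist, "mean distance", pos = 3)   # label placed just above the line
```

Calling a low-level function before any high-level function fails with an error saying that plot.new has not been called yet.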
A scatter plot is a quick way to see the relationship between \(x\) and \(y\).
plot(cars$speed, cars$dist)
If the data set contains only two columns, it is even easier.
plot(cars)
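Note that the axis labels have changed: they now show the column names (speed and dist) rather than the expressions cars$speed and cars$dist.
Now we want to add a title to the plot and labels for the axes.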
plot(cars,
main = "Speed VS Stopping distance (1920)",
xlab = "Speed (mph)",
ylab = "Stopping Distance (ft)")
If we want to distinguish one group from another, we may plot multiple groups in one scatter plot. Let’s do this using the iris data set.
plot(iris$Petal.Length, iris$Petal.Width)
We can give each group its own plotting symbol using the pch argument.
plot(iris$Petal.Length, iris$Petal.Width,
pch = as.integer(iris$Species)) #Besides pch, you may use col
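As the comment above suggests, col can play the same role as pch. A minimal sketch, assuming three color names picked only for illustration:

```r
# Map each species (a factor with three levels) to a color name.
# The three color names are arbitrary choices for illustration.
group_cols <- c("red", "darkgreen", "blue")
point_cols <- group_cols[as.integer(iris$Species)]

plot(iris$Petal.Length, iris$Petal.Width,
     col = point_cols)
```

Indexing group_cols by the integer codes of the factor yields one color per row, so each point is drawn in the color of its species.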
We can also modify the point types.
symbol <- c(0,15,25)
pointVec <- symbol[as.integer(iris$Species)]
plot(iris$Petal.Length, iris$Petal.Width,
pch = pointVec)
legend() is one of the low-level graphics functions. Therefore, we need to call legend() after calling plot().
plot(iris$Petal.Length, iris$Petal.Width,
pch = as.integer(iris$Species))
legend(1, 2.5, #Coordinates of the legend box
c("Setosa", "Versicolor", "Virginica"), #labels
pch = 1:3) #must be consistent with the argument in plot()
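Instead of numeric coordinates, legend() also accepts a position keyword such as "topleft", and the labels can be taken from the factor levels so that they stay in step with the symbols. A sketch of that variant (the variable name species_names is introduced here for illustration):

```r
plot(iris$Petal.Length, iris$Petal.Width,
     pch = as.integer(iris$Species))

species_names <- levels(iris$Species)    # labels taken from the factor itself
legend("topleft",                        # keyword instead of x, y coordinates
       legend = species_names,
       pch = seq_along(species_names))   # 1:3, matching the pch used in plot()
```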
We want to add a line that shows the linear regression fitted to the data points. abline() is also a low-level graphics function.
m <- lm(data = cars, dist ~ 1 + speed)
plot(cars, main = "Speed VS Stopping distance (1920)",
xlab = "Speed (mph)",
ylab = "Stopping Distance (ft)")
abline(m)
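When given a fitted model, abline() draws the line defined by the model's intercept and slope. The sketch below makes that explicit by extracting the two coefficients and passing them as a and b; the color and line width are arbitrary.

```r
m <- lm(dist ~ speed, data = cars)
intercept <- coef(m)[["(Intercept)"]]
slope <- coef(m)[["speed"]]

plot(cars)
abline(a = intercept, b = slope, col = "blue", lwd = 2)  # same line as abline(m)
```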
We may plot a histogram of numeric values. Suppose we want to plot a sample drawn from a t-distribution with 25 degrees of freedom.
set.seed(2020) #set seed to ensure each draw yields the same numbers
samp <- rt(1000,25) #draw 1000 obs from t distribution with df=25
hist(samp)
We may suggest the number of bins by including a second argument in hist().
hist(samp, 50,
xlab = "samp", main = "Histogram of samp")
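We may add a density estimate to a histogram with lines(). The histogram must be drawn on the density scale (prob = TRUE) so that both use the same vertical axis.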
hist(samp, 50,
xlab = "samp", main = "Histogram of samp",
prob = TRUE)
lines(density(samp)) #density() computes kernel density
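Since the sample comes from a known distribution, the theoretical density can be overlaid as well. A minimal sketch using curve() with add = TRUE, which makes it draw on the existing graph instead of starting a new one; the dashed line type is an arbitrary choice.

```r
set.seed(2020)
samp <- rt(1000, 25)

hist(samp, 50, prob = TRUE)
lines(density(samp))                         # kernel density estimate from the sample
curve(dt(x, df = 25), add = TRUE, lty = 2)   # theoretical t density, dashed
```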
Saving your plot involves three steps:
1. Call a function to open a new graphics file, such as png(), jpeg(), or pdf().
2. Call plot() and its friends to generate the graphics image.
3. Call dev.off() to close the graphics file.
The file will be written to your current working directory.
pdf("samp.pdf")
hist(samp, 20, prob = TRUE)
lines(density(samp))
dev.off()
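If the plotting code fails between opening and closing the file, the device stays open. One way to guard against that is to wrap the three steps in a function and register dev.off() with on.exit(). The helper save_hist_pdf() below is hypothetical (it is not part of any package), and writing to tempdir() is only a choice made to keep the sketch from cluttering the working directory.

```r
# save_hist_pdf() is a hypothetical helper, not part of any package.
save_hist_pdf <- function(x, path) {
  pdf(path)                    # step 1: open the graphics file
  on.exit(dev.off())           # step 3: always close it, even if plotting fails
  hist(x, 20, prob = TRUE)     # step 2: draw the histogram
  lines(density(x))            #         and add the density estimate
  invisible(path)
}

out_file <- save_hist_pdf(rnorm(100), file.path(tempdir(), "samp.pdf"))
file.exists(out_file)
```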
RColorBrewer is a useful package for color schemes. It provides ready-to-use color palettes. Unlike graphics, it is not part of the standard distribution, so it must be installed before it can be loaded.
library(RColorBrewer)
We can easily view all of the palettes by using the following command.
display.brewer.all()
Each palette has a maximum number of colors (between 8 and 12, depending on the palette). We can view the colors in a single palette by using the display.brewer.pal() command. Suppose we want to view all eight colors of the palette Pastel2.
display.brewer.pal(n = 8, name = 'Pastel2')
To obtain the hexadecimal color codes, we need to use brewer.pal().
brewer.pal(n = 8, name = 'Pastel2')
## [1] "#B3E2CD" "#FDCDAC" "#CBD5E8" "#F4CAE4" "#E6F5C9" "#FFF2AE" "#F1E2CC"
## [8] "#CCCCCC"
One drawback of brewer.pal() is that the minimum value of n is three. If we want only one or two colors, we cannot extract the color codes directly. However, we may do the following:
themecol <- brewer.pal(n = 8, name = 'Pastel2')
themecol[1:2]
## [1] "#B3E2CD" "#FDCDAC"
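The extracted color codes can be passed to the col argument of the plotting functions. A minimal sketch, assuming RColorBrewer is installed and using its Dark2 palette (chosen here only because its colors are easier to see as points than the pastel ones):

```r
library(RColorBrewer)

species_cols <- brewer.pal(n = 3, name = "Dark2")      # one color per species
point_cols <- species_cols[as.integer(iris$Species)]   # one color per row

plot(iris$Petal.Length, iris$Petal.Width,
     col = point_cols, pch = 19)
legend("topleft", legend = levels(iris$Species),
       col = species_cols, pch = 19)
```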
Below are some common error messages we encounter when dealing with graphics.
After calling a new plot, we sometimes encounter this error message: Error in plot.new(): figure margins too large. This error occurs if the Plots panel is too small to fit the margins of the figure. One easy fix is to enlarge the panel.
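If the panel cannot be enlarged, another commonly used workaround is to shrink the margins with par(mar = ...) before plotting. The sketch below is an illustration under that assumption: the margin values are arbitrary, and it will not help if the panel is extremely small.

```r
old_par <- par(mar = c(4, 4, 2, 1))   # bottom, left, top, right margins, in lines of text
plot(cars)
par(old_par)                          # restore the previous margins
```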
Mac users may have encountered this warning message when using plot(): no font could be found for family “Arial”.
You may solve it by doing the following:
1. Go to Finder.
2. Search for Font Book and open it.
3. Look for the Arial font. If it is grayed out, turn it on.