Overplotting
Learn about overplotting and how to address it using some simple techniques.
The large mass of points near (0, 0) in the following figure can be confusing because it’s hard to tell how many points are actually plotted there. This is the result of a phenomenon called overplotting: points are drawn on top of each other over and over again, so many observations can look like a single point. There are two ways to address overplotting:
Adjusting the transparency of the points
Adding a little random jitter or random nudges to each of the points
Method 1: Changing the transparency
The first way of addressing overplotting is to change the transparency of the points by setting the alpha argument in geom_point(). We can set the alpha argument to any value between 0 and 1, where 0 makes the points 100% transparent and 1 makes the points 100% opaque. By default, alpha is set to 1. In other words, if we don’t explicitly set an alpha value, R will use alpha = 1.
Note that the code for this plot is identical to the code that created the scatterplot with overplotting (in the previous lesson), except that alpha = 0.2 is added to the geom_point() function.
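As a minimal sketch of such a call, the example below assumes a hypothetical data frame named delays, filled with simulated values that cluster near (0, 0). The data frame and its column names x and y are stand-ins for illustration only, not the data from the previous lesson.

```r
library(ggplot2)

# Hypothetical stand-in data (an assumption, not the lesson's dataset):
# 2000 simulated points clustered around (0, 0) so that they overlap heavily.
set.seed(42)
delays <- data.frame(
  x = rnorm(2000, mean = 0, sd = 10),
  y = rnorm(2000, mean = 0, sd = 10)
)

# alpha = 0.2 makes each point mostly transparent, so regions where many
# points overlap appear darker than regions with only a few points.
p <- ggplot(data = delays, mapping = aes(x = x, y = y)) +
  geom_point(alpha = 0.2)

# Printing the object draws the plot.
print(p)
```

Removing alpha = 0.2 from geom_point() falls back to the default of alpha = 1, which should reproduce the solid mass of points that makes overplotting hard to read.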