Ben

Reputation: 42343

Combine polar histogram with polar scatterplot

I want to draw a plot that combines a polar histogram (of compass bearing measurements) with a polar scatterplot (indicating the dip and bearing values). For example, this is what I would like to produce (source):

enter image description here

Let's ignore that the absolute values of the histogram bars are meaningless; the histogram is shown for comparison within the plot, not for reading exact values (this is a conventional plot in geology). The histogram's y-axis text is usually not shown in these plots.

The points show their bearing (angle from vertical) and dip (distance from centre). Dip is always between 0 and 90 degrees, and bearing is always between 0 and 360 degrees.

I can get some of the way, but I'm stuck with the mismatch between the scale of the histogram (in the example below, 0-20) and the scale of the scatterplot (always 0-90 because it's a dip measurement).

Here's my example:

n <-  100
bearing <- runif(min = 0, max = 360, n = n) 
dip <- runif(min = 0, max = 90, n = n)

library(ggplot2)
ggplot() +
  geom_point(aes(bearing,
                 dip),
             alpha = 0.4) +
  geom_histogram(aes(bearing),
                 colour = "black",
                 fill = "grey80") +
  coord_polar(start = 90 * pi/180) +
  scale_x_continuous(limits = c(0, 360),
                     breaks = (c(0, 90, 180, 270))) +
  theme_minimal(base_size = 14) +
  xlab("") +
  ylab("") +
  # theme() tweaks go after theme_minimal(), which would otherwise reset them
  theme(axis.text.x = element_text(size = 18),
        axis.text.y = element_blank())

enter image description here

If you look closely you can see a tiny histogram at the centre of the circle.

How can I auto-scale the histogram so that it looks like the plot at the top, with the highest bar equal to the radius of the circle (i.e. 90)?

Upvotes: 2

Views: 2323

Answers (2)

m-dz

Reputation: 2362

to_barplot could probably be built in a simpler way, but here is one approach that bins the bearings and rescales the bin counts to the 0-90 range:

library(Hmisc)
library(dplyr)

set.seed(2016)
n <-  100
bearing <- runif(min = 0, max = 360, n = n) 
dip <- runif(min = 0, max = 90, n = n)

# Linearly map x from [min_x, max_x] onto [a, b]
rescale_prop <- function(x, a, b, min_x = min(x), max_x = max(x)) {
  (b-a)*(x-min_x)/(max_x-min_x) + a
}

to_barplot <- bearing %>%
  cut2(cuts = seq(0, 360, 20)) %>%
  table(useNA = "no") %>%
  as.integer() %>%
  # min_x = 0 anchors the scale at zero, so the smallest count is not mapped to 0
  rescale_prop(0, 90, min_x = 0) %>%
  data.frame(x = seq(10, 350, 20),
             y = .)

library(ggplot2)
ggplot() +
  geom_bar(data = to_barplot,
           aes(x = x, y = y),
           colour = "black",
           fill = "grey80",
           stat = "identity") +
  geom_point(aes(bearing,
                 dip),
             alpha = 0.4) +
  # red circle marks the outer radius (dip = 90)
  geom_hline(aes(yintercept = 90), colour = "red") +
  coord_polar(start = 90 * pi/180) +
  scale_x_continuous(limits = c(0, 360),
                     breaks = (c(0, 90, 180, 270))) +
  theme_minimal(base_size = 14) +
  xlab("") +
  ylab("") +
  # theme() tweaks go after theme_minimal(), which would otherwise reset them
  theme(axis.text.x = element_text(size = 18),
        axis.text.y = element_blank())

Result:

result
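
To make the effect of min_x = 0 concrete, here is a small worked example of rescale_prop on three made-up bin counts (the counts 2, 5 and 10 are illustrative only, not taken from the data above). It is a minimal sketch of the arithmetic, nothing more:

```r
# Same helper as above: linearly map x from [min_x, max_x] onto [a, b]
rescale_prop <- function(x, a, b, min_x = min(x), max_x = max(x)) {
  (b - a) * (x - min_x) / (max_x - min_x) + a
}

counts <- c(2, 5, 10)  # hypothetical bin counts, for illustration only

# Anchored at zero: heights stay proportional to the counts
anchored <- rescale_prop(counts, 0, 90, min_x = 0)
print(anchored)    # 18 45 90

# Default min_x = min(x): the smallest bin collapses to a height of 0
unanchored <- rescale_prop(counts, 0, 90)
print(unanchored)  # 0 33.75 90
```

With min_x = 0 the tallest bar still reaches 90, and a bin with half the maximum count gets half the radius, which is what you want when the bars are only compared with each other.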

Upvotes: 2

thepule

Reputation: 1751

This is not a final solution, but I think it moves in the right direction. The issue here is that the scale of the histogram is quite different from the scale of the points. By scale I mean the max y value.

If you rescale the points, you can get this:

# bearing and dip as defined in the question; 9 is a hand-picked divisor
scaling <- dip / 9
ggplot()  +
    geom_point(aes(bearing,
                   scaling),
               alpha = 0.4) +
    geom_histogram(aes(bearing),
                   colour = "black",
                   fill = "grey80") +
    coord_polar(start = 90 * pi/180) +
    scale_x_continuous(limits = c(0, 360),
                       breaks = (c(0, 90, 180, 270))) +
    theme_minimal(base_size = 14) +
    xlab("") +
    ylab("") +
    # theme() tweaks go after theme_minimal(), which would otherwise reset them
    theme(axis.text.x = element_text(size = 18),
          axis.text.y = element_blank())

enter image description here

I arrived at the scaling number (9) heuristically. The next step is to find an algorithmic way of defining it, something like: take the max y value of the points and divide it by the max y value of the histogram.
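
One possible way to compute that number, as a rough sketch rather than a finished solution: count the bearings per bin with hist(), then divide the maximum possible dip by the tallest bin count. The 20-degree bin width, the use of 90 (the theoretical maximum dip) instead of max(dip), and the names breaks, counts, scale_factor and scaled_dip are all my assumptions here.

```r
library(ggplot2)

set.seed(2016)
n <- 100
bearing <- runif(n, min = 0, max = 360)
dip <- runif(n, min = 0, max = 90)

# Assumed bin layout: 18 bins of 20 degrees each
breaks <- seq(0, 360, by = 20)
counts <- hist(bearing, breaks = breaks, plot = FALSE)$counts

# Degrees of dip per unit of histogram count, chosen so that
# a dip of 90 lands at the height of the tallest bar
scale_factor <- 90 / max(counts)
scaled_dip <- dip / scale_factor

p <- ggplot() +
  # same breaks as hist(), so the drawn bars match the counts used above
  geom_histogram(aes(bearing), breaks = breaks,
                 colour = "black", fill = "grey80") +
  geom_point(aes(bearing, scaled_dip), alpha = 0.4) +
  coord_polar(start = 90 * pi/180) +
  scale_x_continuous(limits = c(0, 360), breaks = c(0, 90, 180, 270)) +
  theme_minimal(base_size = 14) +
  xlab("") +
  ylab("") +
  theme(axis.text.y = element_blank())
p
```

Using 90 rather than max(dip) keeps the outer edge of the circle at a dip of 90 even when no observed dip reaches it. The same factor could instead be applied to the bar heights (counts * scale_factor) if you prefer to leave the dip values untouched.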

Upvotes: 2
