tall_table

Reputation: 311

ordering and plotting by one variable conditional on a second

Task: I would like to reorder the levels of a factor variable by the difference in a numeric variable between the rows where a second variable equals 1 and the rows where it equals 0. Here is a reproducible example to clarify:

    # Package
    library(tidyverse)

    # Create fake data
    df1 <- data.frame(place = c("A", "B", "C"),
                      avg = c(3.4, 4.5, 1.8))

    # Plot, but it's not in order of value
    ggplot(df1, aes(x = place, y = avg)) +
      geom_point(size = 4)

    # Now put it in order
    df1$place <- factor(df1$place, levels = df1$place[order(df1$avg)])

    # Plots in order now
    ggplot(df1, aes(x = place, y = avg)) +
      geom_point(size = 4)

    # Adding second, conditional variable (called: new)
    df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                      new = rep(0:1, 3),
                      avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))

    ggplot(df2, aes(x = place, y = avg, col = factor(new))) +
      geom_point(size = 3)

Goal: I would like to order and plot the factor variable place by the difference in avg, within each place, between the row where new is 1 and the row where new is 0.

Upvotes: 0

Views: 342

Answers (3)

neilfws

Reputation: 33812

If I understand the goal correctly, then place A has the biggest difference:

avg(new = 0) - avg(new = 1) = 1.1

So you can spread the data frame to calculate the difference for each place, gather it back to long form, then plot avg versus place with place reordered by diff. To put A first, reorder by -diff instead.

But let me know if I didn't understand correctly :)

df2 %>% 
  spread(new, avg) %>% 
  mutate(diff = `0` - `1`) %>% 
  gather(new, avg, -diff, -place) %>% 
  ggplot(aes(reorder(place, diff), avg)) + 
    geom_point(aes(color = factor(new)), size = 3)
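
For readers on tidyr 1.0.0 or later, where `spread()` and `gather()` are superseded, the same spread-then-difference idea can be sketched with `pivot_wider()`. This is only a sketch under assumptions: the names `diffs`, `new_0`, and `new_1` are illustrative and not from the answer, and the data is the question's df2 with exactly one row per place and value of new.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Example data from the question
df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                  new = rep(0:1, 3),
                  avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))

# One row per place, with avg for new = 0 and new = 1 in separate columns
# (new_0 and new_1 are illustrative column names created by names_prefix)
diffs <- df2 %>%
  pivot_wider(names_from = new, values_from = avg, names_prefix = "new_") %>%
  mutate(diff = new_0 - new_1)

# Join the per-place difference back onto the long data and plot,
# with place reordered by increasing diff
p <- df2 %>%
  left_join(select(diffs, place, diff), by = "place") %>%
  ggplot(aes(reorder(place, diff), avg)) +
  geom_point(aes(color = factor(new)), size = 3) +
  labs(x = "place", color = "new")
p
```

Joining the differences back, rather than reshaping to long form again, keeps new in its original numeric type.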


Upvotes: 1

akuiper

Reputation: 215127

You can create the levels for the place column by:

library(tidyr)
df2$place <- factor(df2$place, levels=with(spread(df2, new, avg), place[order(`1` - `0`)]))

ggplot(df2, aes(x = place, y = avg, col = factor(new))) +
    geom_point(size = 3) + labs(color = 'new')
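
If tidyr is not available, a base-R variant of the same level ordering might look like the sketch below. It assumes each place has exactly two rows, one for new = 0 and one for new = 1; the name `place_diff` is illustrative and not from the answer.

```r
# Example data from the question
df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                  new = rep(0:1, 3),
                  avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))

# Sort so that, within each place, the new = 0 row comes before the new = 1 row
df2 <- df2[order(df2$place, df2$new), ]

# diff() on each place's two values gives avg(new = 1) - avg(new = 0)
place_diff <- tapply(df2$avg, df2$place, diff)

# Use the places, sorted by increasing difference, as the factor levels
df2$place <- factor(df2$place, levels = names(sort(place_diff)))
levels(df2$place)
```

The ggplot call above can then be used unchanged, since it only depends on the level order of place.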

With this ordering, place is sorted by avg(new = 1) - avg(new = 0), so A (difference -1.1) comes first.

Upvotes: 1

Mako212

Reputation: 7312

Calculate the column first using dplyr:

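library(dplyr)
library(ggplot2)

# diff(avg) within each place is avg(new = 1) - avg(new = 0), given rows ordered by new
df2 <- df2 %>% group_by(place) %>% mutate(diff = diff(avg)) %>% ungroup()

ggplot(df2, aes(x = place, y = diff, color = diff)) +
  geom_point(size = 3)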

Upvotes: 0
