Eric Dodson

Reputation: 15

Summarize values below threshold for graph in R

I am struggling with a problem manipulating data in R.

Consider the data set:

dat <- read.table(text="Color      Count
Red         550
Blue        309
Green       296
Purple       49
Yellow       36
Brown        19", header=TRUE)

I'd like to use ggplot to graph the set with a column of "Other" summing all values < 50. I'd end up with categories: Red, Blue, Green, Other (with count of 104).

I can filter the set to exclude counts < 50 but don't know how to either create a new row with the sum of the others or achieve it in another way. BTW, it would be completely acceptable to stack the "Other" bar with the counts from Purple, Yellow, and Brown.

Upvotes: 0

Views: 354

Answers (2)

IRTFM

Reputation: 263451

Let's assume you have a data frame like that named dat. Then rbind the rows with Count >= 50 to a list made up of the desired name and the sum of the remaining counts:

dat2 <- rbind( dat[dat$Count>=50, ], 
               list(Color="Other", Count=sum( dat[dat$Count<50, "Count"] ) ) )
dat2
  Color Count
1   Red   550
2  Blue   309
3 Green   296
4 Other   104

Then it's just:

ggplot( data=dat2, aes(x= Color, y=Count) )+geom_col()
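The question also says a stacked "Other" bar built from Purple, Yellow, and Brown would be acceptable. A minimal sketch of that variant, assuming ggplot2 is installed; the Group column is a name introduced here for illustration, not something from the original data:

```r
library(ggplot2)

dat <- read.table(text = "Color Count
Red 550
Blue 309
Green 296
Purple 49
Yellow 36
Brown 19", header = TRUE, stringsAsFactors = FALSE)

# Group (hypothetical helper column) is the bar position: the colour's
# own name, or "Other" when its Count is below the threshold of 50.
dat$Group <- ifelse(dat$Count < 50, "Other", dat$Color)

# geom_col() stacks rows that share an x value, so the "Other" bar is
# drawn as three segments (Purple, Yellow, Brown), one per fill colour.
p <- ggplot(dat, aes(x = Group, y = Count, fill = Color)) +
  geom_col()
p
```

Because the original rows are kept, no summing is needed; the stacking does it visually.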

Upvotes: 1

Jon Spring

Reputation: 66880

You could do this using forcats::fct_lump, but its threshold is a minimum share of the total (like 0.04), not an absolute number like 50. With this data the total is 1259, so 0.04 works out to about 50.4 and lumps Purple, Yellow, and Brown.

Then we can get the totals for the resulting categories using count() weighted by Count, now that a few rows have "Other" as their color.

Finally we can plot it. The default colors will not be related to the name of the Color, so here I put them in manually.

library(tidyverse)
dat %>% 
  mutate(Color = fct_lump(Color, w = Count, prop = 0.04)) %>%
  count(Color, wt = Count, name = "Count") %>%
  ggplot(aes(x = 1, y = Count, fill = Color)) +
  geom_col() +
  geom_text(aes(label = Count), position = position_stack(vjust = 0.5)) +
  scale_fill_manual(values = c("blue", "green", "red", "gray70"))
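If an absolute cutoff is preferred over a share of the total, the lumping step can be swapped for a plain comparison. This is only a sketch of that alternative, assuming dplyr is installed and that Color is read as character (forced here with stringsAsFactors = FALSE); the name lumped is introduced for illustration:

```r
library(dplyr)

dat <- read.table(text = "Color Count
Red 550
Blue 309
Green 296
Purple 49
Yellow 36
Brown 19", header = TRUE, stringsAsFactors = FALSE)

lumped <- dat %>%
  # Relabel every colour whose Count is below the absolute threshold
  mutate(Color = if_else(Count < 50, "Other", Color)) %>%
  # Sum Count within each (possibly relabelled) colour
  count(Color, wt = Count, name = "Count")

lumped
```

The result can be piped into the same ggplot() call as above; only the way "Other" is assigned changes.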
  


Upvotes: 1
