Anonymous Economist
Anonymous Economist

Reputation: 25

Filter rows in relation to value held within the same column

I have a dataset containing the service level agreement (SLA) scores of various suppliers and want to use r/rmarkdown to generate a report each month that highlights the suppliers doing well and those who are underperforming.

A reprex of the data is included below. How can I filter the data based on whether the supplier achieves the benchmark score? I would like it to return the supplier names, SLAs, and scores for those entries where the corresponding benchmark score has not been achieved. "Is the score for this supplier and this SLA higher than the corresponding benchmark SLA score?"

supplier_name <- c("benchmark", "benchmark", "benchmark", "supplier1", "supplier1", "supplier1", "supplier2", "supplier2", "supplier2", "supplier3", "supplier3", "supplier3")
sla <- c("sla1", "sla2", "sla3", "sla1", "sla2", "sla3", "sla1", "sla2", "sla3", "sla1", "sla2", "sla3")
score <- c("100", "95", "100", "100", "100", "99", "100", "98", "100", "80", "82", "95")
df <- tibble(supplier_name, sla, score)

Is my data in a suitable shape? or would it be more appropriate to have a vector containing the benchmark scores to use in some comparison?

Thanks.

Upvotes: 0

Views: 62

Answers (2)

Peter
Peter

Reputation: 12699

If you wanted a graphical approach, you could try...

Note, with the data organised more rationally this could be simplified. It you've got lots of suppliers adding jitter to the points will help distinguish suppliers.

library(tibble)
library(dplyr)
library(ggplot2)


df_bmk <- 
  df %>% 
  filter(supplier_name == "benchmark")

df_sup <- 
  df %>% 
  filter(supplier_name != "benchmark") %>% 
  mutate(score = as.numeric(score))

ggplot(df_sup, aes(sla, score, colour = supplier_name))+
  geom_point()+
  geom_hline(data = df_bmk, aes(yintercept = as.numeric(score)))+
  facet_wrap(~sla, scale = "free_x")+
  labs(x = NULL)+
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank())

Created on 2021-09-13 by the reprex package (v2.0.0)

Upvotes: 1

walter
walter

Reputation: 528

I would split it into two data frames, so that you can join the benchmark onto each supplier.

library(dplyr)
benchmark <- df %>% filter(supplier_name == "benchmark")
suppliers <- df %>% filter(supplier_name != "benchmark")

suppliers %>% left_join(benchmark, by=c("sla"), suffix=c("", ".benchmark")) %>% filter(score < score.benchmark)

Upvotes: 1

Related Questions