cowboy

Reputation: 661

Combine multiple operations in R using dplyr

I'm using dplyr to retrieve and summarize data from a database. The code below works, but I would like to combine it all into a single statement. Is this possible?

Create datasets:

set.seed(1)
dbo_games <- data.frame(
  name = sample(c("Team1","Team2","Team3","Team4","Team5","Team6","Team7","Team8","Team9","Team10")),
  total_games = sample(1:10)
)

set.seed(1)
dbo_wins <- data.frame(
  name = sample(c("Team1","Team2","Team3","Team4","Team5","Team6","Team7","Team8","Team9","Team10")),
  total_wins = sample(c("yes", "no"), 10, replace = TRUE)
)
total_games <- con %>% tbl("dbo_games")
total_wins <- con %>% tbl("dbo_wins")

total <- total_games %>% filter(games > 12) %>%
  group_by(NAME) %>%
  summarise(total_games = n_distinct(game_id)) %>% collect()

wins <- total_wins %>% filter(win == 'Y') %>%
  group_by(NAME) %>%
  summarise(total_wins = n_distinct(game_id)) %>% collect()

# Join on the shared key explicitly rather than relying on the default natural join
perc_win <- total %>% left_join(wins, by = "NAME") %>%
  mutate(pct_won = total_wins / total_games)

This code works, but there is probably a more succinct way to get the same result. Any thoughts?

Upvotes: 0

Views: 581

Answers (1)

Sonny

Reputation: 3183

This would have been easier to answer if you had shared sample data and explained why you are doing it this way.

However, you can still chain the steps together as below:

total_games %>%
  filter(games > 12) %>%
  group_by(NAME) %>%
  summarise(total_games = n_distinct(game_id)) %>%
  left_join(
    total_wins %>%
      filter(win == 'Y') %>%
      group_by(NAME) %>%
      summarise(total_wins = n_distinct(game_id)),
    by = "NAME"
  ) %>%
  mutate(pct_won = total_wins / total_games)
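
To try the chained version without a database connection, here is a minimal sketch that uses small in-memory tibbles in place of the two tables. The data values are made up for illustration; the column names (NAME, game_id, games, win) are assumed from the queries in the question. It also fills in zero for teams that have no rows with win == 'Y', since the left join would otherwise leave total_wins as NA for them.

```r
library(dplyr)

# Hypothetical stand-ins for tbl(con, "dbo_games") and tbl(con, "dbo_wins");
# the values are invented, only the column names come from the question.
total_games <- tibble(
  NAME    = c("Team1", "Team1", "Team1", "Team2", "Team2"),
  game_id = c(1, 2, 3, 4, 5),
  games   = c(20, 20, 20, 15, 15)
)
total_wins <- tibble(
  NAME    = c("Team1", "Team1", "Team1", "Team2", "Team2"),
  game_id = c(1, 2, 3, 4, 5),
  win     = c("Y", "Y", "N", "N", "N")
)

perc_win <- total_games %>%
  filter(games > 12) %>%
  group_by(NAME) %>%
  summarise(total_games = n_distinct(game_id)) %>%
  left_join(
    total_wins %>%
      filter(win == "Y") %>%
      group_by(NAME) %>%
      summarise(total_wins = n_distinct(game_id)),
    by = "NAME"
  ) %>%
  mutate(
    # Teams with no wins have no row in the right-hand table, so the join gives NA
    total_wins = coalesce(total_wins, 0L),
    pct_won    = total_wins / total_games
  )

print(perc_win)
```

When the same pipeline runs against database tables, add collect() at the end to pull the result into R. Note that without an earlier collect() the division is done by the database, and some backends may apply integer division to two integer counts, so it is worth checking pct_won on a few rows.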

Upvotes: 1
