Fassold

Reputation: 17

How do I calculate an average rate from a table?

Compute a table but with rates computed over 1999-2001. Keep only rows from 1999-2001 where players have 100 or more plate appearances, calculate each player's single rate and BB rate per season, then calculate the average single rate (mean_singles) and average BB rate (mean_bb) per player over those three seasons.

How many players had a single rate mean_singles of greater than 0.2 per plate appearance over 1999-2001?

library(tidyverse) 
library(Lahman)  

bat_02 <- Batting %>% filter(yearID %in% c("1999","2000","2001")) %>%
    mutate(pa = AB + BB, singles = (H - X2B - X3B - HR)/pa, bb = BB/pa) %>%
    filter(pa >= 100) %>%
    select(playerID, singles, bb)
        
bat_02 <- bat_02 %>% filter(singles > .2)
nrow(bat_02)

I filtered the table so that it contains only players with 100 or more plate appearances in a season from 1999-2001. I then kept the rows where singles is greater than 0.2. The code above gives an output of 133, which is not correct. Is there a mistake in my code?

Upvotes: 0

Views: 783

Answers (3)

DnLusho

Reputation: 1

The following worked perfectly for me:

  1. How many players had a single rate mean_singles of greater than 0.2 per plate appearance over 1999-2001?
library(tidyverse)  # provides %>%, filter, mutate, group_by, summarize
library(Lahman)

bat_02 <- Batting %>% filter(yearID == 2002) %>%
    mutate(pa = AB + BB, singles = (H - X2B - X3B - HR)/pa, bb = BB/pa) %>%
    filter(pa >= 100) %>%
    select(playerID, singles, bb)

bat_99_01 <- Batting %>% filter(yearID %in% 1999:2001) %>%
    mutate(pa = AB + BB, singles = (H - X2B - X3B - HR)/pa, bb = BB/pa) %>%
    filter(pa >= 100) %>%
    group_by(playerID) %>%
    summarize(mean_singles = mean(singles), mean_bb = mean(bb))
sum(bat_99_01$mean_singles > 0.2)

# The result:
[1] 46

  2. How many players had a BB rate mean_bb of greater than 0.2 per plate appearance over 1999-2001?
sum(bat_99_01$mean_bb > 0.2)

# Answer:

[1] 3

Upvotes: 0

kurianoff

Reputation: 21

Here's the code that properly computes the required averages:

library(Lahman)
library(dplyr)  # provides %>%, filter, mutate, group_by, summarize

# Compute required averages for years 1999-2001 (yearID is an integer column)
averages <- Batting %>% filter(yearID %in% 1999:2001) %>%
  mutate(pa = AB + BB, singles = (H - X2B - X3B - HR)/pa, bb = BB/pa) %>%
  filter(pa >= 100) %>%
  group_by(playerID) %>%
  summarize(mean_singles = mean(singles), mean_bb = mean(bb)) %>%
  select(playerID, mean_singles, mean_bb)

# Select mean_singles and mean_bb higher than 0.2 as required by the task
averages %>% filter(mean_singles > 0.2) %>% nrow(.)
averages %>% filter(mean_bb > 0.2) %>% nrow(.)

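The key is the summarize() call after group_by(playerID): it collapses each player's qualifying seasons into one mean per player. Your code skipped this step, so filter(singles > .2) followed by nrow() counted individual player-season rows rather than players.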
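
To make the difference concrete, here is a minimal sketch on made-up data (the toy table and its values are hypothetical, not taken from Lahman). It contrasts counting rows above the threshold with counting players whose per-player mean is above the threshold.

```r
library(dplyr)

# Hypothetical player-season single rates (illustration only, not Lahman data)
toy <- tibble(
  playerID = c("a", "a", "a", "b", "b", "c"),
  yearID   = c(1999, 2000, 2001, 1999, 2000, 2001),
  singles  = c(0.25, 0.22, 0.10, 0.21, 0.23, 0.15)
)

# Counting rows: every player-season above 0.2 is counted separately
n_seasons <- toy %>% filter(singles > 0.2) %>% nrow()

# Counting players: average per player first, then compare the mean to 0.2
per_player <- toy %>%
  group_by(playerID) %>%
  summarize(mean_singles = mean(singles))
n_players <- sum(per_player$mean_singles > 0.2)

n_seasons  # 4 player-seasons exceed 0.2
n_players  # only player "b" has a mean above 0.2
```

Player "a" has two seasons above 0.2 but a three-season mean of 0.19, so the row count overstates the number of qualifying players.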

Upvotes: 0

Edward

Reputation: 19169

This is my take on the question.

library(Lahman)
library(dplyr)

str(Batting)

Batting %>% 
  #Compute a table but with rates computed over 1999-2001.
  filter(yearID %in% c("1999","2000","2001")) %>%

  #Keep only rows from 1999-2001 where players have 100 or more plate appearances
  mutate(pa = AB + BB) %>%
  filter(pa >= 100) %>%

  #calculate each player's single rate and BB rate per season
  group_by(playerID, yearID) %>%
  summarise(singles = (H - X2B - X3B - HR)/pa, bb = BB/pa) %>%

  #then calculate the average single rate (mean_singles) and average BB rate (mean_bb) per player over those three seasons.
  group_by(yearID) %>%
  summarise(mean_single=mean(singles), mean_bb=mean(bb))

# A tibble: 3 x 3
  yearID mean_single mean_bb
   <int>       <dbl>   <dbl>
1   1999       0.137  0.0780
2   2000       0.140  0.0765
3   2001       0.132  0.0634

Or perhaps the question wanted just the overall rates:

  #then calculate the average single rate (mean_singles) and average BB rate (mean_bb) per player over those three seasons.
  ungroup() %>%
  summarise(mean_single=mean(singles, na.rm=TRUE), mean_bb=mean(bb, na.rm=TRUE))
# A tibble: 1 x 2
  mean_single mean_bb
        <dbl>   <dbl>
1       0.136  0.0726
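
For comparison, a small sketch on made-up data (the toy table is hypothetical) of how the grouping key decides what gets averaged: grouping by yearID yields one mean per season across players, while grouping by playerID yields one mean per player across seasons, which may be closer to what the exercise's mean_singles refers to.

```r
library(dplyr)

# Hypothetical per-season single rates for two players (illustration only)
toy <- tibble(
  playerID = c("a", "a", "b", "b"),
  yearID   = c(1999, 2000, 1999, 2000),
  singles  = c(0.10, 0.20, 0.30, 0.20)
)

# One row per season: average over all players in that year
by_year <- toy %>%
  group_by(yearID) %>%
  summarise(mean_single = mean(singles))

# One row per player: average over that player's seasons
by_player <- toy %>%
  group_by(playerID) %>%
  summarise(mean_singles = mean(singles))

by_year
by_player
```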

Upvotes: 3
