User 2014

Reputation: 183

Calculate average points in each bin of a shot chart with R

I'm trying to make a shot chart in which the color gradient represents the average success rate in each bin.

The script below plots the number of shots in each bin. How can I change it so that the fill shows the average success (the mean of shot_made_flag) in each bin instead of the count? The chart the script currently produces is attached.

#rm(list=ls())
data3<-read.csv("data10.csv",header=T)

require(jpeg)
require(grid)
court<-rasterGrob(readJPEG("nba_court.jpg"),
                   width=unit(1,"npc"), height=unit(1,"npc"))

require(hexbin)
require(ggplot2)
ggplot(data3, aes(x=loc_x, y=loc_y)) + 
#  annotation_custom(court, -247, 253, -50, 418) +
  stat_binhex(bins = 18, colour = "gray", alpha = 0.8) +
  scale_fill_gradientn(colours = c("cyan","yellow","red")) +
  guides(alpha = "none", size = "none") +
  xlim(250, -250) +
  ylim(-52, 418) +
  geom_rug(alpha = 0.5) +
  coord_fixed() +
  ggtitle("Kobe Bryant shots") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 17, lineheight = 1.2, face = "bold"))

[Image: chart produced by the script above]

DATASET SAMPLE:

data3 <- data.frame(matrix(data=c(-98,-75,-119,83,10,-103,-191,69,196,-21,-106,-127,-180,50,125,200,34,45,99,120,108,184,102,206,113,-3,93,94,164,101,82,146,108,24,56,77,67,200,250,-45,1,0,0,0,1,1,0,0,0,0,1,1,0,1,0,1,1,0,0,1),
                nrow=20,ncol=3))
colnames(data3)<-c("loc_x","loc_y","shot_made_flag")

Upvotes: 2

Views: 134

Answers (1)

R. Schifini

Reputation: 9313

Use stat_summary_hex with fun = mean, and map shot_made_flag to the z aesthetic, to compute the success rate inside each bin:

# Create random data
set.seed(1)
data3 = data.frame(loc_x = runif(1000,-250,250), 
                   loc_y = rnorm(1000,230,50), 
                   shot_made_flag = rbinom(1000,1,.5))
library(hexbin)   # must be installed for the hex stats in ggplot2
library(ggplot2)

# Only the first two lines differ from the question:
# z = shot_made_flag is mapped in aes(), and stat_summary_hex(fun = mean) replaces stat_binhex
ggplot(data3, aes(x=loc_x, y=loc_y, z = shot_made_flag)) + 
  stat_summary_hex(fun = mean, bins = 18, colour = "gray", alpha = 0.8) +
  scale_fill_gradientn(colours = c("cyan","yellow","red")) +
  guides(alpha = "none", size = "none") +
  xlim(250, -250) +
  ylim(-52, 418) +
  geom_rug(alpha = 0.5) +
  coord_fixed() +
  ggtitle("Kobe Bryant shots") +
  theme(line = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        legend.title = element_blank(),
        plot.title = element_text(size = 17, lineheight = 1.2, face = "bold"))

Result: Mean inside each bin
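
To see what fun = mean reports: shot_made_flag is 0 or 1, so its mean within a bin is simply makes divided by attempts. The sketch below checks this in base R on the 20-row sample from the question. It uses coarse rectangular bins rather than hexagons, and the break points and the x_bin / y_bin names are made up for illustration; they are not what stat_summary_hex uses internally.

```r
# Sample from the question: 20 shots, flag is 1 for a make and 0 for a miss
shots <- data.frame(
  loc_x = c(-98, -75, -119, 83, 10, -103, -191, 69, 196, -21,
            -106, -127, -180, 50, 125, 200, 34, 45, 99, 120),
  loc_y = c(108, 184, 102, 206, 113, -3, 93, 94, 164, 101,
            82, 146, 108, 24, 56, 77, 67, 200, 250, -45),
  shot_made_flag = c(1, 0, 0, 0, 1, 1, 0, 0, 0, 0,
                     1, 1, 0, 1, 0, 1, 1, 0, 0, 1)
)

# Coarse rectangular bins (illustrative break points, not the hex grid)
shots$x_bin <- cut(shots$loc_x, breaks = c(-250, 0, 250))
shots$y_bin <- cut(shots$loc_y, breaks = c(-52, 100, 418))
bins <- list(shots$x_bin, shots$y_bin)

# Per-bin attempts, makes, and mean of the flag
attempts <- tapply(shots$shot_made_flag, bins, length)
makes    <- tapply(shots$shot_made_flag, bins, sum)
rate     <- tapply(shots$shot_made_flag, bins, mean)

# The mean of the 0/1 flag is the same thing as makes / attempts
stopifnot(isTRUE(all.equal(rate, makes / attempts)))
print(round(rate, 2))
```

stat_summary_hex does the same kind of grouping, only with hexagonal cells, and maps the per-cell value to fill.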

Edit: I rewrote the whole answer to use the new data and to show the desired output (the mean inside each hex cell).
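
One thing to watch with a per-cell mean: a hexagon containing a single shot can only show 0 or 1, so sparse cells look extreme. A possible workaround, sketched here as an assumption rather than as part of the original answer, is to pass a custom summary function that returns NA for cells with too few shots. The helper name mean_if_enough and the threshold of 5 are hypothetical. As far as I know, stat_summary_hex leaves out cells whose summary is NA by default (drop = TRUE), so those hexagons are simply not drawn.

```r
# Hypothetical helper: mean of z, or NA when the cell has fewer than min_n shots
mean_if_enough <- function(z, min_n = 5) {
  if (length(z) >= min_n) mean(z) else NA_real_
}

mean_if_enough(c(1, 0, 1))        # too few shots: NA
mean_if_enough(c(1, 0, 1, 1, 0))  # 0.6

# Plot only if the plotting packages are installed
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("hexbin", quietly = TRUE)) {
  library(ggplot2)

  # Same kind of random data as in the answer
  set.seed(1)
  data3 <- data.frame(loc_x = runif(1000, -250, 250),
                      loc_y = rnorm(1000, 230, 50),
                      shot_made_flag = rbinom(1000, 1, 0.5))

  p <- ggplot(data3, aes(x = loc_x, y = loc_y, z = shot_made_flag)) +
    stat_summary_hex(fun = mean_if_enough, fun.args = list(min_n = 5),
                     bins = 18, colour = "gray", alpha = 0.8) +
    scale_fill_gradientn(colours = c("cyan", "yellow", "red")) +
    coord_fixed()
  print(p)
}
```

Raising min_n hides more of the noisy cells at the cost of coverage; the right threshold depends on how many shots are in the data.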

Upvotes: 3
