moadeep

Reputation: 4108

Scaling data in R data frame and fitting gaussian to geom_point

I have two questions based on my data.frame:

structure(list(Collimator = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L), .Label = c("n", "y"), class = "factor"), angle = c(0L, 
15L, 30L, 45L, 60L, 75L, 90L, 105L, 120L, 135L, 150L, 165L, 180L, 
0L, 15L, 30L, 45L, 60L, 75L, 90L, 105L, 120L, 135L, 150L, 165L, 
180L), X1 = c(2099L, 11070L, 17273L, 21374L, 23555L, 23952L, 
23811L, 21908L, 19747L, 17561L, 12668L, 6008L, 362L, 53L, 21L, 
36L, 1418L, 6506L, 10922L, 12239L, 8727L, 4424L, 314L, 38L, 21L, 
50L), X2 = c(2126L, 10934L, 17361L, 21301L, 23101L, 23968L, 23923L, 
21940L, 19777L, 17458L, 12881L, 6051L, 323L, 40L, 34L, 46L, 1352L, 
6569L, 10880L, 12534L, 8956L, 4418L, 344L, 58L, 24L, 68L), X3 = c(2074L, 
11109L, 17377L, 21399L, 23159L, 23861L, 23739L, 21910L, 20088L, 
17445L, 12733L, 6046L, 317L, 45L, 26L, 46L, 1432L, 6495L, 10862L, 
12300L, 8720L, 4343L, 343L, 38L, 34L, 60L), average = c(2099.6666666667, 
11037.6666666667, 17337, 21358, 23271.6666666667, 23927, 23824.3333333333, 
21919.3333333333, 19870.6666666667, 17488, 12760.6666666667, 
6035, 334, 46, 27, 42.6666666667, 1400.6666666667, 6523.3333333333, 
10888, 12357.6666666667, 8801, 4395, 333.6666666667, 44.6666666667, 
26.3333333333, 59.3333333333)), .Names = c("Collimator", "angle", 
"X1", "X2", "X3", "average"), row.names = c(NA, -26L), class = "data.frame")

I wish to plot detector counts versus angle, with and without a collimator attached to the device. I think geom_point is probably the best way to summarise the data:

library(ggplot2)
p <- ggplot(df, aes(x=angle,y=average,col=Collimator)) + geom_point() + geom_line()

Instead of plotting the average count on the y-axis, I would prefer to rescale the data so that the angle with the maximum count has a value of 1 for both collimator y and n. The way I have done this seems quite cumbersome:

range01 <- function(x){(x-min(x))/(max(x)-min(x))}
coly = subset(df,Collimator=='y')
coly$norm_count = range01(coly$average)
coln = subset(df,Collimator=='n')
coln$norm_count = range01(coln$average)
df = rbind(coln,coly)
p <- ggplot(df, aes(x=angle,y=norm_count,col=Collimator)) + geom_point() + geom_line()

I'm sure this can be done in a more efficient manner, applying the function to the data.frame based on the variable 'Collimator'. How can I do this?

I also want to fit a function to the data rather than using geom_line. I think a Gaussian function may work in this case, but I have no idea how (or whether) I can implement this in stat_smooth. Can I also pull out the mean and standard deviation from such a fit?

Upvotes: 0

Views: 244

Answers (2)

Brian Diggs

Reputation: 58845

joran's answer scales the highest value to 1 and the lowest to 0; if you just want to scale so that the highest value is 1 (leaving 0 as 0), it is even simpler.

library("plyr")
df <- ddply(df, .(Collimator), transform,
            norm.average = average / max(average))

Then the plot is

ggplot(df, aes(x=angle,y=norm.average,col=Collimator)) + 
  geom_point() + geom_line()
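
For the Gaussian part of the question, one option is nls() from base R's stats package. The following is only a minimal sketch, assuming the Collimator == "y" averages from the question scaled to a peak of 1; the model form and the start values (a = 1, mu = 90, sigma = 20) are rough guesses read off the plot, not anything established in this thread.

```r
# Collimator == "y" rows from the question, scaled so the peak is 1
angle   <- seq(0, 180, by = 15)
average <- c(46, 27, 42.6666666667, 1400.6666666667, 6523.3333333333,
             10888, 12357.6666666667, 8801, 4395, 333.6666666667,
             44.6666666667, 26.3333333333, 59.3333333333)
coly <- data.frame(angle = angle, norm.average = average / max(average))

# Gaussian model: amplitude a, centre mu, standard deviation sigma.
# The start values are assumptions chosen by eye from the plotted data.
fit <- nls(norm.average ~ a * exp(-(angle - mu)^2 / (2 * sigma^2)),
           data  = coly,
           start = list(a = 1, mu = 90, sigma = 20))

# Fitted mean and standard deviation of the Gaussian
coef(fit)[c("mu", "sigma")]
```

To draw such a fit in ggplot2 instead of geom_line(), stat_smooth() accepts method = "nls" with the formula written in terms of x and y, the start values passed through method.args, and se = FALSE (nls predictions do not supply standard errors). The collimator "n" curve is broader and flatter on top, so a Gaussian may describe it less well and would need its own start values.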


Upvotes: 1

joran

Reputation: 173627

ggplot2 goes hand in hand with the package plyr:

df <- ddply(df,.(Collimator),
            transform,
            norm_count1 = (average - min(average)) / (max(average) - min(average)) )
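
If you would rather not load plyr, the same grouped rescaling can be sketched in base R with ave(). The toy data frame below is a hypothetical stand-in for the one in the question (the values are illustrative only); range01 is the helper defined in the question.

```r
# Toy data standing in for the question's data frame (illustrative values)
df <- data.frame(
  Collimator = factor(rep(c("n", "y"), each = 3)),
  average    = c(10, 20, 30, 1, 3, 5)
)

# Min-max rescaling helper, as defined in the question
range01 <- function(x) (x - min(x)) / (max(x) - min(x))

# ave() applies FUN within each level of the grouping variable and
# returns a vector aligned with the original row order
df$norm_count1 <- ave(df$average, df$Collimator, FUN = range01)
```

Because ave() keeps the original row order, there is no need to split the data frame and rbind the pieces back together.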

Upvotes: 2
