Reputation: 41
I am trying to solve this in R, but I can't seem to find the correct solution.
This is how my data looks:
Carrier Station Month TYSeats LYSeats
AAL BSB 6 10560 10560
AAL BSB 7 10912 10912
AAL BSB 8 10560 9328
AAL BSB 9 9152 7392
AAL BSB 10 9328 9152
AAL BSB 11 8976 10384
AAL BSB 12 10208 10912
AAL CNF 6 12122 12644
AAL CNF 7 12958 13516
AAL CNF 8 10868 10138
AAL CNF 9 5434 5614
AAL CNF 10 5434 7630
AAL CNF 11 8987 9241
AAL CNF 12 12122 12958
I am using this code:
aggregate((TYSeats-LYSeats)/LYSeats~Carrier+Station,data=df,FUN=mean)
The solution I would have expected would have looked something like this (which is (sum(TYSeats) - sum(LYSeats)) over sum(LYSeats)):
1 AAL BSB 0.015385
2 AAL CNF -0.053191
But I am getting this instead (it averages the per-month ratios within each group):
1 AA BSB 0.0270417328
2 AA CNF -0.0603483997
Is there a way to accomplish what I need in a simple line/command?
Thanks!
Upvotes: 2
Views: 1468
Reputation: 2861
library(dplyr)
df.new <- df %>%
  group_by(Carrier, Station) %>%
  mutate(Max = max(TYSeats, LYSeats),
         Min = min(TYSeats, LYSeats),
         Diff.per = Max/Min - 1)
With this you can see the positive percentage changes.
Upvotes: 0
Reputation: 13827
A simple and fast data.table solution.
library(data.table)
setDT(df)
df[ , .(PercentChange = sum(TYSeats - LYSeats)/sum(LYSeats)), by = .(Carrier, Station) ]
Upvotes: 1
Reputation: 887991
We can use dplyr
library(dplyr)
df1 %>%
group_by(Carrier, Station) %>%
summarise(PercentChange = (sum(TYSeats) - sum(LYSeats))/sum(LYSeats))
# Carrier Station PercentChange
# <chr> <chr> <dbl>
#1 AAL BSB 0.01538462
#2 AAL CNF -0.05319134
Upvotes: 2
Reputation: 2489
Probably worth noting that if it is actually the percentage you are after, you should multiply by 100. Using @Psidom's code:
ddply(df, .(Carrier, Station), summarise,
      PercentChange = (sum(TYSeats) - sum(LYSeats))/sum(LYSeats) * 100)
Carrier Station PercentChange
AAL BSB 1.538462
AAL CNF -5.319134
For example, 1/4 is 25%, but R prints it as a proportion:
> 1/4
[1] 0.25
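If the goal is only to display the result as a percentage, one option is to keep the proportion in the data and format it at the end. This is a minimal sketch using base R's sprintf; the vector name change and its values (copied from the outputs in this thread) are just for illustration.

```r
# Proportions as returned by the grouped calculation (values from this thread)
change <- c(BSB = 0.01538462, CNF = -0.05319134)

# Multiply by 100 and append a literal percent sign ("%%" escapes "%")
pct <- sprintf("%.2f%%", 100 * change)
print(pct)
```

Keeping the numeric column as a proportion and formatting only for display avoids mixing up 0.015 and 1.5 in later calculations.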
Upvotes: 0
Reputation: 215137
You can also use the ddply function from the plyr package:
library(plyr)
ddply(df, .(Carrier, Station), summarise,
      PercentChange = (sum(TYSeats) - sum(LYSeats))/sum(LYSeats))
  Carrier Station PercentChange
1 AAL BSB 0.01538462
2 AAL CNF -0.05319134
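For comparison, the same ratio of sums can probably be done without extra packages, staying close to the aggregate call in the question. This is a sketch only: the data frame is rebuilt from the sample rows in the question, and the name sums is an assumption made for illustration.

```r
# Rebuild the sample data from the question
df <- data.frame(
  Carrier = "AAL",
  Station = rep(c("BSB", "CNF"), each = 7),
  Month   = rep(6:12, times = 2),
  TYSeats = c(10560, 10912, 10560, 9152, 9328, 8976, 10208,
              12122, 12958, 10868, 5434, 5434, 8987, 12122),
  LYSeats = c(10560, 10912, 9328, 7392, 9152, 10384, 10912,
              12644, 13516, 10138, 5614, 7630, 9241, 12958)
)

# Step 1: sum both seat columns within each Carrier/Station group
sums <- aggregate(cbind(TYSeats, LYSeats) ~ Carrier + Station,
                  data = df, FUN = sum)

# Step 2: compute the change on the group totals, not on each month
sums$PercentChange <- (sums$TYSeats - sums$LYSeats) / sums$LYSeats

print(sums[, c("Carrier", "Station", "PercentChange")])
```

The original attempt differs because it computes the ratio per month first and then averages those ratios, which weights every month equally regardless of how many seats it had.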
Upvotes: 1