Reputation: 9
I am an R novice working on a data frame 'damageData' in RStudio. Here is a brief summary of the data frame:
> str(damageData)
'data.frame': 902297 obs. of 9 variables:
$ EVTYPE : Factor w/ 985 levels " HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
$ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
$ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
$ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
$ PROPDMGEXP: num 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 ...
$ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
$ CROPDMGEXP: num 0 0 0 0 0 0 0 0 0 0 ...
$ Property : num 25000 2500 25000 2500 2500 2500 2500 2500 25000 25000 ...
$ Crops : num 0 0 0 0 0 0 0 0 0 0 ...
> head(damageData, 10)
    EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
1  TORNADO          0       15    25.0       1000       0          0
2  TORNADO          0        0     2.5       1000       0          0
3  TORNADO          0        2    25.0       1000       0          0
4  TORNADO          0        2     2.5       1000       0          0
5  TORNADO          0        2     2.5       1000       0          0
6  TORNADO          0        6     2.5       1000       0          0
7  TORNADO          0        1     2.5       1000       0          0
8  TORNADO          0        0     2.5       1000       0          0
9  TORNADO          1       14    25.0       1000       0          0
10 TORNADO          0        0    25.0       1000       0          0
   Property Crops
1     25000     0
2      2500     0
3     25000     0
4      2500     0
5      2500     0
6      2500     0
7      2500     0
8      2500     0
9     25000     0
10    25000     0
I want to group the data frame by EVTYPE. When I load the dplyr package and call group_by(EVTYPE) followed by summarize(TotalInjuries=sum(INJURIES), TotalFatalities=sum(FATALITIES)), the result is not grouped by EVTYPE. Instead, I get a single row of overall totals:
  TotalInjuries TotalFatalities
1        140528           15145
I tried converting EVTYPE from factor to character and still got the same result. Please help me troubleshoot this oddity!
Upvotes: 0
Views: 561
Reputation: 3635
It is hard to say exactly what is going on without a reproducible example. You might be using the dplyr syntax incorrectly. With a small made-up data frame, the following works for me:
# Small made-up data frame with the same column names as the question
damageData <- data.frame(
  EVTYPE     = factor(c("Y","N","Y","N","Y","N","Y","N","Y","N")),
  FATALITIES = c(0,0,0,0,0,0,0,0,1,0),
  INJURIES   = c(15,0,2,2,2,6,1,0,14,0))
str(damageData)

library(dplyr)

# Group by event type, then total injuries and fatalities within each group
damageData %>%
  group_by(EVTYPE) %>%
  summarize(TotalInjuries   = sum(INJURIES),
            TotalFatalities = sum(FATALITIES))
and I get the following output, with one row per EVTYPE level:
Source: local data frame [2 x 3]

  EVTYPE TotalInjuries TotalFatalities
1      N             8               0
2      Y            34               1
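One possible cause worth checking, which the question does not confirm: if the plyr package is attached after dplyr, plyr's summarize() masks dplyr's. The plyr version ignores the grouping and returns a single row of overall totals, which would match the output in the question. A minimal sketch, assuming that is the situation, calls the dplyr functions through their namespace so that masking cannot interfere:

```r
# Assumption: another package (for example plyr) may have masked summarize().
# Namespace-qualified calls guarantee that the dplyr versions are used.
damageData <- data.frame(
  EVTYPE     = factor(c("Y","N","Y","N","Y","N","Y","N","Y","N")),
  FATALITIES = c(0,0,0,0,0,0,0,0,1,0),
  INJURIES   = c(15,0,2,2,2,6,1,0,14,0))

grouped <- dplyr::group_by(damageData, EVTYPE)
totals  <- dplyr::summarize(grouped,
                            TotalInjuries   = sum(INJURIES),
                            TotalFatalities = sum(FATALITIES))
print(totals)

# Lists the attached packages that define summarize(), in search-path order.
# The first entry is the one a bare summarize() call would use.
print(find("summarize"))
```

If find() lists package:plyr ahead of package:dplyr, either write dplyr::summarize() explicitly or attach plyr before dplyr.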
Upvotes: 1