Reputation: 691
I have a data frame, df, like this:
df = data.frame(w = c('CT','CT','CT','CT','CT','CT'), x = c('PF','PF','MF','MF','AF','AF'), y = sample(letters, 6), z = seq(1:6))
It is already grouped by w and y. I want to regroup by x, but only where x is PF or MF. Where x is AF, I need to keep y as a grouping variable; for the PF and MF rows, y can be NA or some other unique value. The summary function is the sum of z, so the final data frame would be:
w x y z
CT PF NA 3
CT MF NA 7
CT AF s 5
CT AF h 6
I am using dplyr and tried group_by(Flyway %in% c('MF','PF')), but that only adds a new TRUE/FALSE column. Maybe I should be looking outside dplyr? Thanks.
Upvotes: 1
Views: 1353
Reputation: 887851
We could also use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df)); for rows where 'x' is not 'AF', assign (:=) NA to 'y'; then, grouped by 'w', 'x', and 'y', get the sum of 'z'.
library(data.table)
setDT(df)[x != 'AF', y := NA_character_][, .(z = sum(z)), by = .(w, x, y)]
# w x y z
#1: CT PF NA 3
#2: CT MF NA 7
#3: CT AF b 5
#4: CT AF o 6
NOTE: The values in the 'y' column differ from the question because no seed was set when constructing the dataset.
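Since sample() gives different letters on every run, a reproducible variant of the same idea may be easier to check. The following is only a sketch: the fixed 'y' values are hypothetical stand-ins for sample(letters, 6), stringsAsFactors = FALSE is assumed so that 'y' is a character column, and as.data.table() is used in place of setDT() so that df itself is not modified by reference.

```r
library(data.table)

# Hypothetical fixed 'y' values; the question used sample(letters, 6)
df <- data.frame(
  w = rep("CT", 6),
  x = c("PF", "PF", "MF", "MF", "AF", "AF"),
  y = c("a", "b", "c", "d", "s", "h"),
  z = 1:6,
  stringsAsFactors = FALSE  # assumption: keep 'y' as character, not factor
)

dt <- as.data.table(df)            # works on a copy; df stays unchanged
dt[x != "AF", y := NA_character_]  # blank out 'y' wherever x is not 'AF'

# Sum 'z' within each (w, x, y) group; by= keeps groups in order of appearance
res_dt <- dt[, .(z = sum(z)), by = .(w, x, y)]
print(res_dt)
```

With this data, the PF and MF rows collapse to one row each (z = 3 and z = 7), while the two AF rows keep their own 'y' and stay separate (z = 5 and z = 6).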
Upvotes: 1
Reputation: 70336
You could change y first, then group the data and compute the sum of z:
df %>%
ungroup %>%
mutate(y = replace(y, x != "AF", NA)) %>%
group_by(w, x, y) %>%
summarise(z = sum(z)) %>%
ungroup()
#Source: local data frame [4 x 4]
#
# w x y z
# (fctr) (fctr) (fctr) (int)
#1 CT AF h 5
#2 CT AF l 6
#3 CT MF NA 7
#4 CT PF NA 3
Or, a little shorter:
df %>%
group_by(w, x, y = replace(y, x != "AF", NA)) %>%
summarise(z = sum(z)) %>%
ungroup()
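For anyone who wants to run the shorter version end to end, here is a self-contained sketch. The fixed 'y' values are hypothetical (the question used sample(letters, 6)), and the .groups = "drop" argument of summarise() is assumed to be available (dplyr 1.0.0 or later); it stands in for the trailing ungroup().

```r
library(dplyr)

# Hypothetical fixed 'y' values so the result is reproducible
df <- data.frame(
  w = rep("CT", 6),
  x = c("PF", "PF", "MF", "MF", "AF", "AF"),
  y = c("a", "b", "c", "d", "s", "h"),
  z = 1:6,
  stringsAsFactors = FALSE
)

res <- df %>%
  # Overwrite 'y' with NA for non-AF rows while defining the groups
  group_by(w, x, y = replace(y, x != "AF", NA)) %>%
  # Sum 'z' per group and return an ungrouped result
  summarise(z = sum(z), .groups = "drop")

print(res)
```

The result should have four rows: one each for PF and MF with y = NA, and two for AF, one per original 'y' value.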
Upvotes: 3