Reputation: 27
I would like to know how to use the group_by() function while keeping the other columns of my data.
My dataset consists of a lot of different information about different birds and I need to calculate frequencies of 7 behaviors for each day for each bird.
Let's say I subset my dataset for one individual:
alm18subset <- subset(df, df$individualID == "Almen18 (eobs 5861)")
I then have the data for only this individual.
I then want to know the frequencies of the behavior for each day in a new dataframe using group_by:
alm18freq <- alm18subset %>%
  group_by(agesincetaggingdays, behaviors) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
This gives me an output with the age, the behaviors, the number of times each behavior is carried out each day, and the frequency I'm looking for. My problem is that I lose all the other columns when using group_by() this way. I want to keep some information from my data, such as sex or the time a behavior was carried out, because I need it for modelling.
Any idea how to do this?
Then I also need to filter according to the behavior:
alm18active <- filter(alm18freq, rf8fitted == "Active")
Thank you!
Upvotes: 0
Views: 94
Reputation: 19211
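Replacing summarize with mutate keeps every row and column of alm18subset. Note that inside mutate the count n is repeated on every row of its group, so n / sum(n) would evaluate to 1 / n there; to get the daily frequency, regroup by day and divide by the number of rows for that day. The subsequent filter can be done right after this operation.
alm18freq_active <- alm18subset %>%
  group_by(agesincetaggingdays, behaviors) %>%
  mutate(n = n()) %>%                  # observations of this behavior on this day
  group_by(agesincetaggingdays) %>%
  mutate(freq = n / n()) %>%           # divide by all observations on this day
  filter(rf8fitted == "Active")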
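As a quick sanity check, here is a minimal sketch on made-up data. The data frame toy and its values are hypothetical, and it uses add_count(), a different dplyr helper from the one above, to attach the counts without collapsing rows. It only illustrates that the extra column (sex) survives and that the frequency is the share of each behavior within a day.

```r
library(dplyr)

# Hypothetical toy data: one bird, two days, plus an extra column to keep
toy <- data.frame(
  agesincetaggingdays = c(1, 1, 1, 1, 2, 2),
  behaviors = c("Active", "Active", "Resting", "Feeding", "Active", "Resting"),
  sex = "F"
)

toy_freq <- toy %>%
  # rows per day and behavior, added as column n
  add_count(agesincetaggingdays, behaviors, name = "n") %>%
  # rows per day, added as column n_day
  add_count(agesincetaggingdays, name = "n_day") %>%
  # share of each behavior within its day
  mutate(freq = n / n_day)

print(toy_freq)
```

All six rows and the sex column are still present, so the result can go straight into a model or a filter() call.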
Upvotes: 1
Reputation: 177
You can simply merge the two data frames you already have on their common columns; this adds the frequency column to the initial subset:
merge(alm18subset, alm18freq)
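To illustrate, here is a minimal sketch on made-up data. The names toy, toy_freq and toy_merged and the values are hypothetical; the column names are taken from the question. It assumes the two data frames share only the day and behavior columns, which are spelled out in by to make the join keys explicit.

```r
library(dplyr)

# Hypothetical toy data standing in for alm18subset
toy <- data.frame(
  agesincetaggingdays = c(1, 1, 1, 2, 2),
  behaviors = c("Active", "Active", "Resting", "Active", "Resting"),
  sex = "F"
)

# Summary table, built the same way as alm18freq in the question
toy_freq <- toy %>%
  group_by(agesincetaggingdays, behaviors) %>%
  summarise(n = n(), .groups = "drop_last") %>%
  mutate(freq = n / sum(n)) %>%
  ungroup()

# Attach n and freq to every original row
toy_merged <- merge(toy, toy_freq, by = c("agesincetaggingdays", "behaviors"))

print(toy_merged)
```

The merged result keeps one row per original observation, with n and freq repeated for each day and behavior pair.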
Upvotes: 1