Reputation: 173
I have a dataset of the time each person spent on various projects, by month and category, something like this:
person | project | date | time
--------------------------------
A | a | Jan | 1
A | b | Jan | 2
A | c | Jan | 3
A | d | Feb | 1
B | a | Feb | 2
B | b | Feb | 3
B | c | Feb | 1
--------------------------------
I need a summary by person and date with the total time spent and the part of that time spent on one of the projects (let's say "a"), i.e.:
person | date | Total | project:a
--------------------------------
A | Jan | 6 | 1
A | Feb | 1 | 0
B | Jan | 0 | 0
B | Feb | 6 | 2
--------------------------------
I have a short piece of code that I found in several similar questions, but it doesn't give correct results:
data %>% group_by(person, date) %>% summarise(total = sum(time), `project:a` = sum(time[project == "a"]))
It calculates the total sum correctly, but not the conditional sum, which mostly returns NA. What could be the issue? Thanks.
Upvotes: 0
Views: 864
Reputation: 887901
We can use type_convert from readr:
library(dplyr)
library(readr)
df %>%
type_convert %>%
group_by(person, date) %>%
summarise(Total = sum(time), project_a = sum(time[project == "a"]))
Upvotes: 1
Reputation: 389275
Try using type.convert if you have factor columns.
df %>%
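type.convert(as.is = FALSE) %>%  # as.is = FALSE keeps strings as factors, which .drop = FALSE needs to retain empty groups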
group_by(person, date, .drop = FALSE) %>%
summarise(Total = sum(time), project_a = sum(time[project == "a"]))
# person date Total project_a
# <fct> <fct> <int> <int>
#1 A Feb 1 0
#2 A Jan 6 1
#3 B Feb 6 2
#4 B Jan 0 0
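For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample table from the question by hand; the name df and the use of stringsAsFactors = TRUE are assumptions made so that person and date are factors, which is what lets .drop = FALSE keep the empty B/Jan group. It also assumes dplyr 1.0 or later for the .groups argument.

```r
library(dplyr)

# Sample data copied from the question.
# stringsAsFactors = TRUE is an assumption: it makes person, project and
# date factors, so group_by(.drop = FALSE) can keep empty combinations.
df <- data.frame(
  person  = c("A", "A", "A", "A", "B", "B", "B"),
  project = c("a", "b", "c", "d", "a", "b", "c"),
  date    = c("Jan", "Jan", "Jan", "Feb", "Feb", "Feb", "Feb"),
  time    = c(1, 2, 3, 1, 2, 3, 1),
  stringsAsFactors = TRUE
)

result <- df %>%
  group_by(person, date, .drop = FALSE) %>%  # keep person/date pairs with no rows
  summarise(
    Total     = sum(time),                   # total time per person and month
    project_a = sum(time[project == "a"]),   # time spent on project "a" only
    .groups   = "drop"
  )

print(result)
```

The empty B/Jan group ends up with 0 in both columns because sum() of a zero-length vector is 0, so the result should have the same four rows as the output shown above.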
Upvotes: 3