Reputation: 1
I would like to add totals to my data frame but am having difficulties because the data is quite messy (as ever!) - some columns are text, some dates, some numeric. I can't post the actual data as it is sensitive but I will show a representative example instead which is same structure (below - needed columns are in yellow. I have been trying to do this with dplyr and pipes but run into problems due to mix of text and numbers....
data:
date <- c("17/08/2020", "17/08/2020", "17/08/2020", "17/08/2020","18/08/2020", "18/08/2020", "18/08/2020", "18/08/2020")
type <- c("type A", "type B", "type A", "type B","type A", "type B","type A", "type B")
location <- c("USA","USA","India","India","USA","USA","India","India")
value <- c("10","10","frak","frak","15","15","open","open")
df <- data.frame(date, type, location, value)
Basically, I need total summarised by date, by type and by location. enter image description here
Upvotes: 0
Views: 217
Reputation: 39595
I would suggest next approach, which is also close to the one proposed by @Humpelstielzchen, that is close to what you show in the image:
library(dplyr)
df %>% bind_rows(df %>% group_by(date,location) %>%
mutate(value=as.numeric(value)) %>%
summarise(value=sum(value,na.rm=F)) %>%
mutate(type='total type',value=as.character(value)))
Output:
date type location value
1 17/08/2020 type A USA 10
2 17/08/2020 type B USA 10
3 17/08/2020 type A India frak
4 17/08/2020 type B India frak
5 18/08/2020 type A USA 15
6 18/08/2020 type B USA 15
7 18/08/2020 type A India open
8 18/08/2020 type B India open
9 17/08/2020 total type India <NA>
10 17/08/2020 total type USA 20
11 18/08/2020 total type India <NA>
12 18/08/2020 total type USA 30
Update: Here an approach that could works because of OP'issues with version of package:
library(dplyr)
#Data
date <- c("17/08/2020", "17/08/2020", "17/08/2020", "17/08/2020","18/08/2020", "18/08/2020", "18/08/2020", "18/08/2020")
type <- c("type A", "type B", "type A", "type B","type A", "type B","type A", "type B")
location <- c("USA","USA","India","India","USA","USA","India","India")
value <- c("10","10","frak","frak","15","15","open","open")
df <- data.frame(date, type, location, value,stringsAsFactors = F)
#Mutate for summary
df1 <- df %>% group_by(date,location) %>%
mutate(value=as.numeric(value)) %>%
summarise(value=sum(value,na.rm=F)) %>%
mutate(type='total type') %>% ungroup()
df1$value <- as.character(df1$value)
#Bind
df2 <- rbind(df,df1)
Output:
date type location value
1 17/08/2020 type A USA 10
2 17/08/2020 type B USA 10
3 17/08/2020 type A India frak
4 17/08/2020 type B India frak
5 18/08/2020 type A USA 15
6 18/08/2020 type B USA 15
7 18/08/2020 type A India open
8 18/08/2020 type B India open
9 17/08/2020 total type India <NA>
10 17/08/2020 total type USA 20
11 18/08/2020 total type India <NA>
12 18/08/2020 total type USA 30
Upvotes: 0
Reputation: 6441
Not sure if that's what you're after.
df %>%
group_by(date, type = "total_type", location) %>%
summarise("value" = sum(as.numeric(value), na.rm = F)) %>%
mutate(value = as.character(value)) %>%
bind_rows(df)
# A tibble: 12 x 4
# Groups: date, type [6]
date type location value
<chr> <chr> <chr> <chr>
1 17/08/2020 total_type India NA
2 17/08/2020 total_type USA 20
3 18/08/2020 total_type India NA
4 18/08/2020 total_type USA 30
5 17/08/2020 type A USA 10
6 17/08/2020 type B USA 10
7 17/08/2020 type A India frak
8 17/08/2020 type B India frak
9 18/08/2020 type A USA 15
10 18/08/2020 type B USA 15
11 18/08/2020 type A India open
12 18/08/2020 type B India open
Grouping by all columns but value
reproduces your original table and in your image summarised rows have type = total_type
. On the other hand all your summarised rows in the image have location USA
which doesn't make sense either, so I just let that as it is.
Upvotes: 1