I want to sum all values in each row of a data frame after performing a dcast operation from the reshape2 package. The problem is that every value in the total column is the same (10), which is the sum of all rows combined. The values should be 4, 2, 4.
Example data with code:
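library(dplyr)
library(reshape2)

df <- data.frame(x = as.factor(c("A","A","A","A","B","B","C","C","C","C")),
                 y = as.factor(c("AA","AB","AA","AC","BB","BA","CC","CC","CC","CD")),
                 z = c("var1","var1","var2","var1","var2","var1","var1","var2","var2","var1"))

# count the rows for each x/y combination
df2 <- df %>%
  group_by(x, y) %>%
  summarise(num = n()) %>%
  ungroup()

# reshape to wide format, one column per level of y
df3 <- dcast(df2, x ~ y, value.var = "num", fill = 0)

# this is the problematic line: every row ends up with 10
df3$total <- sum(df3$AA,df3$AB,df3$AC,df3$BA,df3$BB,df3$CC,df3$CD)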
We can specify values_fn in pivot_wider and then use adorn_totals from janitor to add the row totals:
library(dplyr)
library(tidyr)
library(janitor)
df %>%
  pivot_wider(names_from = y, values_from = z, values_fill = 0,
              values_fn = length) %>%
  adorn_totals("col")
Output:
# x AA AB AC BB BA CC CD Total
# A 2 1 1 0 0 0 0 4
# B 0 0 0 1 1 0 0 2
# C 0 0 0 0 0 3 1 4
Or using base R, with xtabs and addmargins:
addmargins(xtabs(z ~ x + y, transform(df, z = 1)), 2)
# y
#x AA AB AC BA BB CC CD Sum
# A 2 1 1 0 0 0 0 4
# B 0 0 0 1 1 0 0 2
# C 0 0 0 0 0 3 1 4
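As a variation on the base R route, the sketch below counts the x/y combinations directly with a one-sided formula, so no helper column of ones is needed. It assumes the same x and y values as the question's df, rebuilt here without z so the block runs on its own; the names counts and with_totals are illustrative only.

```r
# Same x/y values as the question's df (z is not needed for counting)
df <- data.frame(
  x = c("A", "A", "A", "A", "B", "B", "C", "C", "C", "C"),
  y = c("AA", "AB", "AA", "AC", "BB", "BA", "CC", "CC", "CC", "CD")
)

# A one-sided formula makes xtabs count occurrences of each x/y pair
counts <- xtabs(~ x + y, data = df)

# margin = 2 appends a "Sum" column holding the total of each row
with_totals <- addmargins(counts, margin = 2)

print(with_totals)
```

The "Sum" column should match the 4, 2, 4 shown in the output above.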
sum gives you one combined value, and that single value is then repeated for every row:
sum(df3$AA,df3$AB,df3$AC,df3$BA,df3$BB,df3$CC,df3$CD)
#[1] 10
You need rowSums to get the sum of each row separately:
df3$total <- rowSums(df3[-1])
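To make the difference concrete, here is a minimal base R sketch. The data frame wide is a hypothetical stand-in for df3, typed in by hand from the counts in the question, so the block runs without reshape2. The last step is one possible alternative to df3[-1] that picks columns by type, assuming every numeric column should be part of the total.

```r
# Hypothetical stand-in for df3, built by hand from the question's counts
wide <- data.frame(
  x  = c("A", "B", "C"),
  AA = c(2, 0, 0),
  AB = c(1, 0, 0),
  AC = c(1, 0, 0),
  BA = c(0, 1, 0),
  BB = c(0, 1, 0),
  CC = c(0, 0, 3),
  CD = c(0, 0, 1)
)

# sum() collapses all count columns into one scalar
grand_total <- sum(wide[-1])        # 10

# rowSums() returns one value per row
row_totals <- rowSums(wide[-1])     # 4, 2, 4

# Selecting by type avoids relying on the id column being first
wide$total <- rowSums(wide[sapply(wide, is.numeric)])
```

Assigning grand_total to a column would recycle the scalar 10 into every row, which is exactly the symptom described in the question.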
Here is a simplified tidyverse approach starting from df:
library(dplyr)
library(tidyr)
df %>%
  count(x, y, name = 'num') %>%
  pivot_wider(names_from = y, values_from = num, values_fill = 0) %>%
  mutate(total = rowSums(select(., AA:CD)))
# x AA AB AC BA BB CC CD total
# <fct> <int> <int> <int> <int> <int> <int> <int> <dbl>
#1 A 2 1 1 0 0 0 0 4
#2 B 0 0 0 1 1 0 0 2
#3 C 0 0 0 0 0 3 1 4