Reproducing sumproduct in R into matrix

Question

I have created the following data frame:

df <- data.frame(names1=c('dog','dog','mouse','cat','dog','cat','cat','dog','mouse','mouse'), names2=c('cat','dog','cat','cat','cat','dog','dog','dog','mouse','cat'), values=c(11,5,41,25,101,78,12,41,6,77))

and this following matrix which yields:

my_matrix <- matrix(data=0,nrow = length(unique(df$names1)),ncol = length(unique(df$names2))) rownames(my_matrix) <- c('dog','mouse','cat') colnames(my_matrix) <- c('dog','mouse','cat')

	dog	mouse	cat
dog	0	0	0
mouse	0	0	0
cat	0	0	0

Now I want two things.

Populate the empty matrix with the counts (number of element intersects) coming from df$names1 and df$names2

the result should be:

	dog	mouse	cat
dog	2	0	2
mouse	0	1	2
cat	2	0	1

Populate the empty matrix with the sums (where element intersect) coming from df$names1 and df$names2

the result should be:

	dog	mouse	cat
dog	46	0	112
mouse	0	6	118
cat	90	0	25

I have tried something like

for(i in 1:3){ for(j in 1:3){ my_matrix[i,j] <- sum(df$names1 == df$names2) } }

print(my_matrix)

but dont get the desired results

your help is much appreciated, may thanks

update:

thanks for the may comments, but does anybody have an idea how to do it with the for loop. like looping through each element of the matrix and then assigning the value accordingly? so far I have tested something like

for(i in 1:nrow(my_matrix)){
  for(j in 1:ncol(my_matrix)){
    
    my_matrix[i,j] <-  sum(df$values & df$names1[i] == df$names2[j] & df$names2[j] == df$names1[i]) 
  }
}

print(my_matrix)

but it does not work as desired

Merijn van Tilborg · Accepted Answer

updated answer with missing animals

original data

df <- data.frame(names1=c('dog','dog','mouse','cat','dog','cat','cat','dog','mouse','mouse'), names2=c('cat','dog','cat','cat','cat','dog','dog','dog','mouse','cat'), values=c(11,5,41,25,101,78,12,41,6,77))

we use data.table here and make it a data.table

library(data.table)
setDT(df)

define the animals that need to be included in the order you want

o <- c("cat", "dog", "mouse", "bear", "monkey")

now append your data with all animals and set their values to NA

df <- rbindlist(list(df, data.table(names1 = o, names2 = o, values = NA)))

to prevent counting NA's change the fun.aggregate

dcast(df, factor(names1, levels = o) ~ factor(names2, levels = o), fun.aggregate = function(x) length(na.omit(x)), value.var = "values")

   names1 cat dog mouse bear monkey
1:    cat   1   2     0    0      0
2:    dog   2   2     0    0      0
3:  mouse   2   0     1    0      0
4:   bear   0   0     0    0      0
5: monkey   0   0     0    0      0

dcast(df, factor(names1, levels = o) ~ factor(names2, levels = o), fun.aggregate = function(x) sum(na.omit(x)), value.var = "values")

   names1 cat dog mouse bear monkey
1:    cat  25  90     0    0      0
2:    dog 112  46     0    0      0
3:  mouse 118   0     6    0      0
4:   bear   0   0     0    0      0
5: monkey   0   0     0    0      0

my original answer

If the output being a matrix is not requirement, you can achieve it easily by going from "long to wide", I use data.table here with dcast.

library(data.table)

setDT(df)

dcast(df, names1 ~ names2, fun.aggregate = length)

#    names1 cat dog mouse
# 1:    cat   1   2     0
# 2:    dog   2   2     0
# 3:  mouse   2   0     1

dcast(df, names1 ~ names2, fun.aggregate = sum, value.var = "values")

#    names1 cat dog mouse
# 1:    cat  25  90     0
# 2:    dog 112  46     0
# 3:  mouse 118   0     6

updated example with ordering

o <- unique(df$names1)
dcast(df, factor(names1, levels = o) ~ factor(names2, levels = o), fun.aggregate = length)
dcast(df, factor(names1, levels = o) ~ factor(names2, levels = o), fun.aggregate = sum, value.var = "values")

Reproducing sumproduct in R into matrix

Answers (1)

Related Questions