Reputation: 25
I have created the following data frame:
df <- data.frame(names1=c('dog','dog','mouse','cat','dog','cat','cat','dog','mouse','mouse'), names2=c('cat','dog','cat','cat','cat','dog','dog','dog','mouse','cat'), values=c(11,5,41,25,101,78,12,41,6,77))
and this following matrix which yields:
my_matrix <- matrix(data=0,nrow = length(unique(df$names1)),ncol = length(unique(df$names2))) rownames(my_matrix) <- c('dog','mouse','cat') colnames(my_matrix) <- c('dog','mouse','cat')
dog | mouse | cat | |
---|---|---|---|
dog | 0 | 0 | 0 |
mouse | 0 | 0 | 0 |
cat | 0 | 0 | 0 |
Now I want two things.
the result should be:
dog | mouse | cat | |
---|---|---|---|
dog | 2 | 0 | 2 |
mouse | 0 | 1 | 2 |
cat | 2 | 0 | 1 |
Populate the empty matrix with the sums (where element intersect) coming from df$names1 and df$names2
the result should be:
dog | mouse | cat | |
---|---|---|---|
dog | 46 | 0 | 112 |
mouse | 0 | 6 | 118 |
cat | 90 | 0 | 25 |
I have tried something like
for(i in 1:3){ for(j in 1:3){ my_matrix[i,j] <- sum(df$names1 == df$names2) } }
print(my_matrix)
but dont get the desired results
your help is much appreciated, may thanks
update:
thanks for the may comments, but does anybody have an idea how to do it with the for loop. like looping through each element of the matrix and then assigning the value accordingly? so far I have tested something like
for(i in 1:nrow(my_matrix)){
for(j in 1:ncol(my_matrix)){
my_matrix[i,j] <- sum(df$values & df$names1[i] == df$names2[j] & df$names2[j] == df$names1[i])
}
}
print(my_matrix)
but it does not work as desired
Upvotes: 1
Views: 70
Reputation: 5887
updated answer with missing animals
original data
df <- data.frame(names1=c('dog','dog','mouse','cat','dog','cat','cat','dog','mouse','mouse'), names2=c('cat','dog','cat','cat','cat','dog','dog','dog','mouse','cat'), values=c(11,5,41,25,101,78,12,41,6,77))
we use data.table here and make it a data.table
library(data.table)
setDT(df)
define the animals that need to be included in the order you want
o <- c("cat", "dog", "mouse", "bear", "monkey")
now append your data with all animals and set their values to NA
df <- rbindlist(list(df, data.table(names1 = o, names2 = o, values = NA)))
to prevent counting NA's change the fun.aggregate
dcast(df, factor(names1, levels = o) ~ factor(names2, levels = o), fun.aggregate = function(x) length(na.omit(x)), value.var = "values")
names1 cat dog mouse bear monkey
1: cat 1 2 0 0 0
2: dog 2 2 0 0 0
3: mouse 2 0 1 0 0
4: bear 0 0 0 0 0
5: monkey 0 0 0 0 0
dcast(df, factor(names1, levels = o) ~ factor(names2, levels = o), fun.aggregate = function(x) sum(na.omit(x)), value.var = "values")
names1 cat dog mouse bear monkey
1: cat 25 90 0 0 0
2: dog 112 46 0 0 0
3: mouse 118 0 6 0 0
4: bear 0 0 0 0 0
5: monkey 0 0 0 0 0
my original answer
If the output being a matrix is not requirement, you can achieve it easily by going from "long to wide", I use data.table
here with dcast
.
library(data.table)
setDT(df)
dcast(df, names1 ~ names2, fun.aggregate = length)
# names1 cat dog mouse
# 1: cat 1 2 0
# 2: dog 2 2 0
# 3: mouse 2 0 1
dcast(df, names1 ~ names2, fun.aggregate = sum, value.var = "values")
# names1 cat dog mouse
# 1: cat 25 90 0
# 2: dog 112 46 0
# 3: mouse 118 0 6
updated example with ordering
o <- unique(df$names1)
dcast(df, factor(names1, levels = o) ~ factor(names2, levels = o), fun.aggregate = length)
dcast(df, factor(names1, levels = o) ~ factor(names2, levels = o), fun.aggregate = sum, value.var = "values")
Upvotes: 1