user12353086

How to create an adjacency matrix by counting co-appearances in a data frame?

I want to create a network in R. I have a data frame that looks like this. Say Alex has an apple and a banana, Brian has two apples and a peach, and John has...

Alex    Apple
Alex    Banana
Alex    Kiwi
Brian   Apple
Brian   Apple
Brian   Peach
John    Kiwi
John    Peach
John    Banana
Chris   Melon
Chris   Apple
...

I want to use this data frame to create an undirected network with the fruits as nodes. If one person has two different fruits, say John has a peach and a kiwi, then there is an edge between the nodes peach and kiwi. The weight of the edge is the number of people who have both of these fruits.

I'm thinking about creating an adjacency matrix first, but I don't know how to do it. If you have a better idea for building a network from this data frame, please give me a hint.

Upvotes: 0

Views: 234

Answers (1)

chinsoon12

Reputation: 25225

Since the OP does not show a desired output, and assuming that duplicate rows are to be removed, here is an option using combn in data.table:

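library(data.table)
library(igraph)
# drop duplicate rows, list every pair of fruits per person, then count each pair
edges <- unique(DT)[, if (.N > 1L) transpose(combn(Fruit, 2L, simplify=FALSE)), Person][, 
    .N, .(V1, V2)]
# directed=FALSE because the question asks for an undirected network
g <- graph_from_data_frame(edges, directed=FALSE)
# set_edge_attr() returns a modified copy, so the result must be assigned back
g <- set_edge_attr(g, "weight", value=edges$N)
plot(g)
#to check weights, use igraph::as_data_frame(g)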

edges:

       V1     V2 N
1:  Apple Banana 1
2:  Apple   Kiwi 1
3: Banana   Kiwi 1
4:  Apple  Peach 1
5:   Kiwi  Peach 1
6:   Kiwi Banana 1
7:  Peach Banana 1
8:  Melon  Apple 1
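
Note that rows 3 and 6 of the table above are the same pair in two orders (Banana/Kiwi from Alex, Kiwi/Banana from John), because combn pairs the fruits in the order they appear for each person. For an undirected network these should probably be one edge. A minimal sketch of one way to merge them, assuming it is acceptable to sort each person's fruits alphabetically before pairing (the names edges_sorted and weight are only illustrative):

```r
library(data.table)

# Same sample data as in the answer, built inline so the sketch is self-contained.
DT <- data.table(
  Person = c("Alex", "Alex", "Alex", "Brian", "Brian", "Brian",
             "John", "John", "John", "Chris", "Chris", "Andrew"),
  Fruit  = c("Apple", "Banana", "Kiwi", "Apple", "Apple", "Peach",
             "Kiwi", "Peach", "Banana", "Melon", "Apple", "Apple")
)

# Assumption: sorting the fruits within each person gives every pair one
# canonical (alphabetical) order, regardless of the row order in the data.
edges_sorted <- unique(DT)[
  , if (.N > 1L) transpose(combn(sort(Fruit), 2L, simplify = FALSE)), by = Person
][, .(weight = .N), by = .(V1, V2)]

print(edges_sorted)
```

With the sample data, Banana/Kiwi then appears once with a count of 2 (Alex and John) instead of as two rows with a count of 1 each.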

data:

library(data.table)
DT <- fread("Person Fruit
Alex    Apple
Alex    Banana
Alex    Kiwi
Brian   Apple
Brian   Apple
Brian   Peach
John    Kiwi
John    Peach
John    Banana
Chris   Melon
Chris   Apple
Andrew  Apple")
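
The question also asks about an adjacency matrix. As an alternative to the edge list, here is a rough sketch in base R, assuming the same sample data: build a Person x Fruit 0/1 table and take its cross-product, so that each off-diagonal cell counts the people who have both fruits. The names df, inc, adj and g_adj are only illustrative, and the igraph step is optional.

```r
# Same sample data as above, as a plain data frame.
df <- data.frame(
  Person = c("Alex", "Alex", "Alex", "Brian", "Brian", "Brian",
             "John", "John", "John", "Chris", "Chris", "Andrew"),
  Fruit  = c("Apple", "Banana", "Kiwi", "Apple", "Apple", "Peach",
             "Kiwi", "Peach", "Banana", "Melon", "Apple", "Apple")
)

# Person x Fruit incidence table; unique() drops Brian's duplicate Apple
# so that every cell is 0 or 1.
inc <- table(unique(df))

# t(inc) %*% inc: cell [i, j] counts the people who have both fruit i and fruit j.
adj <- crossprod(inc)

# The diagonal counts how many people have each fruit; zero it to avoid self-loops.
diag(adj) <- 0

print(adj)

# Optional: turn the matrix into a weighted undirected graph if igraph is installed.
if (requireNamespace("igraph", quietly = TRUE)) {
  g_adj <- igraph::graph_from_adjacency_matrix(adj, mode = "undirected",
                                               weighted = TRUE)
  print(igraph::E(g_adj)$weight)
}
```

Because the matrix is symmetric by construction, the order in which a person's fruits are listed does not matter here.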

Upvotes: 1
