CodingGirl

Reputation: 13

How to understand and deal with the "Duplicate vertex names" error in igraph

I have the following task: Build a directed graph representing the invites network: an individual A is connected to individual B if A invited B to the platform.

To solve it, I tried these lines of code:

    all.user <- dt.users[, list(name=unique(user_id), type=FALSE)]
    all.inviter <- dt.users[, list(name=unique(inviter_id), type=TRUE)]
    all.vertices <- rbind(all.user,all.inviter)
    g <- graph.data.frame(dt.users[, list(user_id, inviter_id)],
                      directed=TRUE,
                      vertices=all.vertices)
    g.invites <- bipartite.projection(g)$proj2

This produces the following error message:

    Error in graph.data.frame(dt.users[, list(user_id, inviter_id)], directed = TRUE, : Duplicate vertex names

Here is a sample of the data.table dt.users (only a few rows and the two relevant columns):

    user_id inviter_id
          4         NA
          5         NA
          6         NA
          7         NA
          8         NA
          9         NA
         10         NA
         11         NA
         12         NA
         13         NA
         14         NA
         15         NA
         16         NA
         17          4
         18          4

Because inviter_id holds the user_id of the inviter, the same ID can appear in both columns, so the combined vertex list contains duplicates. That is as far as I understand the problem.

However, I do not know how to eliminate it and would be grateful for anyone's help.

Upvotes: 1

Views: 1100

Answers (1)

ThomasIsCoding

Reputation: 102529

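I think the error is caused by vertices = all.vertices: it contains the name 4 twice, once with type = FALSE (as a user) and once with type = TRUE (as an inviter). You can try the code below, which keeps only the first row for each name and replaces NA with the string "NA" in both the edge list and the vertex list: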

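    g <- graph.data.frame(
      # edge list: replace missing inviters with the string "NA"
      dt.users[, list(user_id, replace(inviter_id, is.na(inviter_id), "NA"))],
      directed = TRUE,
      # vertex list: keep the first row per name, then rename NA to "NA" as well
      vertices = transform(all.vertices[!duplicated(all.vertices$name)], name = replace(name, is.na(name), "NA"))
    )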
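
To see where the duplicate comes from, here is a minimal sketch in base R. The data frame `users` is a hypothetical stand-in for dt.users that holds only four of the sample rows, and `dup.names` and `deduped` are names made up for this illustration.

```r
# Toy stand-in for dt.users (assumption: four of the sample rows from the question)
users <- data.frame(user_id = c(4, 5, 17, 18), inviter_id = c(NA, NA, 4, 4))

# Same construction as in the question, written with plain data frames
all.user     <- data.frame(name = unique(users$user_id), type = FALSE)
all.inviter  <- data.frame(name = unique(users$inviter_id), type = TRUE)
all.vertices <- rbind(all.user, all.inviter)

# Names that occur more than once in the vertex table
dup.names <- all.vertices$name[duplicated(all.vertices$name)]
print(dup.names)

# Keeping the first row per name removes the clash
# (a plain data frame needs the trailing comma to select rows)
deduped <- all.vertices[!duplicated(all.vertices$name), ]
print(anyDuplicated(deduped$name))
```

User 4 is listed once as a user and once as an inviter, which is exactly the kind of repeated name that the vertices argument rejects.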

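A possible alternative, offered as a sketch under assumptions rather than something tested on the full data: the task asks for an edge from A to B when A invited B, and graph.data.frame treats the first column as the source and the second as the target. It may therefore be simpler to drop the rows without an inviter, put inviter_id first, and give every person exactly one vertex row. The data frame `users` is a hypothetical stand-in for dt.users, and graph_from_data_frame is the current igraph name for graph.data.frame.

```r
library(igraph)

# Toy stand-in for dt.users (assumption: four of the sample rows from the question)
users <- data.frame(user_id = c(4, 5, 17, 18), inviter_id = c(NA, NA, 4, 4))

# Keep only rows with a known inviter; column order is (from, to) = (inviter, invitee)
edges <- users[!is.na(users$inviter_id), c("inviter_id", "user_id")]

# One vertex row per person: every ID seen in either column, without NA
ids <- c(users$user_id, users$inviter_id)
vertices <- data.frame(name = unique(ids[!is.na(ids)]))

g.invites <- graph_from_data_frame(edges, directed = TRUE, vertices = vertices)

print(vcount(g.invites))  # number of people
print(ecount(g.invites))  # number of invitations
```

With one vertex per person, the graph is already the invite network, so the bipartite.projection step from the question should not be needed in this variant. Users who neither invited nor were invited remain as isolated vertices.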
Upvotes: 1
