firmo23

Reputation: 8404

Gather all the unique values from 2 columns into a new column

I have a data frame that lists all the connections between nodes in a network, and I want to create a new data frame named 'nodes' containing all the unique nodes. I'm trying to do something like

eids<-as.factor(d$from)
mids<-as.factor(d$to)
nodes<-data.frame(c(eids,mids))
nodes<-unique(nodes)

but when I try to create the graph I get the error "Some vertex names in edge list are not listed in vertex data frame", which means that some of my data is lost with this method. My dataset is quite large, so I put a toy dataset here.

from<-c(2,3,4,3,1,2)
to<-c(8,8,7,5,6,5)
d<-data.frame(from,to)

Upvotes: 0

Views: 48

Answers (1)

Darren Tsai

Reputation: 35554

First, to answer your question: you can use unique(stack(d)[1]) to get a one-column data frame that contains each of the values 1 to 8 exactly once (in order of first appearance, not sorted).
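As a minimal sketch on the toy data from the question, the one-liner can be unpacked as follows. The column name id and the object nodes2 are hypothetical names chosen for illustration; the second variant is an alternative that works on the raw vectors rather than on stack().

```r
# Toy data from the question
from <- c(2, 3, 4, 3, 1, 2)
to   <- c(8, 8, 7, 5, 6, 5)
d <- data.frame(from, to)

# stack() puts both columns into a single 'values' column (plus an 'ind'
# column recording which column each value came from); [1] keeps 'values'
nodes <- unique(stack(d)[1])
names(nodes) <- "id"   # hypothetical column name

# Alternative: combine the raw numeric vectors (not factors), then deduplicate
nodes2 <- data.frame(id = sort(unique(c(d$from, d$to))))
```

Both versions avoid as.factor() entirely, so the values themselves are combined rather than any internal factor codes.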


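Here is why your code doesn't work. Using c() to combine factors is dangerous: in R versions before 4.1.0 it drops the labels and returns only the underlying integer codes (from R 4.1.0 onward, c() on factors returns a factor whose levels are the union of the inputs' levels). You can try the following example: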

(x <- factor(c("A", "B", "C", "D")))
# [1] A B C D
# Levels: A B C D

(y <- factor(c("E", "F", "G", "H")))
# [1] E F G H
# Levels: E F G H

c(x, y)
# [1] 1 2 3 4 1 2 3 4

Under the hood, a factor is stored as integer codes, not as character strings. If you strip away its class, you will find an integer vector with an attribute named levels:

unclass(x)
# [1] 1 2 3 4
# attr(,"levels")
# [1] "A" "B" "C" "D"

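The integers are indices into levels: for each element, a factor stores the position of its label in the levels vector. Since x and y both have the codes 1 to 4, combining the bare codes makes the two factors indistinguishable.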
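To illustrate the point about codes and levels, here is a small sketch that should behave the same regardless of R version, since it never calls c() on the factors themselves. The name combined is a hypothetical one used only for this example.

```r
x <- factor(c("A", "B", "C", "D"))
y <- factor(c("E", "F", "G", "H"))

# The integer codes are identical for both factors, although the labels differ
as.integer(x)   # 1 2 3 4
as.integer(y)   # 1 2 3 4

# Indexing the levels by the codes recovers the labels
levels(x)[as.integer(x)]   # "A" "B" "C" "D"

# Converting to character first combines the labels, not the codes
combined <- c(as.character(x), as.character(y))
unique(combined)
```

Converting to character (or never converting to factor in the first place) is one way to keep every label when combining.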

Upvotes: 1
