Qu Cheng
Qu Cheng

Reputation: 41

Make a tree in R with parents

I am trying to make a tree in R and calculate the distance between 2 nodes.

The data frame to make the tree is like:

tree.source <- data.frame(ID = 1:10, parentID = c(NA,1,1,1,2,2,2,3,4,4))
#ID parentID  
#1    NA  
#2     1  
#3     1  
#4     1  
#5     2  
#6     2  
#7     2   
#8     3  
#9     4  
#10    4

And hope to create a tree structure like this

enter image description here

In addition, I also want to get the distance between 2 nodes. For example, the distance between Node 5 and 10 here is 4, through 5-2-1-4-10. There are 4 edges to link them. And the distance between Node 2 and 8 is 3, through 2-1-3-8.

The tree can be build by using data.tree package with paths for each node, for example, the PathString for Node 10 should be given as 1/4/10, but the PathString can be very long when the number of levels increase. Is there a better way to build the tree?

Upvotes: 1

Views: 804

Answers (2)

Sandipan Dey
Sandipan Dey

Reputation: 23129

Try this (with your given tree.source):

library(igraph)
g <- graph.data.frame(tree.source[-1,2:1], directed = FALSE)
plot(g)

enter image description here

# do a bfs with root as source, you will get distance of each vertex from root
bfs(g, root=1, "out", dist=TRUE)$dist
# 1  2  3  4  5  6  7  8  9 10 
# 0  1  1  1  2  2  2  2  2  2 

# shortest path from 5 to 10
sp <- unlist(shortest_paths(g, 5, 10, mode="out")$vpath)
sp 
# 5  2  1  4 10 
# 5  2  1  4 10 

# distance from 5 to 10 = # vertices on the path - 1
length(sp)-1
# [1] 4

# shortest paths from source node 5 to all
sp_from_5 <- shortest_paths(g, 5, mode="out")$vpath
names(sp_from_5) <- names(V(g))
sp_from_5

# output
$`1`
+ 3/10 vertices, named:
[1] 5 2 1

$`2`
+ 2/10 vertices, named:
[1] 5 2

$`3`
+ 4/10 vertices, named:
[1] 5 2 1 3

$`4`
+ 4/10 vertices, named:
[1] 5 2 1 4

$`5`
+ 1/10 vertex, named:
[1] 5

$`6`
+ 3/10 vertices, named:
[1] 5 2 6

$`7`
+ 3/10 vertices, named:
[1] 5 2 7

$`8`
+ 5/10 vertices, named:
[1] 5 2 1 3 8

$`9`
+ 5/10 vertices, named:
[1] 5 2 1 4 9

$`10`
+ 5/10 vertices, named:
[1] 5  2  1  4  10

Upvotes: 2

Qu Cheng
Qu Cheng

Reputation: 41

The tree can be generated by using:

tree <- as.Node(tree.source[-1,],mode = "network")

as.Node function can generate a tree with a network, which has the first column as "from" and the second as "to" and the following column as the attributes.

And distance(g, 2, 8) can give the distance between Node 2 and 8.

Upvotes: 3

Related Questions