Reputation: 2763
My data is linked through an Id, ParentId system, and I have managed to add correct integer levels. However, I would like to compose a function that automatically nests my 5-tiered hierarchy as a pathString for data.tree.
Structure:
Id Name ParentId ParentName Level
701F0000006Iw8E 'Paid Media' NA NA 1
701F0000006IS1t 'Bing ABC' 701F0000006Iw8Y 'Bing' 3
701F0000006IS28 'Bing DEF' 701F0000006Iw8Y 'Bing' 3
701F0000006IS23 'Bing GHI' 701F0000006Iw8Y 'Bing' 3
701F0000006Imq9 'Bing JKL' 701F0000006Iw8Y 'Bing' 3
701F0000006IS1y 'Bing MNO' 701F0000006Iw8Y 'Bing' 3
701F0000006Iw8Y 'Bing' 701F0000006Iw8E 'Paid Media' 2
701F0000006IvcW 'Google' 701F0000006Iw8E 'Paid Media' 2
7012A000006rhY8 'Adwords ABC' 701F0000006IvcW 'Google' 3
701F0000006IS1j 'Adwords DEF' 701F0000006IvcW 'Google' 3
701F0000006IS1o 'Adwords GHI' 701F0000006IvcW 'Google' 3
701F0000006IS1Z 'Adwords JKL' 701F0000006IvcW 'Google' 3
701F0000006Ieci 'Adwords MNO' 701F0000006IvcW 'Google' 3
Currently, the pathString I build only covers a single tier (the immediate parent and the node itself):
dat$pathString <- paste(dat$ParentId,
dat$Id,
sep = "/")
Ex.
"NA/701F0000000SOEq"
To populate the whole tree correctly, however, the string would need to include every ancestor of the node:
"NA/701F0000006Iw8E/701F0000006Iw8Y/701F0000006IS1t" for "Bing ABC"
Ideally, a single expression would work for all levels, but I understand if each level needs to be handled separately.
The full Id, ParentId system is here: Dropbox Link
Upvotes: 2
Views: 1319
Reputation: 4378
Although your question asks for a path string, the tree can be built directly from your data frame format.
library(data.tree)
dat <- read.table(text="
Id Name ParentId ParentName Level
701F0000006Iw8E 'Paid Media' NA NA 1
701F0000006IS1t 'Bing ABC' 701F0000006Iw8Y 'Bing' 3
701F0000006IS28 'Bing DEF' 701F0000006Iw8Y 'Bing' 3
701F0000006IS23 'Bing GHI' 701F0000006Iw8Y 'Bing' 3
701F0000006Imq9 'Bing JKL' 701F0000006Iw8Y 'Bing' 3
701F0000006IS1y 'Bing MNO' 701F0000006Iw8Y 'Bing' 3
701F0000006Iw8Y 'Bing' 701F0000006Iw8E 'Paid Media' 2
701F0000006IvcW 'Google' 701F0000006Iw8E 'Paid Media' 2
7012A000006rhY8 'Adwords ABC' 701F0000006IvcW 'Google' 3
701F0000006IS1j 'Adwords DEF' 701F0000006IvcW 'Google' 3
701F0000006IS1o 'Adwords GHI' 701F0000006IvcW 'Google' 3
701F0000006IS1Z 'Adwords JKL' 701F0000006IvcW 'Google' 3
701F0000006Ieci 'Adwords MNO' 701F0000006IvcW 'Google' 3
", header=TRUE, stringsAsFactors = FALSE)
# network build does not want a root node as a row, so adjust
# the given root to link to "tree_root"
dat$ParentId[is.na(dat$ParentId)] <- "tree_root"
# build the tree using the network layout - pairs of node ids
# in the first two columns. Remaining columns are node attributes
dat_network <- dat[, c("Id", "ParentId", "Name")]
dat_tree <- FromDataFrameNetwork(dat_network, check = "check")
print(dat_tree, 'Name')
#  1 tree_root
#  2  °--701F0000006Iw8E         Paid Media
#  3      ¦--701F0000006Iw8Y     Bing
#  4      ¦   ¦--701F0000006IS1t Bing ABC
#  5      ¦   ¦--701F0000006IS28 Bing DEF
#  6      ¦   ¦--701F0000006IS23 Bing GHI
#  7      ¦   ¦--701F0000006Imq9 Bing JKL
#  8      ¦   °--701F0000006IS1y Bing MNO
#  9      °--701F0000006IvcW     Google
# 10          ¦--7012A000006rhY8 Adwords ABC
# 11          ¦--701F0000006IS1j Adwords DEF
# 12          ¦--701F0000006IS1o Adwords GHI
# 13          ¦--701F0000006IS1Z Adwords JKL
# 14          °--701F0000006Ieci Adwords MNO
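If you do still want the pathString column itself, one possible approach is to walk the ParentId links upward until every row has reached the root. The following is a minimal base R sketch, not something data.tree provides: build_path_strings and its max_depth guard are hypothetical names introduced here for illustration, and the three-row data frame is just a subset of your data. Note that the path starts at the real root Id rather than at "NA".

```r
# Small subset of the question's data (same Id / ParentId layout).
dat <- data.frame(
  Id       = c("701F0000006Iw8E", "701F0000006Iw8Y", "701F0000006IS1t"),
  Name     = c("Paid Media", "Bing", "Bing ABC"),
  ParentId = c(NA, "701F0000006Iw8E", "701F0000006Iw8Y"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of data.tree): prepend one ancestor per
# iteration until no row has an ancestor left to add.
build_path_strings <- function(ids, parent_ids, sep = "/", max_depth = 100) {
  paths   <- ids          # each path starts with the node's own Id
  current <- parent_ids   # next ancestor to prepend for each row
  depth   <- 0
  while (any(!is.na(current))) {
    depth <- depth + 1
    if (depth > max_depth) stop("possible cycle in the Id/ParentId links")
    has_parent <- !is.na(current)
    paths[has_parent] <- paste(current[has_parent], paths[has_parent], sep = sep)
    # Step one level up; ancestors not found in `ids` become NA and end the walk.
    current <- parent_ids[match(current, ids)]
  }
  paths
}

dat$pathString <- build_path_strings(dat$Id, dat$ParentId)
dat$pathString[dat$Name == "Bing ABC"]
# [1] "701F0000006Iw8E/701F0000006Iw8Y/701F0000006IS1t"
```

The loop runs once per tier, so the same expression handles all five levels without treating each one separately. With data.tree loaded, the resulting data frame should then be usable with as.Node(dat), which looks for a pathString column by default.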
Upvotes: 3