sgdata

Reputation: 2763

data.tree nodes through Ids

My data is linked through an Id/ParentId system, and I have managed to add correct integer levels. However, I would like to write a function that automatically nests my 5-tier hierarchy into a pathString for data.tree.

Structure:

Id                 Name               ParentId           ParentName    Level
701F0000006Iw8E    'Paid Media'       NA                 NA            1
701F0000006IS1t    'Bing ABC'         701F0000006Iw8Y    'Bing'        3    
701F0000006IS28    'Bing DEF'         701F0000006Iw8Y    'Bing'        3
701F0000006IS23    'Bing GHI'         701F0000006Iw8Y    'Bing'        3
701F0000006Imq9    'Bing JKL'         701F0000006Iw8Y    'Bing'        3
701F0000006IS1y    'Bing MNO'         701F0000006Iw8Y    'Bing'        3
701F0000006Iw8Y    'Bing'             701F0000006Iw8E    'Paid Media'  2
701F0000006IvcW    'Google'           701F0000006Iw8E    'Paid Media'  2
7012A000006rhY8    'Adwords ABC'      701F0000006IvcW    'Google'      3
701F0000006IS1j    'Adwords DEF'      701F0000006IvcW    'Google'      3
701F0000006IS1o    'Adwords GHI'      701F0000006IvcW    'Google'      3
701F0000006IS1Z    'Adwords JKL'      701F0000006IvcW    'Google'      3
701F0000006Ieci    'Adwords MNO'      701F0000006IvcW    'Google'      3

Currently, my pathString only captures a single tier (the immediate parent) when built like this:

dat$pathString <- paste(dat$ParentId, 
      dat$Id, 
      sep = "/")

Ex.

 "NA/701F0000000SOEq"

To populate the whole tree correctly, the string would need to include every ancestor of the node:

 "NA/701F0000006Iw8E/701F0000006Iw8Y/701F0000006IS1t" for "Bing ABC"

Ideally, a single expression would work for all levels, but I understand if each level needs to be handled separately.

Full Id,ParentId system here: Dropbox Link

Upvotes: 2

Views: 1319

Answers (1)

Andrew Lavers

Reputation: 4378

Although your question asks for a path string, the tree can be built directly from your data frame format.

library(data.tree)
dat <- read.table(text="
Id                 Name               ParentId           ParentName    Level
701F0000006Iw8E    'Paid Media'       NA                 NA            1
701F0000006IS1t    'Bing ABC'         701F0000006Iw8Y    'Bing'        3
701F0000006IS28    'Bing DEF'         701F0000006Iw8Y    'Bing'        3
701F0000006IS23    'Bing GHI'         701F0000006Iw8Y    'Bing'        3
701F0000006Imq9    'Bing JKL'         701F0000006Iw8Y    'Bing'        3
701F0000006IS1y    'Bing MNO'         701F0000006Iw8Y    'Bing'        3
701F0000006Iw8Y    'Bing'             701F0000006Iw8E    'Paid Media'  2
701F0000006IvcW    'Google'           701F0000006Iw8E    'Paid Media'  2
7012A000006rhY8    'Adwords ABC'      701F0000006IvcW    'Google'      3
701F0000006IS1j    'Adwords DEF'      701F0000006IvcW    'Google'      3
701F0000006IS1o    'Adwords GHI'      701F0000006IvcW    'Google'      3
701F0000006IS1Z    'Adwords JKL'      701F0000006IvcW    'Google'      3
701F0000006Ieci    'Adwords MNO'      701F0000006IvcW    'Google'      3
", header = TRUE, stringsAsFactors = FALSE)

# The network builder does not want the root node as a row of its own,
# so point the given root at a placeholder parent, "tree_root".
dat$ParentId[is.na(dat$ParentId)] <- "tree_root"

# Build the tree from the network layout: the first two columns hold the
# pairs of node ids; any remaining columns become node attributes.
dat_network <- dat[, c("Id", "ParentId", "Name")]
dat_tree <- FromDataFrameNetwork(dat_network, check = "check")

print(dat_tree, 'Name')

# 1  tree_root                              
# 2   °--701F0000006Iw8E          Paid Media
# 3       ¦--701F0000006Iw8Y            Bing
# 4       ¦   ¦--701F0000006IS1t    Bing ABC
# 5       ¦   ¦--701F0000006IS28    Bing DEF
# 6       ¦   ¦--701F0000006IS23    Bing GHI
# 7       ¦   ¦--701F0000006Imq9    Bing JKL
# 8       ¦   °--701F0000006IS1y    Bing MNO
# 9       °--701F0000006IvcW          Google
# 10          ¦--7012A000006rhY8 Adwords ABC
# 11          ¦--701F0000006IS1j Adwords DEF
# 12          ¦--701F0000006IS1o Adwords GHI
# 13          ¦--701F0000006IS1Z Adwords JKL
# 14          °--701F0000006Ieci Adwords MNO
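
If you still want the pathString column the question asks for, one option is to walk up the ParentId chain for each row. The following is only a minimal base R sketch, not part of data.tree: `build_path` is a hypothetical helper name, the small data frame is a subset of the rows above, and it assumes the root is the row whose ParentId is NA. It does not add the leading "NA/" from the question's example; each path simply starts at the root Id.

```r
# Subset of the question's Id/ParentId table (root, two mid-level nodes, one leaf).
dat <- data.frame(
  Id       = c("701F0000006Iw8E", "701F0000006Iw8Y",
               "701F0000006IS1t", "701F0000006IvcW"),
  ParentId = c(NA, "701F0000006Iw8E",
               "701F0000006Iw8Y", "701F0000006Iw8E"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: follow ParentId upwards until a row with no parent
# (NA) is reached, prepending each ancestor to the path.
build_path <- function(id, ids, parents) {
  path <- id
  current <- id
  repeat {
    parent <- parents[match(current, ids)]
    if (is.na(parent)) break                      # reached the root
    if (parent %in% path) stop("cycle detected at ", parent)
    path <- c(parent, path)
    current <- parent
  }
  paste(path, collapse = "/")
}

# One path per row, regardless of how deep the row sits in the hierarchy.
dat$pathString <- vapply(dat$Id, build_path, character(1),
                         ids = dat$Id, parents = dat$ParentId,
                         USE.NAMES = FALSE)

print(dat$pathString)
```

Because the loop keeps climbing until it runs out of parents, the same expression covers all levels, so nothing needs to be handled per tier. With data.tree loaded, `as.Node(dat)` should then be able to pick up the `pathString` column, since every path begins at the same root Id.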

Upvotes: 3
