user3252148

Reputation: 153

Sankey Chart with networkD3 - Creating Links

I am trying to build a Sankey chart to visualize customer journeys on a website. My data has two fields: Session_ID and Page_Name. I capped the page depth at a maximum of 6 pages per session.

I was able to create the nodes, but not the links. Links have to be of the form (source, target, frequency). Below is my data structure:

test_data = data.frame(session = rep(1:4, each = 4),
                       page = c("a","b","c","d", "a","c","d","e","a","b","d","c","a","d","e","f"))

This should be the final data:

a,b,2
b,c,1
c,d,2
a,c,1
d,e,2
b,d,1
d,c,1
a,d,1
e,f,1

Upvotes: 0

Views: 169

Answers (1)

Marius

Reputation: 60130

You can do this with dplyr. Since the pages are in visit order within each session, lead() gives you the next page, and counting the (page, next_page) pairs gives the link frequencies:

library(dplyr)

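test_data %>%
    group_by(session) %>%                # keep transitions within a session
    mutate(next_page = lead(page)) %>%   # next page visited; NA on the last page
    ungroup() %>%
    filter(!is.na(next_page)) %>%        # drop the last page of each session
    count(page, next_page)               # one row per (source, target) pair, frequency in n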
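
To pass these counts to networkD3, the links usually need zero-based integer indices into a nodes data frame rather than page names. The following is a minimal sketch of that conversion, assuming the test_data from the question; the names links, nodes, value, source, target and chart are my own choices for illustration, not anything required by the question.

```r
library(dplyr)

test_data <- data.frame(
  session = rep(1:4, each = 4),
  page = c("a", "b", "c", "d",  "a", "c", "d", "e",
           "a", "b", "d", "c",  "a", "d", "e", "f"),
  stringsAsFactors = FALSE
)

# Transition counts, same logic as the pipeline above;
# the count column is named "value" here (an arbitrary choice)
links <- test_data %>%
  group_by(session) %>%
  mutate(next_page = lead(page)) %>%
  ungroup() %>%
  filter(!is.na(next_page)) %>%
  count(page, next_page, name = "value") %>%
  as.data.frame()

# One row per distinct page; the row order defines each node's index
nodes <- data.frame(name = unique(c(links$page, links$next_page)),
                    stringsAsFactors = FALSE)

# sankeyNetwork() expects zero-based positions into `nodes`
links$source <- match(links$page, nodes$name) - 1L
links$target <- match(links$next_page, nodes$name) - 1L

# Build the widget only if networkD3 is installed
if (requireNamespace("networkD3", quietly = TRUE)) {
  chart <- networkD3::sankeyNetwork(
    Links = links, Nodes = nodes,
    Source = "source", Target = "target",
    Value = "value", NodeID = "name"
  )
}
```

Note that this sample contains both c to d and d to c. Sankey layouts generally assume that links flow in one direction, so if the chart does not render cleanly, one common workaround is to make the nodes step-specific, for example by pasting the position within the session onto the page name before counting.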

Upvotes: 2
