In R, I am trying to turn each row of my data into two rows. Both rows would repeat the shared data (the connection ID and the note), but the name column in each would hold a different value, taken from one of two existing columns ("From" or "To").
I am trying to set this data up to be utilized in Tableau and create a network visualization. I don't want my customers entering the data to insert a lot of duplicate data in order to create this visualization.
My current data looks like this:
  Connection.ID            From           To                                         Note
1             1 Niamh MacCallum James Fraser                Niamh and James are coworkers
2             2    James Fraser  Simon David                 James and Simon are brothers
3             3 Niamh MacCallum   Tom Ashton Niamh recruited Tom to join her organization
This is some fake data I created that replicates my company's data, but the goal is being able to visualize connections between our employees and customers/volunteers they meet and form professional relationships with.
I would like my data to look like this, which I would then export to a CSV:
  Connection.ID       Node.Name                                        Notes
1             1 Niamh MacCallum                Niamh and James are coworkers
2             1    James Fraser                Niamh and James are coworkers
3             2    James Fraser                 James and Simon are brothers
4             2     Simon David                 James and Simon are brothers
5             3 Niamh MacCallum Niamh recruited Tom to join her organization
6             3      Tom Ashton Niamh recruited Tom to join her organization
I've found a couple of resources that do something similar, the best one being a previously asked question (conditionally duplicating rows in a data frame), but it didn't quite get me what I needed, or I may honestly have been misapplying it. I thought I could get the same result by removing the "To" column and renaming "From" to "Node.Name", but I ended up with six copies of each row, and the notes were attached to the wrong connections.
I'd appreciate any help! I'm fairly new to R and self-taught, so if you have a solution or a resource where I can learn the solution that'd be great too. Thanks!
EDIT: Found a similar question I had not seen before, so I am adding it here in case someone else finds this and can reference both of them: Create network files from "classic" dataframe in R - igraph
Reputation: 9313
This is a wide-to-long transformation that can be done with melt from the reshape2 package. Do:
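library(reshape2)

# Stack the From and To columns into a single Node.Name column,
# repeating Connection.ID and Note for each stacked row
df2 <- melt(data = df,
            id.vars = c("Connection.ID", "Note"),
            measure.vars = c("From", "To"),
            variable.name = "From_To",
            value.name = "Node.Name")
# Remove the unwanted From_To column
df2$From_To <- NULL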
Result:
> df2
  Connection.ID                                         Note       Node.Name
1             1                Niamh and James are coworkers Niamh MacCallum
2             2                 James and Simon are brothers    James Fraser
3             3 Niamh recruited Tom to join her organization Niamh MacCallum
4             1                Niamh and James are coworkers    James Fraser
5             2                 James and Simon are brothers     Simon David
6             3 Niamh recruited Tom to join her organization      Tom Ashton
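The result above has all the From rows first and then all the To rows, with Note ahead of Node.Name, whereas the question shows the two rows of each connection next to each other. If that layout matters for the export, here is a minimal base R sketch (no reshape2 needed) that stacks the two name columns, sorts by Connection.ID, and writes a CSV. The data frame is rebuilt from the question's sample data; the names long and out_file, and the output file name, are placeholders I made up for illustration.

```r
# Sample data copied from the question
df <- data.frame(
  Connection.ID = 1:3,
  From = c("Niamh MacCallum", "James Fraser", "Niamh MacCallum"),
  To   = c("James Fraser", "Simon David", "Tom Ashton"),
  Note = c("Niamh and James are coworkers",
           "James and Simon are brothers",
           "Niamh recruited Tom to join her organization"),
  stringsAsFactors = FALSE
)

# Stack the From rows on top of the To rows, using the column
# names from the desired output in the question
long <- rbind(
  data.frame(Connection.ID = df$Connection.ID, Node.Name = df$From,
             Notes = df$Note, stringsAsFactors = FALSE),
  data.frame(Connection.ID = df$Connection.ID, Node.Name = df$To,
             Notes = df$Note, stringsAsFactors = FALSE)
)

# order() leaves ties in their original order, so within each
# connection the From row stays ahead of the To row
long <- long[order(long$Connection.ID), ]
rownames(long) <- NULL

# Hypothetical output path; point this wherever the CSV should go
out_file <- file.path(tempdir(), "connections_long.csv")
write.csv(long, out_file, row.names = FALSE)
```

The same order() and write.csv() steps should work on df2 from the melt call as well, after reordering its columns with something like df2[, c("Connection.ID", "Node.Name", "Note")].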