Reputation: 866
I have several products in a market and the customers tend to switch between these products. I need to calculate the net gain/loss of customers switching between two products such that the dynamics can be visualised in a visNetwork graph.
A snippet of my dataset:
> dput(df)
structure(list(value = c(2.5, 5, 20, 113, 25, 43.5, 25.5, 2.5,
5, 22.5, 17.5, 32, 65, 7.5, 10, 45.5, 12.5, 10, 5, 37, 35, 20.5,
10, 5, 7.5), source = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L,
6L, 6L), .Label = c("A", "B", "C", "D", "E", "F"), class = "factor"),
target = structure(c(2L, 3L, 1L, 3L, 4L, 5L, 6L, 7L, 1L,
2L, 4L, 5L, 6L, 7L, 2L, 3L, 5L, 6L, 7L, 2L, 3L, 4L, 6L, 3L,
5L), .Label = c("A", "B", "C", "D", "E", "F", "G"), class = "factor")), .Names = c("value",
"source", "target"), row.names = c(NA, -25L), class = "data.frame")
> head(df,10)
value source target
1 2.5 A B
2 5.0 A C
3 20.0 B A
4 113.0 B C
5 25.0 B D
6 43.5 B E
7 25.5 B F
8 2.5 B G
9 5.0 C A
10 22.5 C B
Note that not every product necessarily loses or gains customers.
In the above dataset Product A loses 2.5 customers to Product B and Product B loses 20 customers to Product A. Product A would then have a net gain of 17.5 customers and Product B a net loss of 17.5 customers. I would like to make this calculation for all pairs of products using dplyr, since I make heavy use of dplyr in other parts of the analysis.
The resulting dataframe could have the following structure:
from to value
1 B A 17.5
Please disregard the fact that I have half customers :)
Upvotes: 2
Views: 114
Reputation:
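Using dplyr (new_value is the reverse flow minus the row's own value, with 0 used when there is no reverse flow):
library(dplyr)
mutate(data, new_value = apply(data, 1, function(vec) { max(data[data$source == vec[3] & data$target == vec[2], "value"], 0) }) - value)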
Using data.table:
library(data.table)
setDT(data)
data[, new_value := apply(data, 1, function(vec) { max(data[data$source == vec[3] & data$target == vec[2]]$value, 0) }) - value]
If you want to overwrite the original values and get a final result with the columns ordered target, source, value:
mutate(data, value = apply(data, 1, function(vec) { max(data[data$source == vec[3] & data$target == vec[2], "value"], 0) }) - value)[, c(3, 2, 1)]
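As an alternative to the row-wise apply, the same net flows can be sketched with a join in dplyr. This is only a sketch under a few assumptions: it uses a small subset of the question's rows, source and target are plain character columns rather than factors, and the names reverse, reverse_value and net are made up for illustration.

```r
library(dplyr)

# Subset of the question's flows (assumption: character columns, not factors)
df <- data.frame(
  value  = c(2.5, 5, 20, 113, 2.5, 5, 22.5),
  source = c("A", "A", "B", "B", "B", "C", "C"),
  target = c("B", "C", "A", "C", "G", "A", "B"),
  stringsAsFactors = FALSE
)

# The same flows with source and target swapped, so that each row of df
# can be matched to the flow in the opposite direction
reverse <- data.frame(
  source        = df$target,
  target        = df$source,
  reverse_value = df$value,
  stringsAsFactors = FALSE
)

net <- df %>%
  left_join(reverse, by = c("source", "target")) %>%
  # a pair with no flow in the opposite direction counts as 0
  mutate(net = value - coalesce(reverse_value, 0)) %>%
  # keep each pair once, in the direction customers move on balance
  filter(net > 0) %>%
  select(from = source, to = target, value = net)

print(net)
```

For the pair in the question this keeps a row from B to A with value 17.5. Pairs whose flows cancel exactly (A and C in this subset) are dropped by the filter. With the factor columns from the question's dput, converting source and target with as.character first is probably the safest way to avoid mismatched factor levels in the join.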
Upvotes: 1
Reputation: 1675
It does not make use of dplyr, but you could use acast to create a matrix and subtract one triangle from the other:
library("reshape2")
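# name the value column explicitly and fill missing source/target pairs with 0 instead of NA
df.mat <- acast(df, source ~ target, value.var = "value", fill = 0)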
df.mat.u <- df.mat[upper.tri(df.mat)]
df.mat.l <- df.mat[lower.tri(df.mat)]
df.mat.l - df.mat.u
For this to work the matrix must be square, with the same products as rows and columns, which it isn't in this case (G only appears as a target).
Using igraph you can get a symmetric matrix; see "Reconstruct symmetric matrix from values in long-form".
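A square matrix can also be built directly in base R by using one shared set of products for both rows and columns. The following is a rough sketch on a subset of the question's rows, not a tested solution for the full data; the names products, flows, net and result are assumptions for illustration. It subtracts the transpose instead of the two triangles, because upper.tri and lower.tri extract elements in column-major order, so the two vectors would not line up pair by pair.

```r
# Subset of the question's flows (assumption: character columns, not factors)
df <- data.frame(
  value  = c(2.5, 5, 20, 113, 2.5, 5, 22.5),
  source = c("A", "A", "B", "B", "B", "C", "C"),
  target = c("B", "C", "A", "C", "G", "A", "B"),
  stringsAsFactors = FALSE
)

# One shared set of products for rows and columns makes the matrix square
products <- sort(union(df$source, df$target))
flows <- matrix(0, nrow = length(products), ncol = length(products),
                dimnames = list(products, products))

# A two-column character matrix indexes by (row name, column name)
flows[cbind(df$source, df$target)] <- df$value

# net[i, j] is the net number of customers moving from product i to product j
net <- flows - t(flows)

# Back to long form, keeping only the direction with a positive net flow
idx <- which(net > 0, arr.ind = TRUE)
result <- data.frame(
  from  = products[idx[, 1]],
  to    = products[idx[, 2]],
  value = net[idx]
)
print(result)
```

Here net["B", "A"] is 17.5 and net["A", "B"] is -17.5, which matches the example in the question. The long-form result has from, to and value columns, the same layout as the data frame the question asks for.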
Upvotes: 0