Reputation: 683
Let's say I have the following unweighted, undirected graph where edges can be connected by two different types of edges: support edges (green) and opposition edges (red).
Here's an example:
I want to calculate the "distance" of opposition or support between any given two nodes. For example, if the nodes represented countries at war or political candidates, even though A and D have no edge between them we might conclude that they are likely to be opposed to one another since A is opposed to C and C supports D.
This is a simple example, but given a large graph with many nodes of high degree, how might I determine how likely any two nodes might be opposed to or supporting one another if they cannot be directly connected by a successive chain of opposition/support edges?
I imagine you'd represent each node as a vector whose components where whether an edge of a type exist between any other nodes. If this is a good way to go, what distance measure would you use (Euclidean, Hamming, etc?)
Upvotes: 2
Views: 163
Reputation: 21106
Upvotes: 0
Reputation: 32637
This problem looks as if it needs numerical optimization. Here is an approach:
Introduce a random variable for every node. This variable will be in the range [-1, 1], where -1 means clear opposition and +1 means clear support. Values in between give you the probabilities for either. Fix the variable for the node you're interested in to 1 (hence, it will not be part of the optimization).
Now, define a potential function on the edges. I would suggest the absolute difference:
v1, v2 are the incident nodes' values
for supporting edges:
P(v1, v2) = abs(v1 - v2)
for opposing edges:
P(v1, v2) = abs(v1 + v2)
Depending on your optimization method, you might need a differentiable potential function. You could for example make these functions differentiable with:
for supporting edges:
P(v1, v2) = sqrt((v1 - v2)^2 + epsilon)
for opposing edges:
P(v1, v2) = sqrt((v1 + v2)^2 + epsilon)
where epsilon
is a rather small number.
The sum of all potentials is the energy:
E = Sum P_i
Then, you need to find arg min E
with respect to v
. There are several optimizers. Just try, which one performs best (e.g. gradient descent, simulated annealing, L-BFGS...). If you stick to convex potentials (like the absolute difference), simple gradient-descent is probably enough.
This gives you a support value for every other node. If edges do not contradict each other, all values will be either +1 or -1. If you have contradicting edges, other values are possible as well.
Your example would result in these support values (with respect to A):
B: -0.333 (probably opposing)
C: 0.333 (probably supporting)
D: 0.333 (probably supporting)
Upvotes: 0