Reputation: 584
I am planning to use this on network data.
My network has two kinds of edges. I have written a function that returns the in-degree for each edge type separately. Here is what its output looks like:
Node G_obs R_obs
1 N1 3 2
2 N2 1 0
3 N3 9 0
4 N4 1 4
5 N5 1 0
...
I also wrote another function which samples the network edges. Here is what the output looks like after sampling:
Node G_obs R_obs
1 N1 4 1
2 N2 1 0
3 N3 3 6
4 N4 3 2
5 N5 1 0
...
Note that G_obs + R_obs, i.e. the in-degree of the node, stays the same.
I'd like to know the p-value, for each node, of seeing the originally observed in-degree split between G_obs and R_obs.
EDIT: Sorry, this was a little too unclear. I don't want the row-wise probability of the observed distribution. I want the probability of the observed G_obs, R_obs split for each node, where sample(G_obs) + sample(R_obs) still has the same sum per node as before. I should consult a native English speaker for better wording next time. Hope I have described the problem more clearly now :(
EDIT 2
observation:
Node G_obs R_obs
1 N1 3 2
2 N2 1 0
3 N3 9 0
4 N4 1 4
5 N5 1 0
As you can see, N1 has 5 in-edges: 3 of them are green (G_obs) and 2 of them are red (R_obs).
For the 5 nodes shown, we have 15 green edges and 6 red edges in total. Now we 'sample' all green and all red edges, i.e. redistribute them within their assigned column, but at the same time N1 still has 5 edges (see the example sampling above, where N1 becomes the following):
Node G_obs R_obs
1 N1 4 1
...
I already have a function which provides the 'sampling' correctly (placeholder for this: mySample(graph)) and need a function which takes mySample, uses it e.g. 1000 times, and calculates how likely the original observation was for each node.
Any help appreciated. Thank you.
Upvotes: 0
Views: 118
Reputation: 9123
It sounds like you are after a binomial probability (the probability that randomly dividing the edges between the two types will yield the same distribution as originally observed).
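As a quick illustration of what that probability means, here is the arithmetic for node N1 (5 in-edges, 3 of them green) worked out by hand and checked against R's built-in binomial density. The equal chance of 0.5 per edge type is an assumption of this sketch, not something stated in the question.

```r
# Node N1 from the question: 5 in-edges, 3 of them green.
n_edges <- 5
n_green <- 3

# Binomial probability by hand: choose(n, k) * p^k * (1 - p)^(n - k),
# ASSUMING each edge is green with probability 0.5.
p_green <- 0.5
by_hand <- choose(n_edges, n_green) * p_green^n_green * (1 - p_green)^(n_edges - n_green)
by_hand  # 10 * 0.5^5 = 0.3125

# The same value from the built-in density function.
from_dbinom <- dbinom(n_green, size = n_edges, prob = p_green)
all.equal(by_hand, from_dbinom)
```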
You can compute these probabilities using the dbinom() function:
transform(
df,
prob_same = dbinom(G_obs, G_obs + R_obs, prob = .5)
)
Data:
df <- read.table(
text = "
Node G_obs R_obs
N1 3 2
N2 1 0
N3 9 0
N4 1 4
N5 1 0
",
header = TRUE
)
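If the probability should instead come from repeated resampling, as the question describes, a rough sketch could look like the following. Since mySample(graph) is not shown, this uses a hypothetical stand-in, my_sample_stub(), built on r2dtable() from base R, which draws random tables with fixed row and column totals. Whether that matches the real sampler is an assumption. Under that same assumption the exact per-node probability is hypergeometric, so dhyper() is included for comparison.

```r
# Observed in-degrees per edge type (the five nodes shown in the question).
df <- data.frame(
  Node  = c("N1", "N2", "N3", "N4", "N5"),
  G_obs = c(3, 1, 9, 1, 1),
  R_obs = c(2, 0, 0, 4, 0)
)

# HYPOTHETICAL stand-in for the asker's mySample(): draws one random table
# that keeps every node's in-degree (row total) and the number of green and
# red edges (column totals) fixed, then returns the simulated green counts.
my_sample_stub <- function(d) {
  m <- as.matrix(d[, c("G_obs", "R_obs")])
  sim <- r2dtable(1, rowSums(m), colSums(m))[[1]]
  sim[, 1]
}

set.seed(1)    # only to make the sketch reproducible
n_sim <- 1000  # number of resamples, as suggested in the question

# One column per resample, one row per node.
g_sim <- replicate(n_sim, my_sample_stub(df))

# Share of resamples in which a node got exactly its observed green count.
df$prob_sim <- rowMeans(g_sim == df$G_obs)

# Exact counterpart under the same fixed-totals assumption.
df$prob_exact <- dhyper(df$G_obs, sum(df$G_obs), sum(df$R_obs),
                        df$G_obs + df$R_obs)
df
```

With enough resamples, prob_sim should land close to prob_exact. To use the real sampler, replace my_sample_stub(df) with a call that returns the simulated G_obs column from mySample(graph).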
Upvotes: 2