Reputation: 2684
I'm trying to implement an algorithm for generating graphs following Barabási-Albert (BA) model. Under this model, the degree distribution follows a power-law:
P(k) ~ k^-λ
Where the exponent λ should equal 3.
For simplicity, I will focus here on the R code, where I'm using igraph
functions. However I get networks with λ != 3. It seems that this has been a topic extensively covered (example question 1, eq2, eq3), but I haven't been able to find a satisfactory solution.
In R I use igraph:::sample_pa
function to generate a graph following the BA model. In the reproducible example below, I set
# Initialize
set.seed(1234)
order = 100
v_degrees = vector()
for (i in 1:10000) {
g <- sample_pa(order, power=3, m=8)
# Get degree distribution
d = degree(g, mode="all")
dd = degree_distribution(g, mode="all", cumulative=FALSE)
d = 1:max(d)
probability = dd[-1]
nonzero.position = which(probability !=0)
probability = probability[nonzero.position]
d = d[nonzero.position]
# Fit power law distribution and get gamma exponent
reg = lm (log(probability) ~ log(d))
cozf = coef(reg)
power.law.fit = function(x) exp(cozf[[1]] + cozf[[2]] * log(x))
gamma = -cozf[[2]]
v_degrees[i] = gamma
}
The graph seems to be scale free in fact, giving gamma=0.72±0.21 with order 100 and gamma=0.68±0.24 for order 10,000, and similar results varying the parameter m. But the exponent is clearly different from the expected gamma=3.
In fact I was trying to implement this model on a different language (C++, see code below), but I get similar results with exponents lower than 3. So I wonder if this is a common misunderstanding on the BA model or there's something wrong in the previous calculations fitting the power law distribution, of it contrarily to what is commonly expected this is the normal behavior of the BA model.
In case someone is interested or is more familiarized with C++, see appendix below.
Appendix: C++ code
For understanding the code below, assume an object class Graph
, and a connect
function that created an edge between two vertices passed as argument. Below I give code of two relevant functions BA_step and build_BA.
BA_step
void Graph::BA_step (int ID, int m, std::vector<double>& freqs) {
std::vector<int> connect_history;
vertices.push_back(ID);
// Connect node ID to a random node i with pi ~ ki / sum kj
while (connect_history.size() < m) {
double U (sample_prob()); // gets a value in the range [0,1)
int index (freqs[freqs.size()-1]);
for (int i(0); i<freqs.size(); ++i) {
if (U<=freqs[i]/index && !is_in(connect_history, i)) { // is_in checks if i exists in connect_history
connect(ID, i);
connect_history.push_back(i);
break;
}
}
}
// Update vector of absolute edge frequencies
for (int i(0); i<connect_history.size(); ++i) {
int index (connect_history[i]);
for (int j(index); j<freqs.size(); ++j) {
++freqs[j];
}
}
freqs.push_back(m+freqs[freqs.size()-1]);
}
build_BA
void Graph::build_BA (int m0, int m) {
// Initialization
std::vector<double> cum_nedges;
std::vector<int> connect_history;
for (int ID(0); ID<m0; ++ID) {
vertices.push_back(ID);
}
// Initial BA step
vertices.push_back(m0);
for (int i(0); i<m; ++i) {
connect(m0, i);
connect_history.push_back(i);
}
cum_nedges.push_back(1);
for (int i(1); i<m; ++i) cum_nedges.push_back(cum_nedges[cum_nedges.size()-1]+1);
cum_nedges.push_back(m+m);
// BA model
for (int ID(m0+1); ID<order; ++ID) {
BA_step(ID, m, cum_nedges);
}
}
Upvotes: 0
Views: 942
Reputation: 1648
Two things might help:
sample_pa
arguments to get exponent alpha = 3
Are really power = 1
and m = 1
(check definition in that wikipedia article against the igraph::sample_pa documentation---the power
argument doesn't mean the degree of the power-law distribution).
Just running OLS/LM on the degree distribution gives you an exponent closer to 0 than 3 (underestimated, in other words). Instead, if you use the igraph::power_law_fit
command with a high xmin
, you'll get answers closer to 3. Check Aaron Clauset's page and publications for more info on estimating power laws. Really you need to estimate an optimal x-min for every degree distribution.
Here's some code that'll work a bit better:
library(igraph)
set.seed(1234)
order = 10000
v_degrees = vector()
for (i in 1:100) {
g <- sample_pa(order, power = 1, m = 1)
d <- degree(g, mode="all")
v_degrees[i] <- fit_power_law(d, ceiling(mean(d))+100) %>% .$alpha
}
v_degrees %>% summary()
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.646 2.806 2.864 2.873 2.939 3.120
Note that I make up the x-min to use (ceiling(mean(d))+100
). Changing that will change your answers.
Upvotes: 1