Tasos
Tasos

Reputation: 7577

python plot and powerlaw fit

I have the following list:

[6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3, 2, 3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2]

I want to plot the frequency of each entity with python and make a powerlaw analysis on it.

But I cannot figure how I can plot the list with ylabel the frequency and xlabel the numbers on the list.

I thought to create a dict with the frequencies and plot the values of the dictionary, but with that way, I cannot put the numbers on xlabel.

Any advice?

Upvotes: 11

Views: 17324

Answers (4)

AcCap
AcCap

Reputation: 175

Use the package: powerlaw

import powerlaw
d=[6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3,2,  3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1,0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1,3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2]
fit = powerlaw.Fit(numpy.array(d)+1,xmin=1,discrete=True)
fit.power_law.plot_pdf( color= 'b',linestyle='--',label='fit ccdf')
fit.plot_pdf( color= 'b')

print('alpha= ',fit.power_law.alpha,'  sigma= ',fit.power_law.sigma)

alpha= 1.85885487521 sigma= 0.0858854875209

enter image description here

It allow to plot, fit and analyse the data correctly. It has as special method for fit on power law distributions with discrete data.

it can be installed with: pip install powerlaw

Upvotes: 13

elyase
elyase

Reputation: 40973

y = np.bincount([6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3, 2, 3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2])
x = np.nonzero(y)[0]
plt.bar(x,y)

enter image description here

Upvotes: 4

unutbu
unutbu

Reputation: 879869

import matplotlib.pyplot as plt
data = [6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3, 2, 3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2]

plt.hist(data, bins=range(max(data)+2))
plt.show()

enter image description here

Upvotes: -1

mgilson
mgilson

Reputation: 309959

I think you're right about the dictionary:

>>> import matplotlib.pyplot as plt
>>> from collections import Counter
>>> c = Counter([6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3, 2, 3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2])
>>> sorted(c.items())
[(0, 50), (1, 30), (2, 9), (3, 8), (4, 1), (5, 1), (6, 1)]
>>> plt.plot(*zip(*sorted(c.items()))
... )
[<matplotlib.lines.Line2D object at 0x36a9990>]
>>> plt.show()

There are a few pieces here that are of interest. zip(*sorted(c.items())) will return something like [(0,1,2,3,4,5,6),(50,30,9,8,1,1,1)]. We can unpack that using the * operator so that plt.plot sees 2 arguments -- (0,1,2,3,4,5,6) and (50,30,9,8,1,1,1). which are used as the x and y values in plotting respectively.

As for fitting the data, scipy will probably be of some help here. Specifically, have a look at the following examples. (one of the examples even uses a power law).

Upvotes: 5

Related Questions