user2519890

Reputation: 157

Creating a discrete cumulative distribution in Python

I am trying to create a cumulative distribution in Python but keep getting an AttributeError. My code reads:

import sys
import scipy.stats
import numpy 

def CDF_Random(N,NE,E,SE,S,SW,W,NW,Iterations):
    WindDir = [0,45,90,135,180,225,270,315]
    Freq = [N,NE,E,SE,S,SW,W,NW]

    cdf=scipy.stats.rv_discrete.cdf(WindDir,Freq)  
    cdf_rand=cdf.rvs(size=Iterations)    
    return (cdf_rand)

if __name__ == '__main__':
    N = float(sys.argv[1])
    NE = float(sys.argv[2])
    E = float(sys.argv[3])
    SE = float(sys.argv[4])
    S = float(sys.argv[5])
    SW = float(sys.argv[6])
    W = float(sys.argv[7])
    NW = float(sys.argv[8])
    Iterations = float(sys.argv[9])
    numpy.set_printoptions(threshold=Iterations)
    sys.stdout.write(str(CDF_Random(N,NE,E,SE,S,SW,W,NW,Iterations)))

The error I get depends on the values I use for WindDir and Freq. Sometimes they are both arrays, as in the code above; sometimes one or both of them is a single integer, and sometimes one is a number between 0 and 1.

AttributeError: 'int' object has no attribute '_fix_loc'

or

AttributeError: 'list' object has no attribute '_fix_loc'

or

AttributeError: 'float' object has no attribute '_fix_loc'

I have trawled through Google searches and this website but am having no luck. I also spent a long time varying my inputs and consulting the Python website.

Edit: inputs I have tried. Note that the code needs to be edited for some inputs, as the length of the input array varies. These are all run from the Command Prompt.

python C:\Users\...\python\CDF.py 0.01 0.02 0.03 0.4 0.98 0.99 1 5

This gives the following error:

AttributeError: 'list' object has no attribute '_fix_loc'

After editing the code:

import sys
import scipy.stats
import numpy

def CDF_Random():

    cdf=scipy.stats.rv_discrete.cdf(5,1)

    cdf_rand=cdf.rvs(size=Iterations)
    return (cdf_rand)

    return (cdf)

if __name__ == '__main__':

    sys.stdout.write(str(CDF_Random()))

The following error is returned

AttributeError: 'int' object has no attribute '_fix_loc'

def CDF_Random():

    cdf=scipy.stats.rv_discrete.cdf(0.5,1)

    cdf_rand=cdf.rvs(size=Iterations)
    return (cdf_rand)

    return (cdf)

if __name__ == '__main__':

    sys.stdout.write(str(CDF_Random()))

This error occurs

AttributeError: 'float' object has no attribute '_fix_loc'

I also tried other combinations, for example arrays as the first argument and integers or floats as the second:

cdf=scipy.stats.rv_discrete.cdf([array],0.5)
cdf=scipy.stats.rv_discrete.cdf([array],[array])
cdf=scipy.stats.rv_discrete.cdf(4,[array])
cdf=scipy.stats.rv_discrete.cdf([array],5)

Upvotes: 3

Views: 1806

Answers (1)

seth

Reputation: 1788

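scipy.stats.rv_discrete.cdf is a method that evaluates the CDF of an existing distribution at the points you give it. Called directly on the class, it treats your first argument as the distribution object, which would explain the '_fix_loc' AttributeError on an int, list or float. You have to create your distribution first. Try: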

mydist = scipy.stats.rv_discrete(name = 'mydistribution', values=(WindDir,Freq))

Note: Freq should actually be probabilities that sum to 1, so divide each member by the sum of Freq before passing it to rv_discrete.
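As a minimal sketch of that normalisation step (the raw frequencies below are made-up illustration values, not taken from the question, and the lowercase names are my own):

```python
import numpy as np
from scipy import stats

wind_dir = [0, 45, 90, 135, 180, 225, 270, 315]
# Hypothetical raw frequencies (e.g. counts per direction); not from the question.
freq = [12, 5, 3, 8, 20, 15, 25, 12]

# Divide by the total so the weights sum to 1, as rv_discrete expects.
probs = np.asarray(freq, dtype=float) / np.sum(freq)

# Create the distribution instance first, then draw samples from it.
mydist = stats.rv_discrete(name="mydistribution", values=(wind_dir, probs))
samples = mydist.rvs(size=10)
```

Every value in samples should be one of the eight directions in wind_dir.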

More explicitly, this code returns Iterations random variates from the distribution you make with WindDir and Freq (I changed the names slightly because I didn't like using sys.argv for testing).

import random
import numpy
import scipy.stats

def CDF_Random(probs, Iterations):
    # Support of the distribution: the eight compass directions in degrees.
    WindDir = [0, 45, 90, 135, 180, 225, 270, 315]
    Freq = probs  # must be probabilities that sum to 1
    # Build the distribution instance first, then draw from it.
    mydist = scipy.stats.rv_discrete(name='mydistribution', values=(WindDir, Freq))
    cdf_rand = mydist.rvs(size=Iterations)
    # To evaluate the CDF, call mydist.cdf(...) on this instance.
    return cdf_rand

if __name__ == '__main__':
    # Random integer weights, normalised so they sum to 1.
    weights = [random.randint(1, 10) for _ in range(8)]
    total = float(sum(weights))
    probs = [w / total for w in weights]
    Iterations = 30
    numpy.set_printoptions(threshold=Iterations)
    a = CDF_Random(probs, Iterations)

gives:

>>> a
array([  0, 270, 180, 180,   0, 180,  45,  45, 270, 270, 270,   0,  45,
        45, 180,  45, 180, 180, 270, 225,  45, 180, 270, 315, 225,  45,
       180, 180,   0,   0])
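To tie this back to the command-line script in the question, here is a rough sketch of how the nine arguments could be handled. The function name sample_wind_directions and the seed parameter are my own inventions for illustration, and I am assuming a reasonably recent SciPy where rvs accepts random_state. Note that the iteration count is converted with int rather than float, since recent NumPy versions reject a float size.

```python
import numpy as np
from scipy import stats

WIND_DIR = [0, 45, 90, 135, 180, 225, 270, 315]

def sample_wind_directions(argv, seed=None):
    """argv: eight frequencies followed by the number of draws, all as strings."""
    if len(argv) != 9:
        raise ValueError("expected 8 frequencies and an iteration count")
    freq = np.array([float(x) for x in argv[:8]])
    iterations = int(argv[8])  # size must be an integer, not a float
    probs = freq / freq.sum()  # normalise so the probabilities sum to 1
    dist = stats.rv_discrete(name="wind", values=(WIND_DIR, probs))
    return dist.rvs(size=iterations, random_state=seed)

# Example call with made-up arguments, as they would arrive from sys.argv[1:].
draws = sample_wind_directions(["1", "2", "3", "4", "5", "6", "7", "8", "30"], seed=0)
```

In a script you would pass sys.argv[1:] instead of the hard-coded list.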

If you want to evaluate the CDF of your distribution, use mydist.cdf([array of values to evaluate at here])

i.e.

>>> mydist.cdf([1,10,25,50,75,99])
array([ 0.1627907 ,  0.1627907 ,  0.1627907 ,  0.30232558,  0.30232558,
        0.39534884])
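The repeated values in that output come from the CDF being a step function: it only jumps at the values in WindDir. A small sketch of this, using made-up probabilities (not the random ones from the run above) so the numbers are easy to check:

```python
import numpy as np
from scipy import stats

wind_dir = np.array([0, 45, 90, 135, 180, 225, 270, 315])
# Hypothetical probabilities chosen for illustration; they sum to 1.
probs = np.array([0.10, 0.05, 0.05, 0.10, 0.20, 0.15, 0.25, 0.10])

mydist = stats.rv_discrete(name="mydistribution", values=(wind_dir, probs))

# At the support points, the CDF is the running total of the probabilities.
cdf_at_support = mydist.cdf(wind_dir)
running_total = np.cumsum(probs)

# Between support points the CDF stays flat: cdf(50) and cdf(89) equal cdf(45).
flat = mydist.cdf([45, 50, 89])
```

Here cdf_at_support matches running_total, and all three entries of flat are equal to P(0) + P(45).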

More thorough information can be found in the documentation, as well as in the docstring of your rv_discrete instance, i.e. print(mydist.__doc__).

Upvotes: 3
