Rohit

Reputation: 6020

Create random numbers with left skewed probability distribution

I would like to pick a random number between 1 and 100 such that numbers in 60-100 are more likely than numbers in 1-59.

The probabilities should follow a left-skewed distribution over 1-100, that is, one with a peak near the high end and a long tail toward the low end.

Something along the lines:

pers = np.arange(1,101,1)
prob = <left-skewed distribution>
number = np.random.choice(pers, 1, p=prob)

I do not know how to generate a left-skewed discrete probability function. Any ideas? Thanks!

Upvotes: 11

Views: 16327

Answers (4)

JeeyCi

Reputation: 579

You can take any suitable distribution from numpy.random or scipy.stats and transform or reflect it as needed.

from scipy.stats import skewnorm
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 1)
a = -10  # a negative shape parameter gives a left skew
r = skewnorm.rvs(a, size=1000)
ax.hist(r, density=True, bins='auto', histtype='stepfilled', alpha=0.2)
plt.show()

Alternatively, take a right-skewed distribution such as the lognormal, multiply the samples by -1 to reflect them, and shift them to the right along the x-axis. The result is a negatively skewed (left-skewed) distribution with a long tail on the left side.

import numpy as np
import matplotlib.pyplot as plt

shift = 100
mu, sigma = 3., 0.5  # mean and standard deviation of the underlying normal distribution
s = -1 * np.random.lognormal(mu, sigma, 1000) + shift

# Display the histogram of the samples
count, bins, ignored = plt.hist(s, 100, align='mid')
plt.axis('tight')
plt.show()
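The reflected lognormal samples above are floats, while the question asks for integers between 1 and 100. A minimal sketch of one way to get there, assuming it is acceptable to round each sample and discard the few that fall outside the range (the seed, the sample size, and the names `numbers` and `share_high` are only illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

shift = 100
mu, sigma = 3.0, 0.5
s = shift - rng.lognormal(mu, sigma, 10_000)  # reflected and shifted lognormal

# Round to the nearest integer, then drop anything outside 1..100.
# Dropping is a choice made for this sketch; clipping would instead
# pile the out-of-range samples onto the end values.
numbers = np.rint(s).astype(int)
numbers = numbers[(numbers >= 1) & (numbers <= 100)]

# Fraction of the kept samples that land in 60..100
share_high = np.mean(numbers >= 60)
print(share_high)
```

With these parameters most of the mass sits in 60-100; changing `mu`, `sigma`, or `shift` moves the peak and stretches or shortens the tail.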

Upvotes: 0

Aaron Horvitz

Reputation: 186

You can do this with SciPy's skewnorm distribution, which can produce samples skewed either to the left or to the right. The code below draws the samples and rescales them to the range 0 to maxValue.

from scipy.stats import skewnorm
import matplotlib.pyplot as plt

numValues = 10000
maxValue = 100
skewness = -5   # Negative values are left skewed, positive values are right skewed.

random = skewnorm.rvs(a = skewness,loc=maxValue, size=numValues)  # Draw samples from the skew-normal distribution

random = random - min(random)      # Shift the set so the minimum value is equal to zero.
random = random / max(random)      # Standardize all the values between 0 and 1.
random = random * maxValue         # Multiply the standardized values by the maximum value.

#Plot histogram to check skewness
plt.hist(random,30,density=True, color = 'red', alpha=0.1)
plt.show()

Please reference the documentation here: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.skewnorm.html

Histogram of left-skewed distribution

The code generates the following plot.

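If you want a discrete probability vector for np.random.choice, as in the question, rather than rescaled continuous samples, one option is to evaluate the skew-normal density at each integer and normalise. This is only a sketch: the values of `loc` and `scale` below are hypothetical and were picked just to put the peak in the upper part of the range.

```python
import numpy as np
from scipy.stats import skewnorm

pers = np.arange(1, 101)

# Hypothetical shape parameters, chosen only for illustration.
a, loc, scale = -5, 95, 25

# Evaluate the density on the integer grid and normalise so the
# weights form a valid probability vector over 1..100.
weights = skewnorm.pdf(pers, a, loc=loc, scale=scale)
prob = weights / weights.sum()

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
number = rng.choice(pers, 1, p=prob)
print(number)
```

Normalising after evaluating the density also takes care of the part of the continuous distribution that falls outside 1-100.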

Upvotes: 13

Ryan

Reputation: 231

The p argument of np.random.choice is the probability associated with each element in the array in the first argument. So something like:

    np.random.choice(pers, 1, p=[0.01, 0.01, 0.01, 0.01, ..... , 0.02, 0.02])

Here 0.01 stands for the lower probability for 1-59 and 0.02 for the higher probability for 60-100. These values are only illustrative: the full list must contain 100 entries that sum to 1.
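As a rough sketch of how the full vector could be built: 59 entries of 0.01 plus 41 entries of 0.02 add up to 1.41, and np.random.choice raises a ValueError when p does not sum to 1, so the raw weights are normalised first. The names `raw` and `prob` are just for this example.

```python
import numpy as np

pers = np.arange(1, 101)

# Raw weights: 0.01 for 1..59 and 0.02 for 60..100 (these sum to 1.41)
raw = np.where(pers >= 60, 0.02, 0.01)

# Normalise so the weights sum to 1, as np.random.choice requires
prob = raw / raw.sum()

number = np.random.choice(pers, 1, p=prob)
print(number)
```

After normalising, each number in 60-100 is still exactly twice as likely as each number in 1-59.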

The NumPy documentation for np.random.choice has some useful examples.

http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.random.choice.html

EDIT: You might also try this link and look for a distribution (about halfway down the page) that fits the model you are looking for.

http://docs.scipy.org/doc/scipy/reference/stats.html

Upvotes: 3

nicolas

Reputation: 3280

As you described, just make sure your skewed distribution adds up to 1.0:

import numpy as np

pers = np.arange(1, 101, 1)

# Make each of the last 41 elements (60-100) 5x more likely
prob = np.array([1.0] * (len(pers) - 41) + [5.0] * 41)

# Normalising to 1.0
prob /= np.sum(prob)

number = np.random.choice(pers, 1, p=prob)
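To see what these weights mean in practice: the chance of landing in 60-100 is 41·5 / (59·1 + 41·5) = 205/264, roughly 0.78. A quick check, assuming a seeded generator and 100,000 draws just for the sake of the example:

```python
import numpy as np

pers = np.arange(1, 101)
prob = np.array([1.0] * 59 + [5.0] * 41)
prob /= prob.sum()

# Exact probability of drawing a number in 60..100: 205/264
expected = prob[pers >= 60].sum()

# Empirical frequency from many draws (seeded only for reproducibility)
rng = np.random.default_rng(0)
draws = rng.choice(pers, 100_000, p=prob)
observed = np.mean(draws >= 60)

print(expected, observed)
```

Note that this is a two-level step function rather than a smoothly skewed curve, so it satisfies the "60-100 is more likely" requirement but has no peak or tail.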

Upvotes: 5
