Jayjay95

Reputation: 199

Problems with computing the entropy of random variables

I've been trying to figure out how to calculate the entropy of a random variable X using

sp.stats.entropy()

from SciPy's stats package. The random variable X is the series of returns I obtain from the stock of a specific company ("Company 1") from 1997 to 2012 (this is for a financial data/machine learning assignment). However, the function takes the probability values

pk

as an argument, and so far I'm struggling even to compute the empirical probabilities, since all I have are the observations of the random variable. I've tried different ways of normalising the data to obtain an array of probabilities, but my data also contains negative values, so when I do

asset1/np.sum(asset1)

where asset1 is the row array of the returns of the stock of "Company 1", I do obtain a new array that sums to 1, but it contains some negative entries, and negative probabilities do not exist. Is there any way in Python to compute the empirical probabilities of my observations (ideally with the option of choosing specific bins, or a range of values)?
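To make the goal concrete, this is roughly the kind of workflow I was hoping for, binning the returns and normalising the counts (the 50-bin choice is arbitrary and asset1 stands in for my actual data), though I'm not sure whether this is a sound way of defining the probabilities:

    import numpy as np
    from scipy import stats

    asset1 = np.random.randn(4000) * 0.02             # placeholder for the actual returns of "Company 1"
    counts, bin_edges = np.histogram(asset1, bins=50)  # bin count chosen arbitrarily
    pk = counts.astype(float) / counts.sum()           # non-negative and sums to 1
    H = stats.entropy(pk)                              # Shannon entropy in nats (pass base=2 for bits)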

EDIT: stock returns are considered to be RANDOM VARIABLES, as opposed to stock prices, which are processes. Therefore, entropy can definitely be applied in this context.

Upvotes: 0

Views: 1100

Answers (1)

Paul Brodersen

Reputation: 13041

For continuous distributions, you are better off using the Kozachenko-Leonenko k-nearest-neighbour estimator for entropy (K & L 1987) and the corresponding Kraskov, ..., Grassberger (2004) estimator for mutual information. These circumvent the intermediate step of calculating the probability density function, and estimate the entropy directly from the distances of the data points to their k-th nearest neighbours.

The basic idea of the Kozachenko-Leonenko estimator is to look at (some function of) the average distance between neighbouring data points. The intuition is that if that distance is large, the dispersion in your data is large, and hence the entropy is large. In practice, instead of the nearest-neighbour distance, one typically uses the distance to the k-th nearest neighbour, which makes the estimate more robust.
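As a rough sketch of this idea (not the code from the repository linked below; the function name kl_entropy, the default k=4, and the use of Euclidean distances are just illustrative choices), something along these lines can be put together with SciPy's k-d tree:

    import numpy as np
    from scipy.spatial import cKDTree
    from scipy.special import digamma, gamma

    def kl_entropy(x, k=4):
        # Kozachenko-Leonenko k-nearest-neighbour entropy estimate (in nats) for
        # samples x of shape (n_samples, n_dims); duplicate points would need jitter.
        x = np.atleast_2d(x)
        if x.shape[0] == 1:               # treat a 1-D array of returns as n samples in 1 dimension
            x = x.T
        n, d = x.shape
        tree = cKDTree(x)
        dist, _ = tree.query(x, k=k + 1)  # k + 1 because each point is its own nearest neighbour
        eps = dist[:, -1]                 # distance to the k-th nearest neighbour
        log_ball_volume = (d / 2.0) * np.log(np.pi) - np.log(gamma(d / 2.0 + 1))
        return digamma(n) - digamma(k) + log_ball_volume + d * np.mean(np.log(eps))

For a 1-D array of returns, kl_entropy(asset1) would then give an entropy estimate without any binning.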

I have implementations for both on my github: https://github.com/paulbrodersen/entropy_estimators

The code has only been tested with Python 2.7, but I would be surprised if it didn't run on 3.x.

Upvotes: 1
