gnarls

Reputation: 23

Why do scipy.stats.rv_discrete mean and expect show unexpected difference?

I have a scipy stats rv_discrete distribution where the expect method with the defaults fails to match the mean.

When I try with a toy example, it works as expected:

from scipy.stats import rv_discrete
x = (1.0, 3.0, 4.0, 6.0)
px = (0.1, 0.6, 0.2, 0.1)
dx = rv_discrete(values=(x, px))

print(dx.mean(), dx.expect(), dx.expect(lb=5.0, conditional=True))

Gives:

3.3000000000000003 3.3000000000000003 5.999999999999996

But with my actual distribution, the two disagree:

y = [200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0, 1000.0, 1100.0, 1200.0, 1300.0, 1400.0, 1500.0, 1600.0, 1700.0, 1800.0, 1900.0, 2000.0, 2100.0, 2200.0, 2300.0, 2400.0, 2500.0, 2600.0, 2700.0, 2800.0, 2900.0, 3000.0, 3100.0, 3200.0, 3300.0, 3400.0, 3500.0, 3600.0, 3700.0, 3800.0, 3900.0, 4000.0, 4100.0, 4200.0, 4300.0, 4400.0, 4500.0, 4600.0, 4700.0, 4800.0]
py = [0.0004, 0.0, 0.0033, 0.006500000000000001, 0.0, 0.0, 0.004399999999999999, 0.6862, 0.0, 0.0, 0.0, 0.00019999999999997797, 0.0006000000000000449, 0.024499999999999966, 0.006400000000000072, 0.0043999999999999595, 0.019499999999999962, 0.03770000000000007, 0.01759999999999995, 0.015199999999999991, 0.018100000000000005, 0.04500000000000004, 0.0025999999999999357, 0.0, 0.0041000000000001036, 0.005999999999999894, 0.0042000000000000925, 0.0050000000000000044, 0.0041999999999999815, 0.0004999999999999449, 0.009199999999999986, 0.008200000000000096, 0.0, 0.0, 0.0046999999999999265, 0.0019000000000000128, 0.0006000000000000449, 0.02510000000000001, 0.0, 0.007199999999999984, 0.0, 0.012699999999999934, 0.0, 0.0, 0.008199999999999985, 0.005600000000000049, 0.0]

dy = rv_discrete(values=(y, py))

print(dy.mean(), dy.expect(), dy.expect(lb=1000.0, conditional=True))

Gives:

1400.79 617.58 0.0

Any ideas why dy.expect() doesn't match dy.mean()?

Upvotes: 2

Views: 363

Answers (1)

ev-br

Reputation: 26040

This is a scipy bug. The chunksize dependence is a smoking gun: there is no reason to do chunked iteration for such short sequences. The bug affects scipy versions 1.5.2 and below.

A suggested fix is at https://github.com/scipy/scipy/pull/12659
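
Until you can upgrade, one possible workaround is to compute the expectation directly from the support points and their probabilities, bypassing expect() entirely. The following is only a minimal sketch and not part of the linked fix: discrete_expect is a hypothetical helper name, and it assumes the bounds are inclusive (lb <= x <= ub), mirroring how expect() documents lb and ub. It is shown on the toy data from the question and needs only the standard library.

```python
import math


def discrete_expect(xk, pk, lb=None, ub=None, conditional=False):
    """Sum x * p over support points with lb <= x <= ub (hypothetical helper).

    If conditional is True, divide by the probability mass inside the bounds.
    """
    lo = -math.inf if lb is None else lb
    hi = math.inf if ub is None else ub

    # Keep only the support points that fall inside the requested interval.
    pairs = [(x, p) for x, p in zip(xk, pk) if lo <= x <= hi]

    total = math.fsum(x * p for x, p in pairs)
    if conditional:
        mass = math.fsum(p for _, p in pairs)
        if mass == 0.0:
            raise ValueError("no probability mass between lb and ub")
        total /= mass
    return total


# Toy distribution from the question.
x = (1.0, 3.0, 4.0, 6.0)
px = (0.1, 0.6, 0.2, 0.1)

# Should be approximately 3.3 and 6.0, up to floating-point rounding.
print(discrete_expect(x, px))
print(discrete_expect(x, px, lb=5.0, conditional=True))
```

Calling discrete_expect(y, py) with the arrays from the question gives a value you can compare against dy.mean() to check whether your installed scipy is affected.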

Upvotes: 2
