Reputation: 119
I'm trying to add python to my repertoire (R is my program of choice) and am having an issue with a simple line plot.
While the generated array (in this case, y) is of float type (which I want), when I plot a simple line plot using matplotlib, that same y is now truncated to the nearest whole integer.
Any help would be appreciated.
Thanks. Here's sample code. P.S. Any hints as to cleaning up the code would also be more than welcome.
import sys
import numpy as np
from numpy import random
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as matplotlib
plt.style.use('ggplot')
greens = np.array([0,0])
others = np.array(np.arange(1,37))
# no axis provided, array elements will be flattened
roulette = np.append(greens, others)
spins1000 = np.array(random.choice(roulette, size=(1000)))
# Create function for cum mean in python
def cum_mean(arr):
    cum_sum = np.cumsum(arr, axis=0)
    for i in range(cum_sum.shape[0]):
        if i == 0:
            continue
        print(cum_sum[i] / (i + 1))
        cum_sum[i] = cum_sum[i] / (i + 1)
    return cum_sum
y = np.array(cum_mean(spins1000))
x = np.array(np.arange(1,1001))
fig, ax = plt.subplots(figsize=(10, 6))
ax.set(xlim=(0, 1000), ylim=(10.00, 25.00))
line = ax.plot(x, y, color='red', lw=1)[0]
plt.draw()
plt.show()
Upvotes: 0
Views: 869
Reputation: 80289
There are two things happening, which in combination cause the strange behavior.
First, cum_sum = np.cumsum(arr, axis=0) with arr being an array of integers makes cum_sum an array of integers as well.
Second, cum_sum[i] = cum_sum[i] / (i + 1) stores the result (which is a float) into that integer array; the assignment casts the value back to an integer, which drops the fractional part. The print inside the loop shows the float before it is stored, which is why the printed values look right while the plotted ones do not.
One solution is to create cum_sum as a float array (as in cum_sum = np.cumsum(arr, dtype=float)). Another is to do things "the numpy way" and create a new array in one go: return cum_sum / np.arange(1, cum_sum.shape[0] + 1). Note that numpy's array operations are vectorized, so dividing an array by an array gives the same result as dividing element by element. It also runs considerably faster than the explicit loop (similar to what happens in R).
Also, if you wrote cum_sum = cum_sum / np.arange(1, 1001), cum_sum would be rebound to a new float array. Only when you assign into it element by element does the array stay an array of integers. Note that np.arange() already creates a numpy array, so calling np.array on it again doesn't change anything.
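To make the dtype behavior concrete, here is a minimal sketch on a tiny made-up array (the values 3, 4, 8 are chosen only for illustration and are not part of the question's data). It compares element-wise assignment into an integer array, the same loop on a float array, and the vectorized division:

```python
import numpy as np

arr = np.array([3, 4, 8])   # integer input, made up for illustration
cum_sum = np.cumsum(arr)    # inherits the integer dtype from arr

# Element-wise assignment keeps the integer dtype,
# so each float result loses its fractional part when stored.
truncated = cum_sum.copy()
for i in range(1, truncated.shape[0]):
    truncated[i] = truncated[i] / (i + 1)

# Asking cumsum for floats up front keeps the fractional part.
as_float = np.cumsum(arr, dtype=float)
for i in range(1, as_float.shape[0]):
    as_float[i] = as_float[i] / (i + 1)

# Dividing whole arrays creates a new float array in one step.
vectorized = cum_sum / np.arange(1, cum_sum.shape[0] + 1)

print(truncated.dtype, truncated)    # integer array
print(as_float.dtype, as_float)      # float array
print(vectorized.dtype, vectorized)  # float array
```

The second entry is the one to look at: the running mean of 3 and 4 is 3.5, which survives in the two float arrays but not in the integer one.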
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('ggplot')
greens = np.array([0, 0])
others = np.arange(1, 37)
# no axis provided, array elements will be flattened
roulette = np.append(greens, others)
spins1000 = np.random.choice(roulette, size=1000)
# Create function for cum mean in python
def cum_mean(arr):
    cum_sum = np.cumsum(arr)
    return cum_sum / np.arange(1, cum_sum.shape[0] + 1)
y = cum_mean(spins1000)
x = np.arange(1, 1001)
fig, ax = plt.subplots(figsize=(10, 6))
ax.set(xlim=(0, 1000), ylim=(10.00, 25.00))
line = ax.plot(x, y, color='red', lw=1)[0]
plt.show()
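If you want to convince yourself that the vectorized cum_mean really is a running mean, one possible check is sketched below. The seeded generator and the prefix-by-prefix comparison are additions for this check only (the seed 0 is arbitrary), and plotting is left out so it runs without a display:

```python
import numpy as np

def cum_mean(arr):
    cum_sum = np.cumsum(arr)
    return cum_sum / np.arange(1, cum_sum.shape[0] + 1)

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
roulette = np.append(np.array([0, 0]), np.arange(1, 37))
spins = rng.choice(roulette, size=1000)

y = cum_mean(spins)

# Reference: mean of the first k spins, computed one prefix at a time.
expected = np.array([spins[:k].mean() for k in range(1, spins.shape[0] + 1)])

print(y.dtype)                   # float, even though spins is integer
print(np.allclose(y, expected))  # do both approaches agree?
```

The reference computation is quadratic in the number of spins, so it is only meant as a check, not as a replacement for the cumsum approach.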
Upvotes: 1