Reputation: 1473
I'm very new to Python programming, so I'm trying to learn Python from a book called 'Python Crash Course'. A problem occurred while I was working with the fill_between method in matplotlib. Here is my code.
import csv
from datetime import datetime
from matplotlib import pyplot as plt
# Read min, max temperatures from the file
filename = 'sitka_weather_2014.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
    dates, highs, lows = [], [], []
    for row in reader:
        current_date = datetime.strptime(row[0], "%Y-%m-%d")
        dates.append(current_date)
        high = int(row[1])
        highs.append(row[1])
        low = int(row[3])
        lows.append(low)
# plotting the data
fig = plt.figure(dpi=128, figsize=(12, 6))
plt.plot(dates, highs, c='red', alpha=0.5)
plt.plot(dates, lows, c='blue', alpha=0.5)
plt.fill_between(dates, highs, lows, facecolor='blue', alpha=0.1)
# deciding graph format
plt.title("Daily high and low temperature - 2014", fontsize=24)
plt.xlabel('Numbers', fontsize=14)
fig.autofmt_xdate()
plt.ylabel("Temperature (F)", fontsize=16)
plt.tick_params(axis='both', which='major', labelsize=16)
plt.show()
The code above tries to plot the temperature data. When I run it, PyCharm gives me this traceback.
Traceback (most recent call last):
  File "C:/pyCharm(sang)/highs_lows.py", line 28, in <module>
    plt.fill_between(dates, highs, lows, facecolor='blue', alpha=0.1)
  File "C:\Users\John Jung\AppData\Roaming\Python\Python36\site-packages\matplotlib\pyplot.py", line 3000, in fill_between
    **kwargs)
  File "C:\Users\John Jung\AppData\Roaming\Python\Python36\site-packages\matplotlib\__init__.py", line 1898, in inner
    return func(ax, *args, **kwargs)
  File "C:\Users\John Jung\AppData\Roaming\Python\Python36\site-packages\matplotlib\axes\_axes.py", line 4779, in fill_between
    y1 = ma.masked_invalid(self.convert_yunits(y1))
  File "C:\Users\John Jung\AppData\Roaming\Python\Python36\site-packages\numpy\ma\core.py", line 2388, in masked_invalid
    condition = ~(np.isfinite(a))
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
I'm using PyCharm on Windows 10 now, and there it doesn't work. But I ran exactly the same code on my Mac last night and it worked like magic. Why does this fail on Windows but not on the Mac? What is the problem with fill_between?
Thanks in advance guys!
Update: here is a sample of sitka_weather_2014.csv
AKST,Max TemperatureF,Mean TemperatureF,Min TemperatureF,Max Dew PointF,MeanDew PointF,Min DewpointF,Max Humidity, Mean Humidity, Min Humidity, Max Sea Level PressureIn, Mean Sea Level PressureIn, Min Sea Level PressureIn, Max VisibilityMiles, Mean VisibilityMiles, Min VisibilityMiles, Max Wind SpeedMPH, Mean Wind SpeedMPH, Max Gust SpeedMPH,PrecipitationIn, CloudCover, Events, WindDirDegrees
2014-1-1,46,42,37,40,38,36,97,86,76,29.95,29.77,29.57,10,8,2,25,14,36,0.69,8,Rain,138
2014-1-2,41,38,35,38,35,32,97,89,76,30.09,29.90,29.81,10,9,4,14,7,22,0.34,8,Rain,92
2014-1-3,39,36,34,38,36,33,100,97,93,30.43,30.32,30.10,10,9,2,8,3,,0.02,7,Rain,102
2014-1-4,43,38,34,35,33,31,97,82,62,30.43,30.32,30.20,10,10,10,20,6,25,0.00,6,Rain,107
2014-1-5,44,42,41,42,36,32,97,77,63,30.20,30.02,29.88,10,8,2,26,17,36,0.37,8,Rain,113
Upvotes: 1
Views: 374
Reputation: 1780
The error message here doesn't really help identify the root issue. The problem is in the code that pulls the data from the csv file. Here's the relevant part of that code:
dates, highs, lows = [], [], []
for row in reader:
    current_date = datetime.strptime(row[0], "%Y-%m-%d")
    dates.append(current_date)
    high = int(row[1])
    highs.append(row[1])
    low = int(row[3])
    lows.append(low)
When the data is read from the csv file, it's initially read in as a string. For each piece of data we're going to use, we need to convert it to the appropriate type for plotting. So the date gets converted to a datetime object, and each temperature gets converted to an int. In your code, you've converted the high temperature to an int, but you haven't used that int. This line:
highs.append(row[1])
should be changed to:
highs.append(high)
If you make this change, I think you'll see the correct visualization. I wonder if isfinite() was rejecting the string data?
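To check that guess, here is a minimal sketch that parses the first rows of the CSV sample from the question (trimmed to four columns and held in an in-memory `io.StringIO`, which is my own stand-in for the real file) and passes both versions of the list to `np.isfinite`, the call shown at the bottom of the traceback. The names `raw_highs`, `int_highs` and `string_check_failed` are made up for this illustration.

```python
import csv
import io

import numpy as np

# First rows of the question's CSV, trimmed to the four columns used here.
sample = io.StringIO(
    "AKST,Max TemperatureF,Mean TemperatureF,Min TemperatureF\n"
    "2014-1-1,46,42,37\n"
    "2014-1-2,41,38,35\n"
    "2014-1-3,39,36,34\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row

raw_highs, int_highs = [], []
for row in reader:
    raw_highs.append(row[1])       # what the question's code stored: str
    int_highs.append(int(row[1]))  # what the fix stores: int

print(raw_highs)  # ['46', '41', '39']
print(int_highs)  # [46, 41, 39]

# np.isfinite has no implementation for string arrays, so it raises TypeError.
try:
    np.isfinite(np.asarray(raw_highs))
    string_check_failed = False
except TypeError as err:
    string_check_failed = True
    print("TypeError:", err)

# With integers the same call succeeds and returns a boolean array.
print(np.isfinite(np.asarray(int_highs)))
```

If this behaves the same way on your machine, the string values in highs are enough to explain the traceback.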
Note: I'm the author of Python Crash Course, and I try to keep an eye out for posts like this. I really want to know that the projects are continuing to work for people, and update them if they are becoming out of date at all.
Upvotes: 0
Reputation: 3094
It's as I "feared" in the comments. NumPy's isfinite does not support objects. Because of that you will have to plot with the MJD (Modified Julian Date, or whatever purely numerical date format you feel comfortable with) and use a tick formatter to make it look like an ordinary date.
You can do that by doing
numdates = []
for date in dates:
    numdates.append(date.toordinal())
or
import matplotlib.dates
numdates = matplotlib.dates.date2num(dates)
then you can easily do
plt.plot(numdates, highs, c="red")
plt.plot(numdates, lows, c="blue")
plt.fill_between(numdates, highs, lows, facecolor="blue", alpha=0.1)
plt.show()
Of course, now you will notice that your x-axis is not in an easy-to-read format. Instead it shows a large number counting the days since some reference date (which date depends on the conversion you used and on the matplotlib version).
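To see what those numbers are, here is a small standard-library sketch of the toordinal() route. The five dates are placeholders matching the CSV sample in the question, and the names `steps` and `labels` are my own. It only shows that the values are day counts that can be turned back into readable dates; it does not touch matplotlib.

```python
from datetime import date, datetime

dates = [datetime(2014, 1, day) for day in range(1, 6)]

# toordinal() counts days, with 0001-01-01 as day 1.
numdates = [d.toordinal() for d in dates]
print(numdates[0])

# Consecutive calendar days differ by exactly 1.
steps = [b - a for a, b in zip(numdates, numdates[1:])]
print(steps)  # [1, 1, 1, 1]

# fromordinal() reverses the conversion, which is all a tick label needs.
labels = [date.fromordinal(n).isoformat() for n in numdates]
print(labels[0])  # 2014-01-01
```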
An easy fix for that, which doesn't always work, is to use plot_date like so:
plt.plot_date(dates, highs, c="red", ls="-")
plt.plot_date(dates, lows, c="blue", ls="-")
plt.fill_between(numdates, highs, lows, facecolor="blue", alpha=0.1)
plt.show()
Notice that I explicitly declared the linestyle (ls) for plot_date, because plot_date by default draws a scatter plot of points. Notice also that dates is used for plot_date but numdates is used for fill_between, and the plot still works. This is because plot_date just tries to guess the DateFormatter for you in the background, while the actual numbers are the same as in the first example.
Unfortunately, the formatting plot_date picks can sometimes be a bit off. In that case I recommend you just brave the DateFormatter yourself; it's not that bad. If you want to hide the circles that get drawn for the dates and leave only the line visible, you can add marker="," to the plot_date commands. This draws a single pixel for each scatter point, so it's hidden by the line. The matplotlib marker documentation has more on this, and the plot_date documentation lists its other options.
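For reference, setting the DateFormatter by hand could look roughly like the sketch below. It uses the five rows from the question's CSV sample as placeholder data, and the "%Y-%m-%d" format, the DayLocator and the output filename are my own choices, not something the original code requires.

```python
from datetime import datetime

import matplotlib.dates as mdates
from matplotlib import pyplot as plt

# Placeholder data taken from the CSV sample in the question.
dates = [datetime(2014, 1, day) for day in range(1, 6)]
highs = [46, 41, 39, 43, 44]
lows = [37, 35, 34, 34, 41]

# Purely numerical x values, as in the examples above.
numdates = mdates.date2num(dates)

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(numdates, highs, c="red", alpha=0.5)
ax.plot(numdates, lows, c="blue", alpha=0.5)
ax.fill_between(numdates, highs, lows, facecolor="blue", alpha=0.1)

# Tell the x-axis how to place the ticks and how to print the numbers as dates.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()

fig.savefig("highs_lows.png")  # or plt.show() in an interactive session
```

With a full year of data you would probably swap DayLocator for something coarser such as MonthLocator, otherwise the tick labels overlap.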
As to why this only happens sometimes: nine times out of ten it's related to the NumPy and Python versions. This will pop up with numpy 1.12.1, matplotlib 2.0.2 and Python 3.6. I suspect that it would also happen with an older Python version (e.g. 2.7) and that it might not happen with some versions of numpy/matplotlib. All of this is speculation, of course. As far as "why" goes, I think this is a good decision on NumPy's part, but matplotlib should be reworked to hide this from the user. If you want, you can try pinging them on GitHub to see what they have to say about it. If you don't want to, say so; I'm quite interested in seeing why.
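If you want to test the version theory, one simple way is to print the versions on both the Mac and the Windows machine and compare them. This is only a sketch; the `versions` dictionary is just a convenience of mine.

```python
import sys

import matplotlib
import numpy

# Collect the versions that matter for this error.
versions = {
    "python": sys.version.split()[0],
    "numpy": numpy.__version__,
    "matplotlib": matplotlib.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```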
Upvotes: 1