Reputation: 932
I've just combined two columns ('date' and 'hour') from a .csv file into one, and I'm trying to plot one of the other columns with matplotlib. Everything worked without errors until I combined the two columns. I've searched around and tried different things, but I can't figure it out.
I have some sample data located at https://pastebin.com/j31zm3Xd. The columns are 'date', 'hour', 'open', 'high', 'low', 'close', and 'volume'.
Here is the code:
import pandas as pd
import matplotlib.pyplot as plt

dataset = pd.read_csv(path)
# change the date and hour columns to datetime columns:
dataset['date'] = pd.to_datetime(dataset['date'])
dataset['hour'] = pd.to_datetime(dataset['hour'], format='%H').dt.time
# convert to strings and concatenate date and hour columns as one column:
dataset['datetime'] = pd.to_datetime(dataset.date.astype(str) + ' ' + dataset.hour.astype(str))
# drop the 'date' and 'hour' columns as they are no longer needed:
dataset.drop(['date', 'hour'], axis=1, inplace=True)
# since the 'datetime' column is at the end, move it to the front as the first column:
col = dataset.pop('datetime')
dataset.insert(0, 'datetime', col)
To convert the string 'datetime' column back to a pandas datetime column, I tried these three approaches, all with the same result:
dataset['datetime'] = pd.to_datetime(dataset['datetime'])
dataset['datetime'] = dataset['datetime'].astype('datetime64[ns]')
dataset.datetime = pd.to_datetime(dataset.datetime)
Plotting the data for the 'high' column:
dataset[:].plot(y='high', linewidth=1)
plt.grid(which='both')
plt.show()
All three approaches resulted in "KeyError: 'high'", as if 'high' were not one of the columns.
Now here's the kicker: if you try to plot any of the other columns, it works like a charm. Only the 'high' column fails.
[Figure: plot of the other columns]
I don't believe it's specifically a matplotlib problem, because I also get an error just referencing the 'high' column in code, for example when listing the columns:
dataset.columns('open', 'high', 'low', 'close', 'volume')
I don't know what's wrong here. Help
Upvotes: 0
Views: 120
Reputation: 320
dataset.columns
gives this output:
Index(['datetime', 'open', 'high ', 'low', 'close', 'volume'], dtype='object')
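As you can see, the column name is actually 'high ' with a trailing space, so the key 'high' does not exist in the DataFrame. If you use the name exactly as it appears in the index: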
dataset[:].plot(y='high ', linewidth=1)
plt.grid(which='both')
plt.show()
the plot works.
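Rather than typing the trailing space every time, it may be easier to clean the column names once after loading. The following is a minimal sketch, assuming the only problem is stray whitespace in the header row; the inline CSV text is made-up sample data standing in for the real file, not the contents of the pastebin.

```python
import io
import pandas as pd

# Hypothetical sample data; note the trailing space in the "high " header.
csv_text = (
    "date,hour,open,high ,low,close,volume\n"
    "2020-01-02,9,10.0,11.0,9.5,10.5,1000\n"
    "2020-01-02,10,10.5,11.5,10.0,11.0,1200\n"
)

dataset = pd.read_csv(io.StringIO(csv_text))

# repr() makes stray whitespace in the header names visible.
print([repr(name) for name in dataset.columns])

# Strip leading and trailing whitespace from every column name.
dataset.columns = dataset.columns.str.strip()

# The plain 'high' key now resolves without a KeyError.
high = dataset["high"]
print(high.tolist())
```

After this, dataset.plot(y='high', linewidth=1) should find the column under its plain name.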
Upvotes: 1