brohjoe

Reputation: 932

KeyError after combining date and hour columns, matplotlib won't recognize one of my columns, others work fine

I've just combined two columns ('date' and 'hour') from a .csv file and I'm attempting to plot out one of the columns in matplotlib. Everything worked fine up until I combined the two columns to make one column. Prior to that, I didn't have any errors. I've searched around and tried different things but I can't quite figure it out.

I have some sample data at https://pastebin.com/j31zm3Xd. The columns are 'date', 'hour', 'open', 'high', 'low', 'close', and 'volume'.

Here is the code:

import pandas as pd
import matplotlib.pyplot as plt

dataset = pd.read_csv(path)
# change the date and hour columns to datetime columns:
dataset['date'] = pd.to_datetime(dataset['date'])
dataset['hour'] = pd.to_datetime(dataset['hour'], format='%H').dt.time

# convert to strings and concatenate date and hour columns as one column:
dataset['datetime'] = pd.to_datetime(dataset.date.astype(str) + ' ' + dataset.hour.astype(str))

# drop the 'date' and 'hour' columns as they are no longer needed:
dataset.drop(['date', 'hour'], axis=1, inplace=True)

# since the 'datetime' column is at the end, move it to the front as the first column:
col = dataset.pop('datetime')
dataset.insert(0, 'datetime', col)

To convert the 'datetime' column to a pandas datetime column, I tried these three approaches, all with the same result:

dataset['datetime'] = pd.to_datetime(dataset['datetime'])
dataset['datetime'] = dataset['datetime'].astype('datetime64[ns]')
dataset.datetime = pd.to_datetime(dataset.datetime)

Then I plot the data for the 'high' column:

dataset[:].plot(y='high', linewidth=1)
plt.grid(which='both')
plt.show()

Each of the three approaches above resulted in "KeyError: 'high'", as if 'high' were not one of the columns.

Now here's the kicker: if you try to plot any of the other columns, it works like a charm. It's hating on the 'high' column for some reason.

Plot of the other columns:

[Image: plot of Open, Low and Close]

I don't believe it's a matplotlib problem specifically, because I get an error just referencing the 'high' column in the code, for example when listing the columns:

dataset.columns('open', 'high', 'low', 'close', 'volume')

I don't know what's wrong here. Help

Upvotes: 0

Views: 120

Answers (1)

Exi

Reputation: 320

dataset.columns gives this output: Index(['datetime', 'open', 'high ', 'low', 'close', 'volume'], dtype='object')

As you can see, the column name 'high ' has a trailing space.

So plotting with that exact name, space included:

dataset[:].plot(y='high ', linewidth=1)
plt.grid(which='both')
plt.show()

works.
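
If you would rather keep using 'high' without the space, one option is to strip whitespace from the column names right after loading. This is only a minimal sketch: the inline CSV text is a made-up one-row sample that imitates the header from the question, not the real pastebin data.

```python
import io
import pandas as pd

# Hypothetical one-row sample; note the space after "high" in the header.
csv_text = (
    "date,hour,open,high ,low,close,volume\n"
    "2020-01-02,9,1.0,2.0,0.5,1.5,100\n"
)
dataset = pd.read_csv(io.StringIO(csv_text))

# repr() puts quotes around each name, so stray whitespace becomes visible.
print([repr(c) for c in dataset.columns])

# Strip leading/trailing whitespace from every column name.
dataset.columns = dataset.columns.str.strip()

# The column can now be referenced as plain 'high'.
print(dataset['high'])
```

After that, dataset.plot(y='high', linewidth=1) should find the column, assuming the trailing space was the only problem.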

Upvotes: 1
