Reputation: 77
I am working with a bitcoin dataset, but the dates are not sorted correctly, and I want to show only the month and year. How can I do this?
data = Bitcoin_Historical['Price']
Date1 = Bitcoin_Historical['Date']
train1 = Bitcoin_Historical[['Date','Price']]
#Setting the Date as Index
train2 = train1.set_index('Date')
train2.sort_index(inplace=True)
cols = ['Price']
train2 = train2[cols].apply(lambda x: pd.to_numeric(x.astype(str)
.str.replace(',',''), errors='coerce'))
print (type(train2))
print (train2.head())
plt.figure(figsize=(15, 5))
plt.plot(train2)
plt.xlabel('Date', fontsize=12)
plt.xlim(0,20)
plt.ylabel('Price', fontsize=12)
plt.title("Closing price distribution of bitcoin", fontsize=15)
plt.gcf().autofmt_xdate()
plt.show()
The result is shown in the picture below:
The dates are not in order and every date is shown. I would like to sort by month and year and show only the month name and year. How can that be done?
Example of Data:
Thank you
Upvotes: 4
Views: 5005
Reputation: 5597
I've made the following edits to your code:
- Parse the Date column as a datetime type
- Clean the Price column and convert it to float
- Remove plt.xlim(0,20), which was causing the output to display 1970
Please try the code below:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
Bitcoin_Historical = pd.read_csv('data.csv')
# Copy the slice so the assignments below do not trigger SettingWithCopyWarning
train1 = Bitcoin_Historical[['Date','Price']].copy()
# Parse the dates; values that cannot be parsed become NaT
train1['Date'] = pd.to_datetime(train1['Date'], errors='coerce')
# Strip thousands separators and spaces, then convert to float
train1['Price'] = train1['Price'].str.replace(',','').str.replace(' ','').astype(float)
train2 = train1.set_index('Date') #Setting the Date as Index
train2.sort_index(inplace=True)
print (type(train2))
print (train2.head())
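# x_compat=True turns off pandas' own date tick handling so the matplotlib locator and formatter below apply
ax = train2.plot(figsize=(15, 5), x_compat=True)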
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%b'))
plt.xlabel('Date', fontsize=12)
plt.ylabel('Price', fontsize=12)
plt.title("Closing price distribution of bitcoin", fontsize=15)
plt.show()
Output
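Since the question asks for the month name followed by the year, a format string such as '%b %Y' may be closer to what is wanted than '%Y-%b'. Below is a minimal sketch of that variation. The prices are synthetic (made up, not taken from the real dataset), and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic daily prices standing in for the real dataset (values are made up).
dates = pd.date_range("2019-01-01", "2019-06-30", freq="D")
prices = pd.Series(range(len(dates)), index=dates, name="Price", dtype=float)

fig, ax = plt.subplots(figsize=(15, 5))
ax.plot(prices.index, prices.values)

# One tick per month, labelled with abbreviated month name and year.
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
month_year = mdates.DateFormatter("%b %Y")
ax.xaxis.set_major_formatter(month_year)
fig.autofmt_xdate()

# Render once so the tick labels are populated, then inspect them.
fig.canvas.draw()
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

In an interactive session, plt.show() would replace the draw and print steps. The month abbreviations follow the active locale, so they may differ from the English names on some systems.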
Upvotes: 2
Reputation: 599
Try casting your "Date" column to datetime and check whether that does the trick:
train1.Date = pd.to_datetime(train1.Date)
train2 = train1.set_index('Date')
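To illustrate why the cast matters, here is a small sketch comparing the two sort orders. The sample rows and the "Mon DD, YYYY" date format are assumptions made for illustration; the real file may use a different format.

```python
import pandas as pd

# Hypothetical sample rows; the date format and values are assumptions.
df = pd.DataFrame({
    "Date": ["Feb 01, 2019", "Jan 15, 2019", "Dec 31, 2018"],
    "Price": ["3,437.2", "3,604.1", "3,709.4"],
})

# With a string index, sort_index orders alphabetically (Dec, Feb, Jan).
as_text = df.set_index("Date").sort_index()

# With a datetime index, sort_index orders chronologically.
df["Date"] = pd.to_datetime(df["Date"], format="%b %d, %Y")
as_dates = df.set_index("Date").sort_index()

print(as_text.index.tolist())
print(as_dates.index.strftime("%b %Y").tolist())
```

The string index puts February before January, while the datetime index runs from December 2018 to February 2019.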
Upvotes: 1