Reputation: 103
I'm reading a sql query and using it as dataframe columns.
query = "SELECT count(*) as numRecords, YEARWEEK(date) as weekNum FROM events GROUP BY YEARWEEK(date)"
df = pd.read_sql(query, connection)
date = df['weekNum']
records = df['numRecords']
The date column, which are int64 values, look like this:
...
201850
201851
201852
201901
201902
...
How can I convert the dataframe to a real date value (instead of int64), so when I plot this, the axis do not break because of the year change?
I'm using matplotlib.
Upvotes: 0
Views: 261
Reputation: 2510
All you need to do is use:
pd.to_datetime(date,format='%Y&%W')
It gave an error that Day should be mentioned to convert it into datetime. So to tackle that we attach a '-1' to the end (which means Monday... you can add any specific value from 0 to 6 where each represents a day).
Then grab the 'day of the week' using an additional %w in the format and it will work:
pd.to_datetime(date.apply(lambda x: str(x)+'-0'), format="%Y%W-%w")
Remember that to perform any of the above operations, the value in date dataframe or series should be a string object. If not, you can easily convert them using d.astype(str) and then perform all these operations.
Upvotes: 2