Reputation: 521
I've converted my year column into a datetime index, but the month and day it shows are inaccurate and unneeded, since my dataset only includes the year. I've used the format parameter to parse the year only, but the index is still displayed in "%Y-%m-%d" format.
Original data:
   index  song           year  artist           genre
0      0  ego-remix      2009  beyonce knowles  Pop
1      1  shes-tell-me   2009  save             Rock
2      2  hello          2009  yta              Pop
3      3  the rock       2009  term             R&B
4      4  black-culture  2009  hughey           Country
I applied a few more cleaning steps to the data above.
Here is the code that builds the index, followed by example rows from the resulting dataframe:
clean_df.index = pd.to_datetime(clean_df['year'], format='%Y')
clean_df = clean_df.drop(['index', 'year'], axis=1)
clean_df.sort_index(inplace=True)
clean_df.head()
year        song     artist   genre
1970-01-01  hey now  caravan  Rock
1970-01-01  show me  abc      Rock
1970-01-01  hey now  xyz      Pop
1970-01-01  tell me  foxy     R&B
1970-01-01  move up  curtis   R&B
Is there another way to set the index so that it holds the year only?
Upvotes: 3
Views: 14134
Reputation: 21
I had a similar issue. Solved it this way:
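df['Year'] = pd.to_datetime(df['Year'], format='%Y')  # parse year-only values
df['Year'] = df['Year'].dt.year  # keep just the integer year
df = df.set_index('Year')  # set_index returns a new DataFrame, so assign it back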
The index should then show only the four-digit year.
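For anyone who wants to try this end to end, here is a minimal sketch of the same idea. The sample rows are made up to mirror the question's columns, and I'm assuming the Year column holds plain integers, so I cast to string before parsing to be safe.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's dataframe.
df = pd.DataFrame({
    "song": ["ego-remix", "shes-tell-me", "hello"],
    "Year": [2009, 2009, 2009],
    "artist": ["beyonce knowles", "save", "yta"],
})

# Parse the year-only values, then keep just the integer year.
# The astype(str) cast is an assumption about the input being integers.
df["Year"] = pd.to_datetime(df["Year"].astype(str), format="%Y").dt.year

# set_index returns a new DataFrame, so the result has to be assigned back.
df = df.set_index("Year").sort_index()

print(df.index.tolist())
```

Note that the index ends up as plain integers rather than datetimes, so date-based features such as resampling are no longer available on it.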
Upvotes: 0
Reputation: 6091
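You were close:
clean_df.index = pd.to_datetime(clean_df['year'], format='%Y').dt.year
It's hard to give the exact format string without your original data ('%Y' assumes the column holds bare years, as in your sample), but you just need to convert the column to datetime and then read its year attribute. Because to_datetime returns a Series here, the year is reached through the .dt accessor.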
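To illustrate why the format argument alone doesn't help: it only controls how the input is parsed, and the resulting timestamps always carry a full date. Below is a rough sketch with made-up year values. The last step, converting to yearly periods with to_period, is an alternative that isn't mentioned anywhere in this thread, so treat it as a suggestion rather than the accepted approach.

```python
import pandas as pd

# Made-up year values standing in for the question's 'year' column.
years = pd.Series([2009, 1970, 1985], name="year")

# format="%Y" only tells pandas how to read the input; each value
# still becomes a full timestamp on January 1 of that year.
parsed = pd.to_datetime(years.astype(str), format="%Y")

# A Series exposes datetime fields through the .dt accessor ...
year_from_series = parsed.dt.year

# ... while a DatetimeIndex exposes .year directly.
year_from_index = pd.DatetimeIndex(parsed).year

# Alternative (not from the answers here): yearly periods keep a
# time-aware type while carrying only the year.
year_periods = parsed.dt.to_period("Y")

print(parsed.iloc[0], year_from_series.iloc[0], year_periods.iloc[0])
```

Either year_from_series or year_periods can be assigned to clean_df.index; the first gives integers, the second a period-typed index.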
Upvotes: 2