Reputation: 942
I have a series whose index is in month-day format. It is not officially a datetime index. In addition, the series contains one datum for each day of the year:
Data_Value
01-01 156
01-02 139
01-03 133
01-04 106
01-05 128
01-06 189
My goal is to make a line plot with "Data_Value" on the y-axis and the twelve months on the x-axis. However, I want to plot all 365 data points, not aggregate them into 12 monthly values. Something like this:
Anyway, my first attempt was to plot the 365 values without worrying about putting the month names on the x-axis:
s = np.array(s)
plt.figure()
plt.plot(s, '-o')
But the last command raised the error "ValueError: could not convert string to float: '12-31'".
Does anybody know how to convert an "informal" month-day index to a datetime index, omitting the year? Or is there any other way to reach my final goal of plotting the series? Thanks.
Upvotes: 3
Views: 797
Reputation: 942
This is what I did, as jezrael suggested:
Add an arbitrary year to the index so it can be converted to a datetime index (2014 is not a leap year, so it matches the 365 values):
s.index = '2014-' + s.index.astype(str)
s.index = pd.to_datetime(s.index)
days_s = s.index
days_s = np.array(days_s)
Then set and draw the plot:
plt.figure()
plt.plot(days_s, s, '-o')
Then change the x-axis tick labels from the full date format to %m (two-digit month number) using mdates.DateFormatter:
import matplotlib.dates as mdates
monthsFmt = mdates.DateFormatter('%m')
plt.gca().xaxis.set_major_formatter(monthsFmt)
Finally, render the plot:
plt.show()
I do not know if this solution is a bit sloppy but it works.
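For month names rather than month numbers on the axis, the steps above could be condensed roughly as follows. This is only a sketch: the series `s` is a made-up stand-in for the real data, the year 2014 is an arbitrary non-leap year, and the `Agg` backend line is only there so the snippet runs without a display. `mdates.MonthLocator` places one tick at the start of each month and the `'%b'` format prints the abbreviated month name (locale dependent).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Hypothetical stand-in for the series in the question:
# 365 values indexed by 'MM-DD' strings.
idx = pd.date_range("2014-01-01", periods=365).strftime("%m-%d")
values = np.random.default_rng(0).integers(100, 200, size=365)
s = pd.Series(values, index=idx, name="Data_Value")

# Prepend an arbitrary non-leap year so every label parses as a real date.
dates = pd.to_datetime("2014-" + s.index, format="%Y-%m-%d")

fig, ax = plt.subplots()
ax.plot(dates, s.to_numpy(), "-o", markersize=2)

# One tick per month, labelled with the abbreviated month name.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))
ax.set_ylabel("Data_Value")

# Call plt.show() (interactive backend) or fig.savefig(...) to render the figure.
```

Building a separate `dates` index instead of overwriting `s.index` keeps the original month-day labels available on the series.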
Upvotes: 0
Reputation: 862751
You can use:
import numpy as np
import pandas as pd
np.random.seed(100)
rng = pd.date_range('2017-01-01', periods=365).strftime('%m-%d')
df = pd.DataFrame({ 'Data_value': np.random.randint(1000, size=365)}, index=rng)
#print (df)
d = {'01':'Jan', '02':'Feb', '03':'Mar','04': 'Apr', '05':'May','06': 'Jun',
'07':'Jul', '08':'Aug','09': 'Sep','10': 'Oct', '11':'Nov','12': 'Dec'}
# prefix the dict keys with _ so they match only the month part of each label
d = {'_' + k:v for k, v in d.items()}
# add the same _ prefix to the index labels
df.index = '_' + df.index
# split the labels on - into a (month, day) MultiIndex
df.index = df.index.str.split('-', expand=True)
# reshape to months x days and replace NaN (days that do not exist) with 0
df = df['Data_value'].unstack(fill_value=0)
# rename the index values using the dict
df = df.rename(index=d)
print (df)
01 02 03 04 05 06 07 08 09 10 ... 22 23 24 \
Jan 520 792 835 871 855 79 944 906 350 948 ... 316 570 912
Feb 900 415 897 141 757 723 612 4 603 955 ... 2 889 617
Mar 181 283 824 238 369 926 944 303 679 877 ... 618 30 17
Apr 693 846 0 13 185 460 362 131 582 643 ... 811 36 773
May 852 95 626 749 631 76 801 314 102 938 ... 419 407 765
Jun 677 870 122 628 186 295 619 734 819 286 ... 16 524 854
Jul 138 776 473 712 414 908 658 349 887 604 ... 389 435 346
Aug 385 14 883 289 148 168 536 477 442 796 ... 730 250 477
Sep 82 998 401 906 653 593 885 793 194 655 ... 944 754 506
Oct 144 819 182 183 83 502 356 554 957 760 ... 70 309 994
Nov 674 131 870 139 305 797 804 861 451 922 ... 723 119 71
Dec 781 304 466 544 294 296 497 693 93 398 ... 915 716 322
25 26 27 28 29 30 31
Jan 507 649 93 86 386 667 876
Feb 478 403 994 63 0 0 0
Mar 53 68 946 488 347 475 979
Apr 839 38 214 94 738 170 0
May 521 944 496 789 409 438 262
Jun 466 621 67 220 223 788 0
Jul 34 451 862 974 694 77 212
Aug 736 74 437 798 67 668 933
Sep 693 657 705 298 861 172 0
Oct 736 943 944 905 689 821 879
Nov 829 93 498 804 123 554 0
Dec 141 87 65 324 182 640 343
[12 rows x 31 columns]
Finally, use DataFrame.plot:
df.plot()
Upvotes: 0