Reputation: 1168
I have 12 DataFrames of the same shape, one for each of 12 years of data collection. I need to use them as a panel to plot the various column values along the time axis (years). Hence, I think I should combine these frames into a panel.
Some sample data:
# for 2015
Grave Crimes     Cases Recorded   Mistake of Law fact
Abduction        725              3
Kidnapping       246              6
Arson            466              1
Mischief         436              1
House Breaking   12707            21
Grievous Hurt    1299             3
# for 2016
Grave Crimes     Cases Recorded   Mistake of Law fact
Abduction        738              4
Kidnapping       297              9
Arson            486              4
Mischief         394              1
House Breaking   10287            14
Grievous Hurt    1205             0
# for 2017
Grave Crimes     Cases Recorded   Mistake of Law fact
Abduction        647              2
Kidnapping       251              10
Arson            418              3
Mischief         424              0
House Breaking   8913             12
Grievous Hurt    1075             1
Upvotes: 2
Views: 573
Reputation: 10860
Assuming your DataFrames are named like df15, df16, df17, you could create a panel with them like:
pnl = pd.Panel({2015: df15, 2016: df16, 2017: df17})
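Note that `pd.Panel` was deprecated in pandas 0.20 and removed in 0.25, so the line above only runs on older versions. As a rough sketch of one possible substitute (not the method of this answer), a dict of frames can be passed to `pd.concat`, which stacks them under a year-keyed MultiIndex. The frames below hold two rows per year copied from the question; the names `frames`, `stacked`, and `cases_2016` are illustrative assumptions.

```python
import pandas as pd

# Two rows per year, copied from the sample data in the question.
df15 = pd.DataFrame({
    "Grave Crimes": ["Abduction", "Kidnapping"],
    "Cases Recorded": [725, 246],
    "Mistake of Law fact": [3, 6],
})
df16 = pd.DataFrame({
    "Grave Crimes": ["Abduction", "Kidnapping"],
    "Cases Recorded": [738, 297],
    "Mistake of Law fact": [4, 9],
})
df17 = pd.DataFrame({
    "Grave Crimes": ["Abduction", "Kidnapping"],
    "Cases Recorded": [647, 251],
    "Mistake of Law fact": [2, 10],
})

# Hypothetical container: one entry per year (12 entries in the real case).
frames = {2015: df15, 2016: df16, 2017: df17}

# Index each frame by crime, then stack them; the dict keys become the
# outer 'year' level of the resulting MultiIndex.
stacked = pd.concat(
    {year: frame.set_index("Grave Crimes") for year, frame in frames.items()},
    names=["year", "Grave Crimes"],
)

# Selecting one year gives a Series indexed by crime, similar in spirit
# to pnl[2016]['Cases Recorded'].
cases_2016 = stacked.xs(2016, level="year")["Cases Recorded"]
```

Because the years live in an ordinary index level, the same pattern should scale to all 12 frames by adding entries to the dict.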
After that, you could do the 3D plot you mentioned in your question the following way:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for i in range(2015, 2018):
    ax.bar(pnl.major_axis.values, pnl[i]['Cases Recorded'], zdir='y', zs=i)
ax.yaxis.set_ticks(range(2015, 2018))
ax.yaxis.set_ticklabels(range(2015, 2018))
However, if I may offer a hint about readable data visualization from my own experience, which I think many professionals would share:
Even if a dataset has three or more dimensions, a well-designed 2D plot is often the better choice. A 3D plot may be an eye-catcher, but to inform the target audience and to show specific properties of the data, you'll nearly always go with 2D. With this in mind, the approach of Ami Tavory would be the better way to go, as the data structure is then easier to handle:
df15['year'] = 2015
df16['year'] = 2016
df17['year'] = 2017
df = pd.concat([df15, df16, df17]).set_index(['Grave Crimes', 'year'])
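import numpy as np
f, ax = plt.subplots(1)
for i, y in enumerate(range(2015, 2018)):
    # rows for one year; the index still holds (Grave Crimes, year) pairs
    data = df.groupby('year').get_group(y)['Cases Recorded']
    ax.bar(np.arange(6) + .2*i, data.values, width=.2, label=str(y))
ax.legend()
# centre one tick under each group of three bars and label it with the crime name
ax.set_xticks(np.arange(6) + .2)
ax.set_xticklabels(data.index.get_level_values('Grave Crimes'), rotation=15)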
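The grouped bar chart above amounts to reshaping the data so that each year becomes its own column. A minimal sketch of that reshaping, assuming a frame indexed by ('Grave Crimes', 'year') as built above; the values are copied from the question for three crimes, and `records` and `wide` are illustrative names.

```python
import pandas as pd

# Long-format sample: (crime, year, cases), values copied from the question.
records = [
    ("Abduction", 2015, 725), ("Kidnapping", 2015, 246), ("Arson", 2015, 466),
    ("Abduction", 2016, 738), ("Kidnapping", 2016, 297), ("Arson", 2016, 486),
    ("Abduction", 2017, 647), ("Kidnapping", 2017, 251), ("Arson", 2017, 418),
]
df = pd.DataFrame(records, columns=["Grave Crimes", "year", "Cases Recorded"])
df = df.set_index(["Grave Crimes", "year"])

# Move the 'year' level into the columns:
# one row per crime, one column per year.
wide = df["Cases Recorded"].unstack("year")
```

With matplotlib installed, calling `wide.plot.bar(rot=15)` on the result should draw one group of bars per crime and one bar per year, without the manual bar offsets.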
Upvotes: 1
Reputation: 76297
While panels allow adding dimensions, hierarchical indexing is a more common replacement. E.g., from Python Data Science Handbook:
While Pandas does provide Panel and Panel4D objects that natively handle three-dimensional and four-dimensional data (see Aside: Panel Data), a far more common pattern in practice is to make use of hierarchical indexing (also known as multi-indexing) to incorporate multiple index levels within a single index. In this way, higher-dimensional data can be compactly represented within the familiar one-dimensional Series and two-dimensional DataFrame objects.
In your case
I have 12 dataframes of the same shape for 12 years of data collection. I need to use this as a panel to plot the various column values across the time series axis (years).
Say your DataFrames are in df_2015, df_2016 and df_2017. You can do the following:
df_2015['year'] = 2015
df_2016['year'] = 2016
df_2017['year'] = 2017
df = pd.concat([df_2015, df_2016, df_2017]).set_index(['Grave Crimes', 'year'])
Now, to get the data across all years for 'Abduction', for example, use
df[df.index.get_level_values(0) == 'Abduction']
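As a possible alternative to the boolean mask, `DataFrame.xs` selects by level name and drops that level, which leaves a frame indexed by year only and may be more convenient for plotting over time. The sketch below rebuilds a small version of `df` from values in the question; `records`, `masked`, and `abduction` are illustrative names.

```python
import pandas as pd

# (crime, year, cases, mistakes), values copied from the question.
records = [
    ("Abduction", 2015, 725, 3), ("Kidnapping", 2015, 246, 6),
    ("Abduction", 2016, 738, 4), ("Kidnapping", 2016, 297, 9),
    ("Abduction", 2017, 647, 2), ("Kidnapping", 2017, 251, 10),
]
columns = ["Grave Crimes", "year", "Cases Recorded", "Mistake of Law fact"]
df = pd.DataFrame(records, columns=columns).set_index(["Grave Crimes", "year"])

# Boolean mask from the answer: the result keeps both index levels.
masked = df[df.index.get_level_values(0) == "Abduction"]

# Cross-section by level name: 'Grave Crimes' is dropped,
# so the result is indexed by 'year' alone.
abduction = df.xs("Abduction", level="Grave Crimes")
```

Here `abduction['Cases Recorded']` is a Series indexed by year, so it can be handed to a line or bar plot directly.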
Upvotes: 1