Reputation: 1371
I have a MultiIndexed pandas Series and am trying to plot each outer index level in its own subplot, but it is running very slowly.
To accomplish the subplotting I am using a for loop over the outer level of the MultiIndex, and plotting the Series using the inner index level as the x coordinate.
def plot_series( data ):
    # create 16 subplots, corresponding to the 16 outer index levels
    fig, axs = plt.subplots( 4, 4 )
    for oi in data.index.get_level_values( 'outer_index' ):
        # calculate subplot to use
        row = int( oi/ 4 )
        col = int( oi - row* 4 )
        ax = axs[ row, col ]
        data.xs( oi ).plot( use_index = True, ax = ax )
    plt.show()
Each outer index level has 1000 data points, but the plotting takes several minutes to complete.
Is there a way to speed up the plotting?
Data
num_out = 16
num_in = 1000
data = pd.Series(
    data = np.random.rand( num_out* num_in ),
    index = pd.MultiIndex.from_product( [ np.arange( num_out ), np.arange( num_in ) ], names = [ 'outer_index', 'inner_index' ] )
)
Upvotes: 0
Views: 175
Reputation: 2730
Rather than loop through data.index.get_level_values( 'outer_index' ), you could use data.groupby(level='outer_index') and iterate through the grouped object using:
for name, group in grouped:
    # do stuff
This removes the bottleneck created by repeatedly slicing the Series with data.xs( oi ) inside the loop.
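To see where the time goes in the original loop, it can help to count how many labels it actually iterates over. The sketch below reuses the example data from the question; the variable names labels, unique_labels and n_groups are only for illustration. get_level_values returns one label per row rather than one per outer level, so the original loop slices and plots far more often than the 16 subplots would suggest, while groupby yields each group once.

```python
import numpy as np
import pandas as pd

# Same example data as in the question: 16 outer levels x 1000 inner points.
num_out, num_in = 16, 1000
data = pd.Series(
    np.random.rand(num_out * num_in),
    index=pd.MultiIndex.from_product(
        [np.arange(num_out), np.arange(num_in)],
        names=["outer_index", "inner_index"],
    ),
)

# get_level_values returns a label for every row of the Series,
# so looping over it repeats each outer label num_in times.
labels = data.index.get_level_values("outer_index")
unique_labels = labels.unique()

# groupby yields one (name, group) pair per distinct outer label.
n_groups = sum(1 for _ in data.groupby(level="outer_index"))

print("iterations in the original loop:", len(labels))
print("distinct outer labels:", len(unique_labels))
print("groups yielded by groupby:", n_groups)
```

If you prefer to keep data.xs( oi ), looping over the unique labels instead of the full level values should give the same reduction in iterations.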
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def plot_series(data):
    # group once by the outer level instead of slicing inside the loop
    grouped = data.groupby(level='outer_index')
    # create 16 subplots, corresponding to the 16 outer index levels
    fig, axs = plt.subplots( 4, 4 )
    for name, group in grouped:
        # calculate subplot to use
        row = int( name/ 4 )
        col = int( name - row* 4 )
        ax = axs[ row, col ]
        group.plot( use_index = True, ax = ax )
    plt.show()

num_out = 16
num_in = 1000
data = pd.Series(
    data = np.random.rand( num_out* num_in ),
    index = pd.MultiIndex.from_product( [ np.arange( num_out ), np.arange( num_in ) ], names = [ 'outer_index', 'inner_index' ] )
)
plot_series(data)
Using timeit, you can see that this approach is much faster:
%timeit plot_series(data)
795 ms ± 252 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
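If you are not in IPython, something like the following could be used to take a similar measurement with the standard timeit module. This is only a rough sketch under a few assumptions of mine: plot_series_grouped and time_plot are hypothetical helper names, the Agg backend is selected so no window opens, the figure is returned and closed instead of shown (so on-screen rendering is not part of the measurement), and droplevel is added so the inner index is used as the x coordinate, as data.xs( oi ) did in the question. Numbers will differ from the %timeit result above.

```python
import timeit

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window opens
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_series_grouped(data):
    """Plot each outer index level on its own subplot and return the figure."""
    fig, axs = plt.subplots(4, 4)
    for name, group in data.groupby(level="outer_index"):
        row, col = divmod(name, 4)  # same row/col arithmetic as above
        # assumption: drop the outer level so x is the inner index only
        group.droplevel("outer_index").plot(use_index=True, ax=axs[row, col])
    return fig


def time_plot(data, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    def run():
        fig = plot_series_grouped(data)
        plt.close(fig)  # close so figures do not accumulate between runs
    return min(timeit.repeat(run, number=1, repeat=repeat))


num_out, num_in = 16, 1000
data = pd.Series(
    np.random.rand(num_out * num_in),
    index=pd.MultiIndex.from_product(
        [np.arange(num_out), np.arange(num_in)],
        names=["outer_index", "inner_index"],
    ),
)

best = time_plot(data)
print(f"best of 3 runs: {best:.3f} s")
```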
Upvotes: 2