Reputation: 19726
Say I have a DataFrame that may or may not have a MultiIndex. For example:
import numpy as np
import pandas as pd

iterables = [['foo', 'bar'], ['one', 'two']]
idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(4, 2), index=idx, columns=('metric1', 'metric2'))
Result:
               metric1   metric2
first second
foo   one     0.189589  0.787533
      two     0.176290 -0.080153
bar   one     0.077977 -0.384613
      two     0.658583  0.436177
There are many ways to iterate over each element in this dataframe, but most of them involve two nested for-loops like:
for r in df.index:
    for c in df.columns:
        print(r, c, df.loc[r, c])
which yields
('foo', 'one') metric1 -0.00381142017312
('foo', 'one') metric2 -0.755465118408
('foo', 'two') metric1 0.444271742766
('foo', 'two') metric2 0.18390288873
('bar', 'one') metric1 0.512679930964
('bar', 'one') metric2 -0.134535924251
('bar', 'two') metric1 1.93222192752
('bar', 'two') metric2 0.609813960012
Is there a way to do that in one loop, such that I have access to the row name(s) and column name(s) for each element as I'm iterating? If it's only possible with a regular Index, I'd still be interested.
Upvotes: 0
Views: 34
Reputation: 215117
You can stack the DataFrame into a Series and then loop over it in one go. Each key is a tuple of the row label(s) followed by the column label:
for ind, val in df.stack().items():
    print(ind, val)
('foo', 'one', 'metric1') -0.752747101421
('foo', 'one', 'metric2') 0.318196702146
('foo', 'two', 'metric1') -0.737599211438
('foo', 'two', 'metric2') -1.08364260415
('bar', 'one', 'metric1') 1.87757917778
('bar', 'one', 'metric2') -2.29588862481
('bar', 'two', 'metric1') -0.301414352794
('bar', 'two', 'metric2') 0.610076176389
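If you want the row label(s) and the column label as separate values, as in the nested-loop version, you can split the stacked key. Here is a minimal sketch of that. The helper name iter_cells is hypothetical, not part of pandas, and it assumes the columns are a single-level Index. Deterministic values replace the random ones so the result is easy to check.

```python
import numpy as np
import pandas as pd

iterables = [['foo', 'bar'], ['one', 'two']]
idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
# Deterministic values 0.0 .. 7.0 instead of random data
df = pd.DataFrame(np.arange(8.0).reshape(4, 2), index=idx,
                  columns=['metric1', 'metric2'])


def iter_cells(frame):
    """Yield (row_label, column_label, value) for every cell in a single loop.

    Hypothetical helper; assumes single-level columns.
    """
    for key, val in frame.stack().items():
        # key is a tuple: the row label(s) followed by the column label
        *row, col = key
        # Keep a tuple for a MultiIndex row, a scalar for a regular Index
        row_label = tuple(row) if len(row) > 1 else row[0]
        yield row_label, col, val


cells = list(iter_cells(df))

# The same helper on a regular (non-multi) index
flat_cells = list(iter_cells(df.reset_index(drop=True)))

for r, c, v in cells:
    print(r, c, v)
```

This should work with a regular Index as well, since the stacked key is then just a (row, column) pair. Note that, depending on the pandas version, stack() may drop NaN cells by default, so missing values might be skipped.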
Upvotes: 1