Reputation: 2652
If I have the following pandas DataFrame:
>>> df
   x  y  z
x  1  3  0
y  0  5  0
z  0  3  4
I want to iterate over the pairwise combinations of row labels and column names and perform an operation on each cell. For example, for the pair x and y, replace the 3 with 'xy'. The desired output looks like:
>>> df
    x   y   z
x  xx  xy  xz
y  xy  yy  yz
z  xz  yz  zz
A naive attempt that I tried, which doesn't work:
for i, j in range(0,2):
    df.loc[df.index[i], df.columns[j]] = df.index[i] + df.columns[j]
Upvotes: 2
Views: 4672
Reputation: 153510
How about a simple one-liner, using Pandas DataFrame elements:
df.apply(lambda x: x.index+x.name)
Output:
    x   y   z
x  xx  xy  xz
y  yx  yy  yz
z  zx  zy  zz
Or, with NumPy (assuming import numpy as np and import pandas as pd):
pd.DataFrame(np.add.outer(df.index, df.columns), index=df.index, columns=df.columns)
Output:
    x   y   z
x  xx  xy  xz
y  yx  yy  yz
z  zx  zy  zz
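Both outputs above put the row label first, so the cell at row y, column x is 'yx', whereas the desired output in the question shows 'xy' there. If that symmetric labelling is what is wanted (an assumption, the question does not say so explicitly), a minimal sketch is to sort each (row, column) pair before joining it. The name labels is only illustrative:

```python
import pandas as pd

# Sample frame matching the layout in the question.
df = pd.DataFrame([[1, 3, 0], [0, 5, 0], [0, 3, 4]],
                  index=list("xyz"), columns=list("xyz"))

# Sort each (row, column) pair so that (x, y) and (y, x)
# both produce the same label, "xy".
labels = pd.DataFrame(
    [["".join(sorted((r, c))) for c in df.columns] for r in df.index],
    index=df.index,
    columns=df.columns,
)
print(labels)
```

This builds a new frame rather than modifying df in place, which avoids mixing strings into integer columns.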
Upvotes: 10
Reputation: 14011
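df.set_value() is way faster; for the reason, see "Set value for particular cell in pandas DataFrame". Note that set_value was deprecated in pandas 0.21 and removed in 1.0, where df.at[idx, column] = val is its replacement.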
import pandas as pd
data = [{'x': 1, 'y': 2, 'z': 3}, {'x': 4, 'y': 5, 'z': 6}, {'x': 7, 'y': 8, 'z': 9}]
df = pd.DataFrame.from_dict(data, orient='columns')
df = df.astype(str)
df
#    x  y  z
# 0  1  2  3
# 1  4  5  6
# 2  7  8  9
for idx, row in df.iterrows():
    for column in list(df.columns.values):
        val = str(idx) + str(column)
        df.set_value(idx, column, val)
df
output:
    x   y   z
0  0x  0y  0z
1  1x  1y  1z
2  2x  2y  2z
Note: set_value won't work if column names are not unique (see https://github.com/cm3/lafayettedb_thumbnail_getter/issues/3). You will have to fix the non-unique column names separately.
If you don't care about the column names, you can prefix each one with its column number:
df.columns = [str(idx) + '_' + name for idx, name in enumerate(df.columns)]
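On pandas 1.0 and later, where set_value is no longer available, the same loop can be sketched with the df.at accessor. This assumes unique row and column labels, as in the example data above:

```python
import pandas as pd

data = [{'x': 1, 'y': 2, 'z': 3}, {'x': 4, 'y': 5, 'z': 6}, {'x': 7, 'y': 8, 'z': 9}]
# Cast to str up front so every column can hold string values.
df = pd.DataFrame(data).astype(str)

# df.at sets a single cell addressed by (row label, column label).
for idx in df.index:
    for column in df.columns:
        df.at[idx, column] = str(idx) + str(column)

print(df)
```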
Upvotes: 2
Reputation: 59731
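This should be really fast:
import numpy as np
# grid[0] repeats the column names along each row;
# grid[1] repeats the row labels down each column.
grid = np.meshgrid(df.columns.values.astype(str),
                   df.index.values.astype(str))
# np.char is the public name for np.core.defchararray.
result = np.char.add(*grid)
You can then assign result to either the same dataframe or another one. Note that each entry is the column name followed by the row label; pass grid[1] before grid[0] to put the row label first.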
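As a sketch of the assignment step, assuming the small frame from the question and that the row label should come first (as in the other answers), the array can be wrapped in a new DataFrame that reuses the original index and columns. The names rows, cols and labels are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 3, 0], [0, 5, 0], [0, 3, 4]],
                  index=list("xyz"), columns=list("xyz"))

rows = df.index.values.astype(str)
cols = df.columns.values.astype(str)

# indexing="ij" makes row_grid[i, j] == rows[i] and col_grid[i, j] == cols[j].
row_grid, col_grid = np.meshgrid(rows, cols, indexing="ij")

# Element-wise string concatenation: row label first, then column name.
result = np.char.add(row_grid, col_grid)

# Wrap the array in a new frame with the original labels.
labels = pd.DataFrame(result, index=df.index, columns=df.columns)
print(labels)
```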
Upvotes: 1
Reputation: 5193
# Walk every (row label, column name) pair and print the combined label.
for row in df.index:
    for col in df.columns:
        print(row + col)
df = pd.DataFrame([[row + col for col in df.columns] for row in df.index],
                  index=df.index, columns=df.columns)
This way you can iterate over all the columns and paired rows dynamically without needing to know how many columns there are.
Upvotes: 0
Reputation: 534
Is this what you are looking for?
>>> df
   x  y  z
x  1  3  0
y  0  5  0
z  0  3  4
>>> for i in range(3):
...     for j in range(3):
...         df.loc[df.index[i], df.columns[j]] = df.index[i] + df.columns[j]
...
>>> df
    x   y   z
x  xx  xy  xz
y  yx  yy  yz
z  zx  zy  zz
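The hard-coded range(3) can be avoided by iterating over the labels themselves. A minimal sketch using itertools.product, assuming unique row and column labels; the cast to object is my addition so that strings can replace the integers without a dtype conflict:

```python
from itertools import product

import pandas as pd

df = pd.DataFrame([[1, 3, 0], [0, 5, 0], [0, 3, 4]],
                  index=list("xyz"), columns=list("xyz"))

# Cast to object so the integer columns can hold strings.
df = df.astype(object)

# product yields every (row label, column name) pair, whatever the frame's size.
for row, col in product(df.index, df.columns):
    df.loc[row, col] = row + col

print(df)
```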
Upvotes: 0