Reputation: 458
I want to perform a string replacement on my DataFrame: find all instances of "X" in each column and replace them with that column's name.
Example:
Name FFF1 H0L1
- L -
- X L
X - -
- - X
Resulting DataFrame after the replacement:
Name FFF1 H0L1
- FFF1 -
- FFF1 H0L1
Name - -
- - H0L1
It seems pretty straightforward; I am just unsure how to reference the column name. Thoughts?
Upvotes: 0
Views: 201
Reputation: 2407
The 'apply' method iterates over the columns, passing each one as a Series whose 'name' attribute is the column name:
df.apply(lambda col: col.where(~col.str.contains("X"), \
col.str.replace("X",col.name)) )
Even better:
df.apply(lambda col: col.str.replace("X",col.name))
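A minimal sketch of this approach, assuming the example frame from the question is rebuilt by hand (the names `df` and `result` are only illustrative). `regex=False` is passed explicitly so that "X" is treated as a literal substring whatever the default is in the installed pandas version:

```python
import pandas as pd

# Example frame rebuilt from the question (an assumption about how it was created).
df = pd.DataFrame({
    "Name": ["-", "-", "X", "-"],
    "FFF1": ["L", "X", "-", "-"],
    "H0L1": ["-", "L", "-", "X"],
})

# apply() hands each column to the lambda as a Series; col.name is its label.
result = df.apply(lambda col: col.str.replace("X", col.name, regex=False))
print(result)
```

Only "X" is touched here, so the "L" cells stay as they are.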
Edit, answering the additional question: use a regular expression:
#df.apply(lambda col: col.str.replace(r"([^X]|^)(X)([^X]|$)",r"\1"+col.name+r"\3")) # didn't work correctly in all situation, e.g.: "aXbXcXd"
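# regex=True is required on pandas >= 2.0, where str.replace defaults to literal matching
df.apply(lambda col: col.str.replace(r"([^X]|^)(X)(?=[^X]|$)", r"\1" + col.name, regex=True))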
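The pattern from the line above can also be tried out with Python's own `re` module, which may be handy for checking its behaviour on a few sample strings. This is only a sketch: the helper name `replace_lone_x` and the sample data are made up for illustration, and `Series.map` is used instead of `str.replace` so that the replacement can be a callable. A callable sidesteps the case where a column name beginning with a digit would turn `r"\1" + col.name` into a reference to a different group number.

```python
import re
import pandas as pd

# Same pattern as above: an X preceded by a non-X (or the string start)
# and followed by a non-X (or the string end).
LONE_X = re.compile(r"([^X]|^)(X)(?=[^X]|$)")

def replace_lone_x(col: pd.Series) -> pd.Series:
    # Hypothetical helper: keep the character captured before the X
    # and append the column name in place of the X itself.
    name = str(col.name)
    return col.map(lambda cell: LONE_X.sub(lambda m: m.group(1) + name, cell))

# Made-up sample data, including the "aXbXcXd" case mentioned above.
df = pd.DataFrame({"A": ["aXbXcXd", "XX", "X"], "B": ["-", "X-", "-XX-"]})
result = df.apply(replace_lone_x)
print(result)
```

An X that sits next to another X (as in "XX" or "-XX-") is left alone, while every isolated X is replaced.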
""" The details:
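The pattern uses two capture groups (...) followed by a lookahead:
[^X] matches any single character except X (^ inside square brackets negates the set);
^ on its own matches the start of the string;
$ matches the end of the string;
| means 'or';
\1 and \2 refer to the corresponding capture groups (only \1 is used in the replacement);
(?=...) is a lookahead: it checks what follows without consuming it.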
"""
Edit 2: If the cell to be replaced always contains just that one character:
df.apply(lambda col: col.replace(["X","L"],col.name))
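As a quick check, here is a sketch of that call on the question's example, again assuming the frame is rebuilt by hand. `Series.replace` compares whole cell values rather than substrings, so only cells that are exactly "X" or exactly "L" change:

```python
import pandas as pd

# Example frame rebuilt from the question (an assumption about how it was created).
df = pd.DataFrame({
    "Name": ["-", "-", "X", "-"],
    "FFF1": ["L", "X", "-", "-"],
    "H0L1": ["-", "L", "-", "X"],
})

# Whole-cell replacement: every "X" or "L" cell becomes its column's name.
result = df.apply(lambda col: col.replace(["X", "L"], col.name))
print(result)
```

This reproduces the result table shown in the question, where the "L" cells are replaced as well.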
Upvotes: 1
Reputation: 150785
You can use df.where:
df = pd.DataFrame({"A": ['-', 'X'],
'B': ['X', '-']})
# where() keeps cells where the condition is True, so test for "not X"
df.where(df.ne('X'), df.columns)
Output:
A B
0 - B
1 A -
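A sketch of a closely related vectorised route, assuming NumPy is available: `numpy.where` broadcasts the column labels across the rows, so cells equal to "X" take their column's name and all other cells keep their value. The name `replaced` is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": ["-", "X"],
                   "B": ["X", "-"]})

# np.where(condition, value_if_true, value_if_false); the 1-D array of
# column labels is broadcast along the rows, one label per column.
replaced = pd.DataFrame(
    np.where(df.eq("X").to_numpy(), df.columns.to_numpy(), df.to_numpy()),
    index=df.index,
    columns=df.columns,
)
print(replaced)
```

Like `Series.replace`, this matches whole cells only; an "X" embedded in a longer string would not be touched.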
Upvotes: 0