Reputation: 384
How can I replace string entries in a pandas DataFrame that begin with a letter followed by two digits with NaN?
A        B        C        D
Apple    Pear     N45 82f  John
Cat      P48 hH2  Mary     Sponge
Hat      P67 De1  Bed      S90 GGGF
I would like to replace every entry in the DataFrame that begins with a letter followed by two digits with NaN.
I have tried something along the lines of
    for columns in df.columns[1:]:
        for i in columns:
            if i[0].isalpha() and i[1].isdigit and i.[2].isdigit():
                i.replace(i,None)
Unfortunately this does not seem to work. Any help would be appreciated.
Upvotes: 1
Views: 125
Reputation: 402483
Use stack and str.extract with a pattern that matches only the entries you want to keep. Entries that fail to match (the ones you want removed) come back as NaN.
df.stack().str.extract(r'(^[^a-z]\D{2}.*)').unstack()[0]
       A     B     C       D
0  Apple  Pear   NaN    John
1    Cat   NaN  Mary  Sponge
2    Hat   NaN   Bed     NaN
Upvotes: 1
Reputation: 153460
You can try this:
df.mask(df.apply(lambda r: r.str.contains(r'^[a-zA-Z]\d{2}')))
Output:
       A     B     C       D
0  Apple  Pear   NaN    John
1    Cat   NaN  Mary  Sponge
2    Hat   NaN   Bed     NaN
I like @coldspeed's stack approach too:
df[~df.stack().str.contains(r'^[a-zA-Z]\d{2}').unstack()]
Output:
       A     B     C       D
0  Apple  Pear   NaN    John
1    Cat   NaN  Mary  Sponge
2    Hat   NaN   Bed     NaN
Upvotes: 1