Reputation: 87
I have a DataFrame:
A B C
0 PQ None None
1 Chieti None None
2 Gainesville None None
3 Wenzhou 325027 None
4 Boston None None
5 D-53127 Bonn None
6 SE-11282 Stockholm None
7 Birmingham None None
8 Miami None None
9 Los Angeles None
I want to combine the text in all columns and create a new column which is column 'D' here .However I do not want to include the column if there are numbers or digits in that column . My final output should look like this:
A B C D
0 PQ None None PQ
1 Chieti None None Chieti
2 Gainesville None None Gainesville
3 Wenzhou 325027 None Wenzhou
4 Boston None None Boston
5 D-53127 Bonn None Bonn
6 SE-11282 Stockholm None Stockholm
7 Birmingham None None Birmingham
8 Miami None None Miami
9 Los Angeles None LosAngeles
Could the knowledgeable folks in Stackoverflow give me an insight into solving this using PANDAS/python
Upvotes: 0
Views: 239
Reputation:
You could do this by applying a function over the rows with apply(...,axis=1)
.
import pandas as pd
def row_func(row):
def num_there(s):
return any(i.isdigit() for i in s)
result = []
for x in row:
if x and not num_there(str(x)):
result.append(str(x))
return ''.join(result)
df['D'] = df.apply(row_func, axis=1)
Upvotes: 1
Reputation: 19947
You can mask the cells containing digits can then concat all columns"
df['D'] = df[~df.fillna('').apply(lambda x: x.str.contains('\d'))]\
.fillna('').apply(lambda x: ''.join(x), axis=1)
df
Out[810]:
A B C D
0 PQ None None PQ
1 Chieti None None Chieti
2 Gainesville None None Gainesville
3 Wenzhou 325027 None Wenzhou
4 Boston None None Boston
5 D-53127 Bonn None Bonn
6 SE-11282 Stockholm None Stockholm
7 Birmingham None None Birmingham
8 Miami None None Miami
9 Los Angeles None LosAngeles
Upvotes: 1