Reputation: 10051
Given a small test data as follows:
id city address
0 1 HK 55 Peng Sui Road Che Sham Man Hong Kong
1 2 HK 77 Kun Chok Fong San Tsuen Noi Fok Kowloon
2 3 HK 48 Nut Chok Lou Tsa Sik Kowloon Hong Kong
3 4 HK Block 69, Hai Ang Court Aberdeen Kowloon
4 5 HK 40 Tsang Tai Kit Street Wak Luet New Territories
If Hong Kong
is not contained in address
, then I want to create a new column detailed_add
by combining df['address'] + ', ' + df['city']
:
id city address detailed_add
0 1 HK 55 Peng Sui Road Che Sham Man Hong Kong 55 Peng Sui Road Che Sham Man Hong Kong
1 2 HK 77 Kun Chok Fong San Tsuen Noi Fok Kowloon 77 Kun Chok Fong San Tsuen Noi Fok Kowloon, HK
2 3 HK 48 Nut Chok Lou Tsa Sik Kowloon Hong Kong 48 Nut Chok Lou Tsa Sik Kowloon Hong Kong
3 4 HK Block 69, Hai Ang Court Aberdeen Kowloon Block 69, Hai Ang Court Aberdeen Kowloon, HK
4 5 HK 40 Tsang Tai Kit Street Wak Luet New Territories 40 Tsang Tai Kit Street Wak Luet New Territori..., HK
To filter rows: df[~df['address'].str.contains('Hong Kong', na = False)]
.
id city address
1 2 HK 77 Kun Chok Fong San Tsuen Noi Fok Kowloon
3 4 HK Block 69, Hai Ang Court Aberdeen Kowloon
4 5 HK 40 Tsang Tai Kit Street Wak Luet New Territories
How could I do that? Thanks.
Upvotes: 0
Views: 34
Reputation: 26676
df["detailed_add"]=np.where(~df.address.str.contains("Hong Kong"), df['address'].str.cat(df['city'], sep=","),df['address'])
df.head(2)
id city address \
0 1 HK 55 Peng Sui Road Che Sham Man Hong Kong
1 2 HK 77 Kun Chok Fong San Tsuen Noi Fok Kowloon
detailed_add
0 55 Peng Sui Road Che Sham Man Hong Kong
1 77 Kun Chok Fong San Tsuen Noi Fok Kowloon,HK
Upvotes: 1
Reputation: 30050
You can use np.where()
df['detailed_add'] = np.where(df['address'].str.contains('Hong Kong'), df['address'], df['address'] + ', ' + df['city'])
print(df)
id city address detailed_add
0 1 HK 55 Peng Sui Road Che Sham Man Hong Kong 55 Peng Sui Road Che Sham Man Hong Kong
1 2 HK 77 Kun Chok Fong San Tsuen Noi Fok Kowloon 77 Kun Chok Fong San Tsuen Noi Fok Kowloon, HK
2 3 HK 48 Nut Chok Lou Tsa Sik Kowloon Hong Kong 48 Nut Chok Lou Tsa Sik Kowloon Hong Kong
3 4 HK Block 69, Hai Ang Court Aberdeen Kowloon Block 69, Hai Ang Court Aberdeen Kowloon, HK
4 5 HK 40 Tsang Tai Kit Street Wak Luet New Territories 40 Tsang Tai Kit Street Wak Luet New Territori...
Upvotes: 2
Reputation: 521914
You may try the following:
df["new_column"] = df["address"] if df["address"].str.contains(r'\bHong Kong\b') else df["address"] + ', HK'
The above logic just pulls over the current address if it already contains Hong Kong in it. Otherwise, it appends , HK
to the end of the current address.
Upvotes: 1