Reputation: 17
I have a column in a df that I want to split into two columns splitting by comma delimiter. If the value in that column does not have a comma I want to put that into the second column instead of first.
Origin |
---|
New York, USA |
England |
Russia |
London, England |
California, USA |
USA |
I want the result to be:
Location | Country |
---|---|
New York | USA |
NaN | England |
NaN | Russia |
London | England |
California | USA |
NaN | USA |
I used this code
df['Location'], df['Country'] = df['Origin'].str.split(',', 1)
Upvotes: 1
Views: 1448
Reputation: 8780
Here is a way by using str.extract()
and named groups
df['Origin'].str.extract(r'(?P<Location>[A-Za-z ]+(?=,))?(?:, )?(?P<Country>\w+)')
or
df['Origin'].str.extract(r'(?:(.*), )?(\w+)')
Output:
Location Country
0 New York USA
1 NaN England
2 NaN Russia
3 London England
4 California USA
5 NaN USA
Upvotes: 0
Reputation: 522817
We can try using str.extract
here:
df["Location"] = df["Origin"].str.extract(r'(.*),')
df["Country"] = df["Origin"].str.extract(r'(\w+(?: \w+)*)$')
Upvotes: 2