Reputation: 8978
I have a dataframe with two columns: Date and Location (object dtype). Below is the format of the Location column, with sample values:
Date Location
1 07/12/1912 AtlantiCity, New Jersey
2 08/06/1913 Victoria, British Columbia, Canada
3 09/09/1913 Over the North Sea
4 10/17/1913 Near Johannisthal, Germany
5 03/05/1915 Tienen, Belgium
6 09/03/1915 Off Cuxhaven, Germany
7 07/28/1916 Near Jambol, Bulgeria
8 09/24/1916 Billericay, England
9 10/01/1916 Potters Bar, England
10 11/21/1916 Mainz, Germany
My requirement is to split the Location by the "," separator and keep only the last part of it (e.g. New Jersey, Canada, Germany, England) in the Location column. I also have to check whether a value is only a single element (i.e. it contains no ",").
Is there a way to do this with a predefined method, without looping over each row?
Sorry if the question is below the standard; I am new to Python and still learning.
Upvotes: 3
Views: 1729
Reputation: 886948
We could try str.extract, with a regex that captures the text after the last comma:
print(df['Location'].str.extract(r'([^,]+$)'))
#0 New Jersey
#1 Canada
#2 Over the North Sea
#3 Germany
#4 Belgium
#5 Germany
#6 Bulgeria
#7 England
#8 England
#9 Germany
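One thing to watch with this pattern: the capture starts right after the comma, so it can include the space that follows it, and str.extract returns a one-column DataFrame by default. A minimal sketch of a variant that addresses both, assuming a small sample frame rebuilt from the question's data (the name last_part is only for illustration):

```python
import pandas as pd

# Sample rows copied from the question (rebuilt here for illustration)
df = pd.DataFrame({
    "Location": [
        "AtlantiCity, New Jersey",
        "Victoria, British Columbia, Canada",
        "Over the North Sea",
    ]
})

# expand=False returns a Series instead of a one-column DataFrame.
# The \s* sits outside the group, so the space after the comma
# is matched but not captured.
last_part = df["Location"].str.extract(r"\s*([^,]+)$", expand=False)

print(last_part.tolist())
# ['New Jersey', 'Canada', 'Over the North Sea']
```

Values without a comma are captured whole, which matches the behaviour shown above for "Over the North Sea".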
Upvotes: 1
Reputation: 214927
A straightforward way is to apply the split method to each element of the column and pick the last one:
df.Location.apply(lambda x: x.split(",")[-1])
1 New Jersey
2 Canada
3 Over the North Sea
4 Germany
5 Belgium
6 Germany
7 Bulgeria
8 England
9 England
10 Germany
Name: Location, dtype: object
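Since the question asks for a predefined method, the same result can probably be had with the vectorized .str accessor instead of apply with a lambda. This is only a sketch under assumptions: the sample frame is rebuilt from the question's data, and the Region column name is hypothetical (assign back to Location instead if the original values are not needed). The trailing str.strip() removes the space that follows each comma:

```python
import pandas as pd

# Sample rows copied from the question (rebuilt here for illustration)
df = pd.DataFrame({
    "Location": [
        "AtlantiCity, New Jersey",
        "Victoria, British Columbia, Canada",
        "Over the North Sea",
    ]
})

# rsplit with n=1 splits once, starting from the right;
# .str[-1] picks the last piece, and .str.strip() trims the leading space.
# "Region" is a hypothetical column name used only in this sketch.
df["Region"] = df["Location"].str.rsplit(",", n=1).str[-1].str.strip()

print(df["Region"].tolist())
# ['New Jersey', 'Canada', 'Over the North Sea']
```

A value with no comma yields a one-element list from rsplit, so .str[-1] returns the whole string unchanged.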
To check whether each cell has only one element, we can use the str.contains method on the column; False marks the values with no comma, i.e. a single element:
df.Location.str.contains(",")
1 True
2 True
3 False
4 True
5 True
6 True
7 True
8 True
9 True
10 True
Name: Location, dtype: bool
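If a mask that is True for the single-element rows is more convenient, the result can be inverted with ~. A minimal sketch, again assuming a sample frame rebuilt from the question's data; the names has_comma, is_single and single_rows are only for illustration, and regex=False is passed so the comma is treated as a literal string:

```python
import pandas as pd

# Sample rows copied from the question (rebuilt here for illustration)
df = pd.DataFrame({
    "Location": [
        "AtlantiCity, New Jersey",
        "Victoria, British Columbia, Canada",
        "Over the North Sea",
    ]
})

# True where the value contains a comma (more than one element)
has_comma = df["Location"].str.contains(",", regex=False)

# Invert the mask: True where the value is a single element
is_single = ~has_comma

# Boolean indexing keeps only the single-element rows
single_rows = df[is_single]

print(is_single.tolist())
# [False, False, True]
print(single_rows["Location"].tolist())
# ['Over the North Sea']
```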
Upvotes: 3