Kieran Cullinan
Kieran Cullinan

Reputation: 79

Splitting a Column using Pandas

I am trying to split the following column using Pandas: (df name is count)

Location                        count
POINT (-118.05425 34.1341)      355
POINT (-118.244512 34.072581)   337
POINT (-118.265586 34.043271)   284
POINT (-118.360102 34.071338)   269
POINT (-118.40816 33.943626)    241

to this desired outcome:

X-Axis        Y-Axis        count
-118.05425    34.1341       355
-118.244512   34.072581     337
-118.265586   34.043271     284
-118.360102   34.071338     269
-118.40816    33.943626     241

I have tried removing the word 'POINT', and both the brackets. But then I am met with an extra white space at the beginning of the column. I tried using:

count.columns = count.columns.str.lstrip()

But it was not removing the white space.

I was hoping to use this code to split the column:

count = pd.DataFrame(count.Location.str.split(' ',1).tolist(), 
                         columns = ['x-axis','y-axis'])

Since the space between both x and y axis could be used as the separator, but the white space.

Upvotes: 1

Views: 126

Answers (2)

Shubham Sharma
Shubham Sharma

Reputation: 71689

You can use .str.extract with regex pattern having capture groups:

df[['x-axis', 'y-axis']] = df.pop('Location').str.extract(r'\((\S+) (\S+)\)')

print(df)
   count       x-axis     y-axis
0    355   -118.05425    34.1341
1    337  -118.244512  34.072581
2    284  -118.265586  34.043271
3    269  -118.360102  34.071338
4    241   -118.40816  33.943626

Upvotes: 6

ABC
ABC

Reputation: 645

a quick solution can be:

(df['Location']
    .str.split(' ', 1)              # like what you did, 
    .str[-1]                        # select only lat,lon
    .str.strip('(')                 # remove open curly bracket
    .str.strip(')')                 # remove close curly bracket
    .str.split(' ', expand=True))   # expand to two columns 

then you may rename column names using .rename or df.columns = colnames

Upvotes: 5

Related Questions