hsquared

Reputation: 359

Adding Lat Lon coordinates to separate columns (python/dataframe)

I'm sure this is a simple thing to do but I am new to Python and cannot work it out!

I have a data frame with one column containing coordinates, and I want to remove the brackets and put the Lat/Lon values into separate columns.

Current dataframe:

gridReference
(56.37769816725615, -4.325049868061924) 
(56.37769816725615, -4.325049868061924) 
(51.749167440074324, -4.963575226888083)   

Desired dataframe:

Latitude               Longitude
56.37769816725615     -4.325049868061924
56.37769816725615     -4.325049868061924
51.749167440074324    -4.963575226888083 

Thanks for your help

EDIT: I have tried:

df['lat'], df['lon'] = df.gridReference.str.strip(')').str.strip('(').str.split(', ').values.tolist()

but I get the error:

AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

I then tried adding:

df['gridReference'] = df['gridReference'].astype('str')

and got the error:

ValueError: too many values to unpack (expected 2)

Any help would be appreciated as I am not sure how to make this work! :)

EDIT: I keep getting the error AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

the output for df.info() is:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 22899 entries, 0 to 22898
Data columns (total 1 columns):
LatLon    22899 non-null object
dtypes: object(1)

the output for df.dtypes is:

gridReference    object
dtype: object

Upvotes: 5

Views: 8122

Answers (3)

mouli ravindran

Reputation: 31

To answer the question: the input is a column of tuples, like column b in my code. The required output is two columns, like columns b1 and b2 in my answer.

Create a DataFrame:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a':[1,2], 'b':[(1,2), (3,4)]})

In [3]: df                                                                                                                                                                      
Out[3]: 
   a       b
0  1  (1, 2)
1  2  (3, 4)

Convert the column to a list:

In [4]: df['b'].tolist()                                                                                                                                                        
Out[4]: [(1, 2), (3, 4)]

Create the required dataframe from the list:

In [5]: pd.DataFrame(df['b'].tolist(), index=df.index)                                                                                                                                          
Out[5]: 
   0  1
0  1  2
1  3  4

We can also get the output in the same dataframe using the code below:

In [6]: df[['b1', 'b2']] = pd.DataFrame(df['b'].tolist(), index=df.index)

In [7]: df                                                                                                                                                                      
Out[7]: 
   a       b  b1  b2
0  1  (1, 2)   1   2
1  2  (3, 4)   3   4
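
Applied to the question's data, the same idea might look like the sketch below. It assumes the gridReference cells are real Python tuples rather than strings (the question does not confirm this), and the Latitude/Longitude column names are taken from the desired output.

```python
import pandas as pd

# Sample rows copied from the question; assumed to be real tuples, not strings.
df = pd.DataFrame({
    "gridReference": [
        (56.37769816725615, -4.325049868061924),
        (56.37769816725615, -4.325049868061924),
        (51.749167440074324, -4.963575226888083),
    ]
})

# Expand each tuple into two columns, reusing the original index so rows line up.
coords = pd.DataFrame(
    df["gridReference"].tolist(),
    index=df.index,
    columns=["Latitude", "Longitude"],
)

# Attach the new columns next to the original one.
df = df.join(coords)
```

Because the values are never turned into text, the new columns come out as floats straight away.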

Upvotes: 2

Nickil Maveli

Reputation: 29711

df['gridReference'].str.strip('()')                               \
                   .str.split(', ', expand=True)                   \
                   .rename(columns={0:'Latitude', 1:'Longitude'}) 

             Latitude           Longitude
0   56.37769816725615  -4.325049868061924
1   56.37769816725615  -4.325049868061924
2  51.749167440074324  -4.963575226888083
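
Note that the split leaves both columns as strings. A possible follow-up, sketched here under the assumption that the column holds either strings or tuples, is to cast to str first, convert the pieces to float, and join them back onto the frame. Stripping spaces along with the brackets is my addition, in case the cells carry trailing whitespace.

```python
import pandas as pd

# Sample rows from the question, stored as strings here (an assumption).
df = pd.DataFrame({
    "gridReference": [
        "(56.37769816725615, -4.325049868061924)",
        "(56.37769816725615, -4.325049868061924)",
        "(51.749167440074324, -4.963575226888083)",
    ]
})

coords = (
    df["gridReference"]
    .astype(str)                    # leaves strings alone; turns tuples into "(lat, lon)" text
    .str.strip("() ")               # drop brackets and any stray spaces at the ends
    .str.split(", ", expand=True)   # one column per coordinate
    .astype(float)                  # the split pieces are strings until converted
    .rename(columns={0: "Latitude", 1: "Longitude"})
)

# Attach the numeric columns to the original frame.
df = df.join(coords)
```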

Upvotes: 8

Asish M.

Reputation: 2647

>>> df = pd.DataFrame({'latlong': ['(12, 32)', '(43, 54)']})
>>> df
    latlong
0  (12, 32)
1  (43, 54)

>>> split_data = df.latlong.str.strip(')').str.strip('(').str.split(', ')
>>> df['lat'] = split_data.apply(lambda x: x[0])
>>> df['long'] = split_data.apply(lambda x: x[1])
>>> df
    latlong lat long
0  (12, 32)  12   32
1  (43, 54)  43   54
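
As for the "too many values to unpack (expected 2)" error in the question: .values.tolist() returns one [lat, lon] list per row, so unpacking it into two names only works when the frame has exactly two rows. A rough sketch of one way around that is to transpose the rows with zip(*...). The three-row sample and the int conversion are just for illustration.

```python
import pandas as pd

# Three rows on purpose: with exactly two rows the original unpacking
# would run without an error and hide the problem.
df = pd.DataFrame({"latlong": ["(12, 32)", "(43, 54)", "(65, 76)"]})

# Each element of pairs is a two-item list such as ['12', '32'].
pairs = df["latlong"].str.strip("()").str.split(", ")

# zip(*pairs) turns the per-row lists into one tuple per coordinate.
lat, long = zip(*pairs)

# The pieces are still strings, so convert them before storing.
df["lat"] = [int(v) for v in lat]
df["long"] = [int(v) for v in long]
```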

Upvotes: 0
