disukumo

Reputation: 341

Convert tuple to int - python

I have the following table ( df):

shape data
POLYGON ((1280 16068.18, 1294 16059, 1297 16060, 1300 16063, 1303 16065, 1308 16066))
POINT POINT ((37916311947 12769))
POLYGON POLYGON ((1906.23 12983, 1908 12982, 1916 12974, 1917 12972, 1917 12970))

I would like to convert the table to the following format:

Desired output:

converted_data
[(1280, 16068), (1294, 16059), (1297, 16060), (1300, 16063), (1303, 16065), (1308, 16066)]
[(37916311947, 12769)]
[(1906, 12983), (1908, 12982), (1916, 12974), (1917, 12972), (1917, 12970)]

I would like to change the parentheses, add commas, and remove the word POLYGON or POINT. This is what I have tried so far:

res1 = []
for ip, geom in zip(df2['data'], df2['SHAPE']):
    if geom == 'POINT':
        st = str(ip)[8:-2]
    elif geom == 'POLYGON/SURFACE':
        st = str(ip)[10:-2]
    s = st.split(',')
    res1.append(s)

res = []
for i in res1:
    res.append([tuple(map(int, j.split())) for j in i])

data2 = df2.copy()
data2['converted_data']=res

The above script runs, but it saves the output as tuple and not int. How do I optimize my script?

Upvotes: 0

Views: 309

Answers (2)

Epsi95

Reputation: 9047

import pandas as pd

df = pd.DataFrame([['POLYGON', '((1280 16068.18, 1294 16059, 1297 16060, 1300 16063, 1303 16065, 1308 16066))'],
                  ['POINT', 'POINT ((37916311947 12769))'],
                  ['POLYGON', 'POLYGON ((1906.23 12983, 1908 12982, 1916 12974, 1917 12972, 1917 12970))']], columns=['shape', 'data'])

df['data'] = df['data'].str.findall(r'(\d[\d.\s]+\d)').apply(lambda x: [tuple(map(lambda x: int(float(x)), i.split())) for i in x])

df
    shape   data
0   POLYGON [(1280, 16068), (1294, 16059), (1297, 16060), ...
1   POINT   [(37916311947, 12769)]
2   POLYGON [(1906, 12983), (1908, 12982), (1916, 12974), ...
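To see what the regex is doing without pandas, here is a minimal plain-Python sketch of the same idea. The names `PAIR_PATTERN` and `parse_pairs` are made up for illustration, and the sketch assumes coordinates are non-negative and written without exponents, as in the sample data.

```python
import re

# Same pattern as above: a digit, then digits/dots/whitespace, ending on a digit.
# Commas and parentheses are not in the character class, so each match is one "x y" pair.
PAIR_PATTERN = re.compile(r'(\d[\d.\s]+\d)')

def parse_pairs(text):
    """Return the (x, y) pairs found in a WKT-like string as tuples of ints."""
    pairs = []
    for chunk in PAIR_PATTERN.findall(text):
        # int(float(...)) truncates '1906.23' to 1906
        pairs.append(tuple(int(float(number)) for number in chunk.split()))
    return pairs

print(parse_pairs('POLYGON ((1906.23 12983, 1908 12982))'))
# [(1906, 12983), (1908, 12982)]
```

Because the pattern only looks for the number pairs, the leading POLYGON or POINT keyword and the parentheses never need to be sliced off by position.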

Upvotes: 1

Mortz

Reputation: 4879

The first part of your code seems fine. In the second part, you are probably trying to split i instead of j.

x = '1280 16068.18, 1294 16059, 1297 16060, 1300 16063, 1303 16065, 1308 16066'
x_split = [tuple(map(lambda n: int(float(n)), i.strip().split())) for i in x.strip().split(',')]
#[(1280, 16068),
# (1294, 16059),
# (1297, 16060),
# (1300, 16063),
# (1303, 16065),
# (1308, 16066)]
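One detail worth spelling out is why the conversion goes through `float` first: `int()` does not accept a string with a decimal point. The sketch below shows that, plus a variant based on `decimal.Decimal` in case the values are too large for a float to hold exactly. The helper names `to_int` and `to_int_exact` are hypothetical and not part of the code above.

```python
from decimal import Decimal

def to_int(token):
    """Truncate a numeric string such as '16068.18' to an int via float."""
    return int(float(token))

def to_int_exact(token):
    """Same truncation, but via Decimal so very large values keep every digit."""
    return int(Decimal(token))

# int() on a string with a decimal point raises ValueError
try:
    int('16068.18')
    direct_conversion_failed = False
except ValueError:
    direct_conversion_failed = True

print(direct_conversion_failed)   # True
print(to_int('16068.18'))         # 16068
print(to_int_exact('16068.18'))   # 16068
```

For the coordinates in the question either helper gives the same result; the `Decimal` version only matters once values go beyond what a float can represent exactly (roughly 16 significant digits).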

Upvotes: 1
