Reputation: 797
I have a string that looks like this:
POLYGON ((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))
I can easily strip POLYGON out of the string to focus on the numbers, but I'm wondering what would be the easiest/best way to parse this string into a list of dicts.
The first parenthesis (right after POLYGON) indicates that multiple elements can be provided (separated by a comma ,).
Each pair of numbers is supposed to be x and y.
I'd like to parse this string to end up with the following data structure (using python 2.7):
list [ // list of polygons
    list [ // polygon n°1
        dict { // polygon n°1's first point
            'x': 148210.445767647, // first number
            'y': 172418.761192525 // second number
        },
        dict { // polygon n°1's second point
            'x': 148183.930888667,
            'y': 172366.054787545
        },
        ... // rest of polygon n°1's points
    ], // end of polygon n°1
    list [ // polygon n°2
        dict { // polygon n°2's first point
            'x': 148221.97916844,
            'y': 172344.568316375
        },
        ... // rest of polygon n°2's points
    ] // end of polygon n°2
] // end of list of polygons
A polygon can have any number of points.
The two numbers of each point are separated by a space.
Do you guys know a way to do this with a loop or recursively?
PS: I'm kind of a python beginner (only a few months under my belt), so don't hesitate to explain in detail. Thank you!
Upvotes: 0
Views: 181
Reputation: 15390
Let's say you have a string that looks like this:
my_str = 'POLYGON ((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))'
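my_str = my_str.replace('POLYGON ', '')
coords_groups = my_str.split('), (')
polygons = []
for coords in coords_groups:
    # str.replace returns a new string, so the result has to be kept
    coords = coords.replace('(', '').replace(')', '')
    coords_list = coords.split(', ')
    coords_list2 = []
    for item in coords_list:
        item_split = item.split(' ')
        # convert the text to numbers and build one dict per point
        coords_list2.append({'x': float(item_split[0]), 'y': float(item_split[1])})
    polygons.append(coords_list2)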
I think this should help a little.
All you need now is a way to get the text between parentheses; this should help: "Regular expression to return text between parenthesis".
UPDATE: updated the code above thanks to another answer by https://stackoverflow.com/users/2635860/mccakici, but this only works if your string has the structure you described in your question.
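For the "text between parentheses" step, a regex can pull out each inner group directly. This is only a minimal sketch, assuming the string always has the POLYGON ((...), (...)) shape from the question and plain decimal numbers; parse_polygon and sample are names made up for this example.

```python
import re

def parse_polygon(text):
    """Return a list of polygons, each a list of {'x': float, 'y': float}."""
    polygons = []
    # Every innermost "(...)" group is one comma-separated run of "x y" pairs.
    for group in re.findall(r'\(([^()]+)\)', text):
        points = []
        for pair in group.split(','):
            x, y = pair.split()  # split on whitespace, ignoring leading blanks
            points.append({'x': float(x), 'y': float(y)})
        polygons.append(points)
    return polygons

# Shortened, made-up sample with the same shape as the string in the question.
sample = 'POLYGON ((1.5 2.5, 3.0 4.0), (5.0 6.0, 7.0 8.0))'
print(parse_polygon(sample))
```

Because the pattern only matches parentheses with no nested parentheses inside, the outer pair is skipped and each inner group becomes one list of points. It runs unchanged on python 2.7 and 3.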
Upvotes: 1
Reputation: 550
Can you try this?
import ast
POLYGON = '((148210.445767647 172418.761192525, 148183.930888667 172366.054787545, 148183.866770629 172365.316772032, 148184.328078148 172364.737139913, 148220.543522168 172344.042601933, 148221.383518338 172343.971823159), (148221.97916844 172344.568316375, 148244.61381946 172406.651932395, 148244.578100039 172407.422441673, 148244.004662562 172407.938319453, 148211.669446582 172419.255646473, 148210.631989339 172419.018894911, 148210.445767647 172418.761192525))'
new_polygon = '(' + POLYGON.replace(', ', '),(').replace(' ', ',') + ')'
data = ast.literal_eval(new_polygon)
result_list = list()
for items in data:
    sub_list = list()
    for item in items:
        sub_list.append({
            'x': item[0],
            'y': item[1]
        })
    result_list.append(sub_list)
print result_list
Upvotes: 1
Reputation: 44093
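The data structure defining your Polygon object looks very similar to a python tuple declaration. One option, albeit a bit hacky, would be to use python's AST parser.
You would have to strip off the POLYGON part and put a comma between the two numbers of each pair, because "x y" on its own is not a valid python literal. This solution may not work for other, more complex declarations.
import ast
import re
your_str = "POLYGON (...)"
# may be better to use a regex to split off the class part
# if you have different types
body = your_str.replace("POLYGON ", "")
# rewrite every "x y" pair as "(x,y)" so literal_eval can parse it
body = re.sub(r'([\d.]+) ([\d.]+)', r'(\1,\2)', body)
data = ast.literal_eval(body)
# data is now a tuple of polygons, each one a tuple of (x, y) pairs,
# which you can turn into dictionaries
polygons = [[{'x': x, 'y': y} for x, y in ring] for ring in data]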
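If you do have different types, the regex idea from the comment could look roughly like this. It is a sketch under assumptions: the input is always NAME followed by nested parentheses of plain decimal "x y" pairs, and parse_wkt_like is a hypothetical helper name, not part of any library. Parentheses are turned into square brackets first so that a polygon with a single group still comes back as a list containing one list.

```python
import ast
import re

NUMBER = r'-?\d+(?:\.\d+)?'  # plain decimals only, no exponents (assumption)

def parse_wkt_like(text):
    """Split 'NAME (...)' into the name and nested lists of (x, y) tuples."""
    match = re.match(r'\s*([A-Za-z]+)\s*(\(.*\))\s*$', text)
    if match is None:
        raise ValueError('unexpected format: %r' % text)
    name, body = match.groups()
    # Lists keep their nesting level even when they hold a single element.
    body = body.replace('(', '[').replace(')', ']')
    # Wrap every "x y" pair so the whole body is a valid Python literal.
    body = re.sub(r'(%s)\s+(%s)' % (NUMBER, NUMBER), r'(\1,\2)', body)
    return name, ast.literal_eval(body)

name, rings = parse_wkt_like('POLYGON ((1.5 2.5, 3.0 4.0), (5.0 6.0, 7.0 8.0))')
# Same list-of-lists-of-dicts shape the question asks for.
polygons = [[{'x': x, 'y': y} for x, y in ring] for ring in rings]
print(name, polygons)
```

ast.literal_eval only evaluates literals, so this stays safe even if the string comes from an untrusted source, but any input that does not match the assumed shape will raise an error rather than be parsed.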
Upvotes: 2