clifgray

Reputation: 4419

Parsing Strings from json in Python

I am parsing a JSON document in Python and have nearly the whole process working, except that I am having trouble converting a GPS string into the correct form.

I have the following form:

"gsx$gps":{"$t":"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"}

and that is from this HTML form:

44°21′N 68°13′W / 44.35°N 68.21°W / 44.35; -68.21 (Acadia)

and I want the final product to be a string that looks like this:

(44.35, -68.21)

Here are a few other example JSON strings, just to give you some more to work with:

"gsx$gps":{"$t":"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"}

"gsx$gps":{"$t":"38°41′N 109°34′W\ufeff / \ufeff38.68°N 109.57°W\ufeff / 38.68; -109.57\ufeff (Arches)"}

I have the following:

GPSlocation = entry['gsx$gps']['$t']

and then I don't know how to get GPSlocation into the form that I want above.

Upvotes: 0

Views: 659

Answers (4)

Need4Steed

Reputation: 2180

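import re

# Once the JSON is decoded, \ufeff is a single character, so the tail is matched with .* rather than a literal backslash
re.sub(r'.+/ (-?\d{1,3}\.\d\d); (-?\d{1,3}\.\d\d).*',
       r"(\g<1>, \g<2>)",
       u"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)")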

Upvotes: 0

Michael

Reputation: 7736

Here we go:

import json
jstr = """{"gsx$gps":{"$t":"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"}}"""
a = json.loads(jstr)
tuple(float(x) for x in a['gsx$gps']['$t'].split('/')[-1].split(u'\ufeff')[0].split(';'))

Gives:

(-14.25, -170.68)

Or from the plain string:

GPSlocation = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"
tuple(float(x) for x in GPSlocation.split('/')[-1].split(u'\ufeff')[0].split(';'))
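
The same split chain can be wrapped in a small function and run over several entries. This is a rough sketch: parse_gps and the list-shaped feed are hypothetical (the real feed layout is not shown in the question), and it assumes the place name never contains a "/".

```python
import json

def parse_gps(gps):
    """Return (lat, lon) as floats from the trailing 'lat; lon' field."""
    last_field = gps.split('/')[-1]          # everything after the last '/'
    pair = last_field.split(u'\ufeff')[0]    # cut at the first U+FEFF, leaving ' lat; lon'
    lat, lon = (float(part) for part in pair.split(';'))
    return lat, lon

# Hypothetical feed: a JSON list of entries shaped like the question's samples.
# Raw string, so json.loads (not Python) decodes the \ufeff escapes.
raw = r'''[
  {"gsx$gps": {"$t": "44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"}},
  {"gsx$gps": {"$t": "14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"}},
  {"gsx$gps": {"$t": "38°41′N 109°34′W\ufeff / \ufeff38.68°N 109.57°W\ufeff / 38.68; -109.57\ufeff (Arches)"}}
]'''

entries = json.loads(raw)
coords = [parse_gps(entry['gsx$gps']['$t']) for entry in entries]
labels = ['({}, {})'.format(lat, lon) for lat, lon in coords]
print(labels)
```

Note that going through float means a value such as 38.60 would be printed as 38.6. If the exact digits matter, keep the pieces as text instead of converting them.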

Some timeit numbers, showing why to avoid the fancy regexp ;)

import re
import timeit
setup='GPSlocation = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"; import re'
print timeit.timeit("map(float, GPSlocation.split('/')[-1].split(u'\ufeff')[0].split(';'))", setup=setup)
print timeit.timeit("map(float, re.findall(r'(-?\d+(?:\.\d+)?)', GPSlocation)[-2:])", setup=setup)

5.89355301857
22.6919388771
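
One caveat if you rerun this on Python 3: map is lazy there, so the statements above would only time the creation of an iterator. A sketch of the same comparison that should run on Python 3 is below. The function names are made up, the regex is precompiled (which differs from the run above), and the timings will vary by machine, so no numbers are shown.

```python
import re
import timeit

GPS = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"
NUMBER = re.compile(r'-?\d+(?:\.\d+)?')

def by_split():
    # list() forces the lazy map so the float conversions are actually timed
    return list(map(float, GPS.split('/')[-1].split(u'\ufeff')[0].split(';')))

def by_regex():
    # collect every number in the string and keep the last two
    return list(map(float, NUMBER.findall(GPS)[-2:]))

# Both approaches should agree before their timings are compared.
assert by_split() == by_regex() == [-14.25, -170.68]

for func in (by_split, by_regex):
    print(func.__name__, timeit.timeit(func, number=100000))
```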

Upvotes: 1

Blender

Reputation: 298176

You can extract the data with regex:

>>> import re
>>> text = '''"gsx$gps":{"$t":"44°21′N 68°13′W\ufeff / \ufeff44.35°N 68.21°W\ufeff / 44.35; -68.21\ufeff (Acadia)"}'''
>>> map(float, re.findall(r'(-?\d+(?:\.\d+)?)', text)[-2:])
[44.35, -68.21]

Upvotes: 0

Joran Beasley

Reputation: 113978

Not super elegant, but it works. Also, you are not parsing JSON at this point, just parsing a string.

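# GPSlocation is the unicode string from entry['gsx$gps']['$t']
center_part = GPSlocation.split("/")[1].replace(u"\ufeff", "")
N, W = center_part.split()
# keep the digits before the degree sign; the N/S and E/W letters (and so the sign) are dropped
N, W = N.split(u"\xb0")[0], W.split(u"\xb0")[0]
tpl = (N, W)
print(tpl)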

on a side note these are not ints ...
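
The middle field carries the hemisphere as a letter instead of a sign, so to get the signed values from it, something like the following might work. It is only a sketch: parse_center and signed are made-up names, and it assumes every string has the three "/"-separated fields shown in the question, with S and W meaning negative.

```python
def parse_center(gps):
    """Return signed (lat, lon) floats from the middle 'dd.dd°N ddd.dd°W' field."""
    center = gps.split('/')[1].replace(u'\ufeff', '')
    lat_text, lon_text = center.split()

    def signed(text):
        # '14.25°S' splits into the number '14.25' and the hemisphere letter 'S'
        value, hemisphere = text.split(u'\xb0')
        number = float(value)
        return -number if hemisphere in ('S', 'W') else number

    return signed(lat_text), signed(lon_text)

sample = u"14°15′S 170°41′W\ufeff / \ufeff14.25°S 170.68°W\ufeff / -14.25; -170.68\ufeff (American Samoa)"
print(parse_center(sample))
```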

Upvotes: 1
