Zettt

Reputation: 899

Split string to dict, where delimiter is part of value

I can't figure out a solution to this problem.

I have this example string:

test4 = "versandkostenfrei=Ja,delivery_time=sofort lieferbar,instantly_deliverable=true,spannung=7,2 Volt"

I'd like to convert this into a dict, where everything before the = sign is the key and everything after it, up to the next comma, is the value. The major problem is that some of the values (after the equals sign) contain a comma themselves. Looking at the entire data set, a pair like spannung=7,2 Volt can also appear in the middle of the string, not only at the end.

Desired output:

{
  "versandkostenfrei": "Ja",
  "delivery_time": "sofort lieferbar",
  "instantly_deliverable": "true",
  "spannung": "7,2 Volt"
}

It doesn't matter whether the boolean value ends up in double quotes or not.

Upvotes: 1

Views: 73

Answers (2)

MoRe

Reputation: 2372

import re
dict(m.group().split("=") for m in re.finditer(r'[a-z_]+=([A-Za-z ]+|([0-9]+(,[0-9]+)?))+', test4))

output:

{'versandkostenfrei': 'Ja',
 'delivery_time': 'sofort lieferbar',
 'instantly_deliverable': 'true',
 'spannung': '7,2 Volt'}
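
The one-liner above is fairly dense, so here is a rough sketch of the same pattern unpacked with named groups. The names PAIR, key and value are only illustrative, and the pattern still assumes that values consist of letters, spaces, and numbers with an optional comma fraction.

```python
import re

test4 = "versandkostenfrei=Ja,delivery_time=sofort lieferbar,instantly_deliverable=true,spannung=7,2 Volt"

# Key: lowercase letters and underscores.
# Value: one or more runs of letters/spaces or numbers like "7" or "7,2".
PAIR = re.compile(
    r"(?P<key>[a-z_]+)=(?P<value>(?:[A-Za-z ]+|[0-9]+(?:,[0-9]+)?)+)"
)

# Each match is one complete key=value pair.
result = {m.group("key"): m.group("value") for m in PAIR.finditer(test4)}
print(result)
```

Note that a value containing any other character (for example "-" or ".") would be cut off at that character, so the character classes may need extending for the full data set.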

Upvotes: 0

gre_gor

Reputation: 6810

Split the string with a regex that matches each key followed by "=" (with an optional leading ","), capturing the key as a group.
Because the key is captured, re.split keeps it in the result, so you get a list of alternating keys and values (the first item is an empty string).
Then you just collect them into key/value tuples to create a dict.

import re
test4 = "versandkostenfrei=Ja,delivery_time=sofort lieferbar,instantly_deliverable=true,spannung=7,2 Volt"
parts = re.split(r",?(\w+)=", test4)  # raw string; \w already matches "_"
output = dict((parts[i], parts[i+1]) for i in range(1, len(parts), 2))
print(output)

This creates:

{
    'versandkostenfrei': 'Ja',
    'delivery_time': 'sofort lieferbar',
    'instantly_deliverable': 'true',
    'spannung': '7,2 Volt'
}
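
To check the case from the question where the comma-containing value sits in the middle or at the start of the string, the same split can be wrapped in a small helper. This is only a sketch: parse_pairs and sample are made-up names, and it assumes that no value contains a word immediately followed by "=".

```python
import re

def parse_pairs(text):
    """Parse 'key=value,key=value' text whose values may contain commas."""
    # The captured key is kept by re.split, giving
    # ['', key1, value1, key2, value2, ...].
    parts = re.split(r",?(\w+)=", text)
    # Skip the leading empty string and pair each key with its value.
    return dict(zip(parts[1::2], parts[2::2]))

# The comma-containing value comes first here instead of last.
sample = "spannung=7,2 Volt,versandkostenfrei=Ja,delivery_time=sofort lieferbar"
print(parse_pairs(sample))
```

The comma inside 7,2 Volt is left alone because the text after it ("2 Volt") is not directly followed by "=", so it never looks like the start of a new key.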

Upvotes: 2
