Oliver Wilken

Reputation: 2714

python regex: create dictionary from string

I have a string containing several pieces of information that I want to save in a dictionary:

s1 = "10:12:01    R1 3    E44"
s2 = "11:11:01    R100    E400"

pattern = r"\d{2}:\d{2}:\d{2}(\,\d+)?" + \
          " +" + \
          "[0-9A-Za-z _]{2}([0-9A-Za-z _]{1})?([0-9A-Za-z _]{1})?" + \
          " +" + \
          "[0-9A-Za-z _]{2}([0-9A-Za-z _]{1})?([0-9A-Za-z _]{1})?$"

# --> 

d1 = {"time" : "10:12:01",
      "id1" : "R1 3", 
      "id2" : "E44"}

d2 = {"time" : "11:11:01",
      "id1" : "R100", 
      "id2" : "E400"}

Is there a way to do this directly with Python's re module?

Note: I'm aware that there is a similar question here: regex expression string dictionary python. However, its wording does not precisely point to what I expect as an answer.

Upvotes: 0

Views: 1202

Answers (2)

Devesh Kumar Singh

Reputation: 20490

If the fields are cleanly separated by whitespace, why not split the string on that whitespace and build a list of dictionaries from the result?
Because the fields are separated by runs of several spaces, while a value such as "R1 3" contains only a single space, we can use re.split to split on runs of two or more whitespace characters and leave the single spaces alone.

import re

#List of strings
li = [ "10:12:01    R1 3    E44", "11:11:01    R100    E400"]

#List of keys
keys = ['time', 'id1', 'id2']

#Build one dictionary per string by pairing the keys with the values obtained by splitting on runs of 2 or more whitespace characters
result = [dict(zip(keys, re.split(r'\s{2,}', s))) for s in li]

print(result)

The output will be

[
{'time': '10:12:01', 'id1': 'R1 3', 'id2': 'E44'}, 
{'time': '11:11:01', 'id1': 'R100', 'id2': 'E400'}
]

Upvotes: 1

user459872

Reputation: 24582

>>> import re
>>> pattern = r"(?P<time>\d{2}:\d{2}:\d{2}(\,\d+)?) +(?P<id1>[0-9A-Za-z_]{2}([0-9A-Za-z_]{1})?([0-9A-Za-z_]{1})?) +(?P<id2>[0-9A-Za-z_]{2}([0-9A-Za-z_]{1})?([0-9A-Za-z_]{1})?$)"
>>>
>>> s1 = "10:12:01    R123    E44"
>>> print(re.match(pattern, s1).groupdict())
{'time': '10:12:01', 'id1': 'R123', 'id2': 'E44'}
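
Note that the pattern above does not allow a space inside an ID, so it will not match the question's first string, which contains "R1 3". The following is a minimal sketch of one way to handle that case, not a definitive pattern. It assumes that IDs are 2 to 4 characters long, may contain a space only in the interior, and are separated by one or more spaces; the names PATTERN and parse_line are made up for this illustration.

```python
import re

# Assumption: an ID is 2-4 characters, starts and ends with a non-space
# character, and may contain a space in between (as in "R1 3").
PATTERN = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}(?:,\d+)?)"  # HH:MM:SS, optional ",fraction"
    r" +"                                     # one or more separating spaces
    r"(?P<id1>[0-9A-Za-z_][0-9A-Za-z _]{0,2}[0-9A-Za-z_])"
    r" +"
    r"(?P<id2>[0-9A-Za-z_][0-9A-Za-z _]{0,2}[0-9A-Za-z_])$"
)


def parse_line(line):
    """Return a dict with keys time, id1 and id2, or None if there is no match."""
    m = PATTERN.match(line)
    return m.groupdict() if m else None


d1 = parse_line("10:12:01    R1 3    E44")
d2 = parse_line("11:11:01    R100    E400")
print(d1)
print(d2)
```

Because the optional fraction uses a non-capturing group, groupdict() contains only the three named fields. Returning None for lines that do not match avoids the AttributeError that re.match(...).groupdict() raises when there is no match.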

Upvotes: 1
