Aizzaac

Reputation: 3318

How to fill a dictionary from a list using regex?

I have a list ("output"). I want to extract values from it and put them in a dictionary. So far, I can extract some words using regex. But I do not know how to fill the dictionary.

THIS IS MY ATTEMPT

import re

output = ['labels: imagenet_labels.txt \n', '\n', 'Model: efficientnet-edgetpu-S_quant_edgetpu.tflite \n', '\n', 'Image: img0000.jpg \n', '\n', '----INFERENCE TIME----\n', 'Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.\n', 'time: 6.0ms\n', '-------RESULTS--------\n','results: wall clock\n', 'score: 0.25781\n', '##################################### \n', ' \n', '\n']

mydict = {}

regex1 = re.compile(fr'(\w+:)\s(.*)')
match_regex1 = list(filter(regex1.match, output))
match = [line.rstrip('\n') for line in match_regex1]



THE DICTIONARY MUST LOOK LIKE THIS:

{
'Model': "efficientnet-edgetpu-S_quant_edgetpu.tflite",
'Image': "img0000.jpg",
'time': "6.0",
'results': "wall_clock",
'score': "0.25781"
}

EDIT

I have written this loop, although it does not work properly:

for i in output:
    reg1 = re.search(r'(\w+:)\s(.*)', i)
    if "Model" in i:
        mydict.setdefault("Model", {reg1.group()})
        print(mydict)

Upvotes: 0

Views: 102

Answers (4)

blhsing

Reputation: 106523

Since the delimiter between key and value is always ': ', you can use the str.split method instead of a regex, which is simpler and more efficient:

dict(s.split(': ', 1) for s in map(str.rstrip, output) if ': ' in s)

Demo: https://repl.it/@blhsing/SnoopyBoringComputationalscience
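
For readers who want to see what the one-liner does, here is a minimal sketch that unpacks it into three steps. The intermediate names stripped and fields are introduced only for illustration and are not part of the answer above.

```python
output = [
    'labels: imagenet_labels.txt \n', '\n',
    'Model: efficientnet-edgetpu-S_quant_edgetpu.tflite \n', '\n',
    'Image: img0000.jpg \n', '\n',
    '----INFERENCE TIME----\n',
    'Note: The first inference on Edge TPU is slow because it includes '
    'loading the model into Edge TPU memory.\n',
    'time: 6.0ms\n',
    '-------RESULTS--------\n', 'results: wall clock\n', 'score: 0.25781\n',
    '##################################### \n', ' \n', '\n',
]

# Step 1: strip trailing whitespace (spaces and newlines) from every line.
stripped = [s.rstrip() for s in output]

# Step 2: keep only the lines that contain the ': ' delimiter.
fields = [s for s in stripped if ': ' in s]

# Step 3: split each line once, so any later colon stays inside the value.
mydict = dict(s.split(': ', 1) for s in fields)

print(mydict['Model'])  # efficientnet-edgetpu-S_quant_edgetpu.tflite
print(mydict['time'])   # 6.0ms
```

Note that this keeps every 'key: value' line, including 'labels' and 'Note', so the result contains more keys than the dictionary requested in the question.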

Upvotes: 0

MrNobody33

Reputation: 6483

You could try this, based on the list match:

import re
output = ['labels: imagenet_labels.txt \n', '\n', 'Model: efficientnet-edgetpu-S_quant_edgetpu.tflite \n', '\n', 'Image: img0000.jpg \n', '\n', '----INFERENCE TIME----\n', 'Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.\n', 'time: 6.0ms\n', '-------RESULTS--------\n','results: wall clock\n', 'score: 0.25781\n', '##################################### \n', ' \n', '\n']

mydict = {}

regex1 = re.compile(fr'(\w+:)\s(.*)')
match_regex1 = list(filter(regex1.match, output))
match = [line.rstrip('\n') for line in match_regex1]

features_wanted='ModelImagetimeresultsscore'

dct={i.replace(' ','').split(':')[0]:i.replace(' ','').split(':')[1] for i in match if i.replace(' ','').split(':')[0] in features_wanted}
mydict=dct
print(dct)

Output:

{'Model': 'efficientnet-edgetpu-S_quant_edgetpu.tflite', 'Image': 'img0000.jpg', 'time': '6.0ms', 'results': 'wallclock', 'score': '0.25781'}

Explanation of dct: it is a dictionary comprehension that iterates over the list match. Here is one iteration, using 'Model: efficientnet-edgetpu-S_quant_edgetpu.tflite' as the example:

# First, check whether it is a wanted feature:
i='Model: efficientnet-edgetpu-S_quant_edgetpu.tflite'
i.replace(' ','')
>>>'Model:efficientnet-edgetpu-S_quant_edgetpu.tflite'
i.replace(' ','').split(':')
>>>['Model','efficientnet-edgetpu-S_quant_edgetpu.tflite']
i.replace(' ','').split(':')[0] in features_wanted  #'Model' in 'ModelImagetimeresultsscore'
>>>True
# If it is in features_wanted, an item like this is added to the dictionary:
i.replace(' ','').split(':')[0]:i.replace(' ','').split(':')[1]
>>>'Model':'efficientnet-edgetpu-S_quant_edgetpu.tflite'
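
One caveat worth illustrating: because features_wanted is a string, the in test is a substring check, and replace(' ','') also removes spaces inside values (hence 'wallclock' in the output above). The following is a minimal sketch of a variant that uses a set of keys and splits only once. The abbreviated match list and the set are assumptions made for the example, not part of the original answer.

```python
# Abbreviated version of the list "match" from the answer above.
match = [
    'labels: imagenet_labels.txt ',
    'Model: efficientnet-edgetpu-S_quant_edgetpu.tflite ',
    'results: wall clock',
]

# Substring pitfall: any fragment of the concatenated string passes the test.
print('el' in 'ModelImagetimeresultsscore')  # True, although 'el' is not a key

# Assumption: a set of exact key names replaces the concatenated string.
features_wanted = {'Model', 'Image', 'time', 'results', 'score'}

dct = {}
for line in match:
    key, value = line.split(':', 1)   # split once, at the first colon only
    if key in features_wanted:        # exact membership test
        dct[key] = value.strip()      # trims the ends, keeps inner spaces

print(dct)
```

With the set, 'labels' is skipped because it is not an exact member, and 'wall clock' keeps its inner space.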

Upvotes: 1

Andrej Kesely

Reputation: 195428

import re
from pprint import pprint

output = ['labels: imagenet_labels.txt \n', '\n', 'Model: efficientnet-edgetpu-S_quant_edgetpu.tflite \n', '\n', 'Image: img0000.jpg \n', '\n', '----INFERENCE TIME----\n', 'Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.\n', 'time: 6.0ms\n', '-------RESULTS--------\n','results: wall clock\n', 'score: 0.25781\n', '##################################### \n', ' \n', '\n']

# (\w+) captures the key, ([^\n]+?) captures the value up to the end of the line;
# re.M makes $ match at the end of every line of the joined text.
d = dict(re.findall(r'(\w+):\s*([^\n]+?)\s*$', ' '.join(output), flags=re.M))

pprint(d)

Prints:

{'Image': 'img0000.jpg',
 'Model': 'efficientnet-edgetpu-S_quant_edgetpu.tflite',
 'Note': 'The first inference on Edge TPU is slow because it includes loading '
         'the model into Edge TPU memory.',
 'labels': 'imagenet_labels.txt',
 'results': 'wall clock',
 'score': '0.25781',
 'time': '6.0ms'}
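
The dictionary above contains more keys than the question asks for, and its values are not yet in the requested form. Below is a minimal sketch of a post-processing step. It assumes that the target dictionary in the question implies three rules: keep only the five listed keys, drop the 'ms' suffix from time, and replace the space in results with an underscore. The names wanted and result are hypothetical.

```python
import re

output = [
    'labels: imagenet_labels.txt \n', '\n',
    'Model: efficientnet-edgetpu-S_quant_edgetpu.tflite \n', '\n',
    'Image: img0000.jpg \n', '\n',
    '----INFERENCE TIME----\n',
    'Note: The first inference on Edge TPU is slow because it includes '
    'loading the model into Edge TPU memory.\n',
    'time: 6.0ms\n',
    '-------RESULTS--------\n', 'results: wall clock\n', 'score: 0.25781\n',
    '##################################### \n', ' \n', '\n',
]

# Same extraction as in the answer above.
d = dict(re.findall(r'(\w+):\s*([^\n]+?)\s*$', ' '.join(output), flags=re.M))

# Assumption: only these keys are wanted, in this order.
wanted = ['Model', 'Image', 'time', 'results', 'score']
result = {key: d[key] for key in wanted if key in d}

# Assumption: the unit suffix and the inner space should be normalised
# to match the dictionary shown in the question.
if 'time' in result:
    result['time'] = re.sub(r'ms$', '', result['time'])
if 'results' in result:
    result['results'] = result['results'].replace(' ', '_')

print(result)
```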

Upvotes: 1

Milad Yousefi

Reputation: 309

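To fill the dictionary, you can loop over the list match from the question:

for item in match:
    # Split only at the first colon so a colon inside the value does not break unpacking
    key, value = item.split(":", 1)
    mydict[key] = value

The result looks like this: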

{'labels': ' imagenet_labels.txt ', 'Model': ' efficientnet-edgetpu-S_quant_edgetpu.tflite ', 'Image': ' img0000.jpg ', 'Note': ' The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.', 'time': ' 6.0ms', 'results': ' wall clock', 'score': ' 0.25781'}
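
The values above still carry the surrounding spaces from the original lines. As a minimal sketch, assuming those spaces are unwanted, str.partition together with str.strip removes them. The three-item match list is an abbreviated stand-in for the full list.

```python
# Abbreviated version of the list "match" built in the question.
match = [
    'Model: efficientnet-edgetpu-S_quant_edgetpu.tflite ',
    'time: 6.0ms',
    'results: wall clock',
]

mydict = {}
for item in match:
    # partition splits at the first ':' and always returns three parts.
    key, _, value = item.partition(':')
    mydict[key.strip()] = value.strip()

print(mydict)
```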

Upvotes: 1
