Reputation: 15
How do I extract the text after a keyword?
My text file looks like this. I am trying to extract the car name, car model, and color, and add them to a Python dict.
# car name
BMW
suzuki
# car model
X1
TT
# color
red
blue
My code:
keywords = ['car_name', 'car_model', 'color']
parsed_content = {}

def car_info(text):
    content = {}
    indices = []
    keys = []
    for key in keywords:
        try:
            content[key] = text[text.index(key) + len(key):]
            indices.append(text.index(key))
            keys.append(key)
        except:
            pass
    zipped_lists = zip(indices, keys)
    sorted_pairs = sorted(zipped_lists)
    # sorted_pairs
    tuples = zip(*sorted_pairs)
    indices, keys = [list(tuple) for tuple in tuples]
    # return keys
    print(keys)
    content = []
    for idx in range(len(indices)):
        if idx != len(indices) - 1:
            content.append(text[indices[idx]: indices[idx + 1]])
        else:
            content.append(text[indices[idx]:])
    for i in range(len(indices)):
        parsed_content[keys[i]] = content[i]
    return parsed_content
My output is:
parsed_content = {car_name : car_name BMW SUZUKI,
car_model : car_model x1 tt,
color : color red blue
}
Expected output:
{'car_name': ['bmw', 'suzuki'],
'car_model': ['x1', 'TT'],
'color': ['red', 'blue']
}
Upvotes: 1
Views: 132
Reputation: 585
Using list.index:
txt = '''# car name
BMW
suzuki
# car model
X1
TT
# color
red
blue'''
txt_split = [i.strip() for i in txt.split('\n') if i.strip()]
header_list = ['# car name', '# car model', '# color']
headers = [txt_split.index(i) for i in txt_split if i in header_list]
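for i in range(len(headers)):
    # Header text without the leading '#', with spaces turned into underscores
    key = txt_split[headers[i]].partition('#')[2].strip().replace(' ', '_')
    if i == len(headers) - 1:
        # Last section: everything up to the end of the file
        values = txt_split[headers[i] + 1:]
    else:
        # Otherwise: everything up to the next header
        values = txt_split[headers[i] + 1:headers[i + 1]]
    print({key: values})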
>>>> {'car_name': ['BMW', 'suzuki']}
>>>> {'car_model': ['X1', 'TT']}
>>>> {'color': ['red', 'blue']}
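The loop above prints three separate one-entry dicts. A minimal sketch of collecting them into a single dict, as the question's expected output asks, assuming the same `txt` layout; the function name `parse_sections` is hypothetical and not part of the answer:

```python
def parse_sections(txt, header_list):
    """Slice the lines between consecutive headers into one dict."""
    lines = [s.strip() for s in txt.split('\n') if s.strip()]
    # Positions of the header lines, in file order.
    headers = [i for i, s in enumerate(lines) if s in header_list]
    result = {}
    # Pair each header position with the next one (None for the last section).
    for start, end in zip(headers, headers[1:] + [None]):
        key = lines[start].partition('#')[2].strip().replace(' ', '_')
        result[key] = lines[start + 1:end]
    return result


txt = '''# car name
BMW
suzuki
# car model
X1
TT
# color
red
blue'''

parsed = parse_sections(txt, ['# car name', '# car model', '# color'])
print(parsed)
```

Using `enumerate` for the header positions instead of `list.index` also avoids getting the first match back twice if a header line ever repeats.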
Upvotes: 0
Reputation: 189936
Your attempt seems extraordinarily overcomplicated. Aren't you simply looking for something like this?
from collections import defaultdict

def car_info(filename):
    with open(filename) as lines:
        values = defaultdict(list)
        for line in lines:
            line = line.strip()
            if not line:
                # Skip blank lines
                continue
            elif line.startswith("# "):
                # A header line starts a new section
                keyword = line[2:]
            else:
                # Any other line belongs to the current section
                values[keyword].append(line)
    return values
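The function above keeps the header text as-is, so the keys come out as `car name` rather than `car_name`. A minimal sketch of a variant that matches the question's expected output; it takes any iterable of lines (an open file or a `splitlines()` result), and the name `car_info_from_lines` is hypothetical. Skipping lines that appear before the first header is an assumption about what the asker would want:

```python
from collections import defaultdict


def car_info_from_lines(lines):
    values = defaultdict(list)
    keyword = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("# "):
            # Normalise "car name" to "car_name" to match the expected keys.
            keyword = line[2:].strip().replace(" ", "_")
        elif keyword is not None:
            # Assumption: data before the first header is ignored.
            values[keyword].append(line)
    # Return a plain dict so it prints like the expected output.
    return dict(values)


sample = """# car name
BMW
suzuki

# car model
X1
TT
# color
red
blue
"""

result = car_info_from_lines(sample.splitlines())
print(result)
```

With a file on disk, the same function could be called as `car_info_from_lines(open(path))` inside a `with` block.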
Of course, the proper solution to your problem is to use a text format which Python already knows how to read. The data in YAML format could look like
---
items:
  car name:
    - BMW
    - Suzuki
  car model:
    - X1
    - TT
  color:
    - red
    - blue
though if the items are related by index, a representation which would make more sense would be
---
cars:
  - name: BMW
    model: X1
    color: red
  - name: Suzuki
    model: TT
    color: blue
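If the three lists really are related by index, the per-car layout can be derived from a parsed dict of lists by zipping the columns together. A sketch under stated assumptions: reading YAML in Python needs a third-party package such as PyYAML, so this example swaps in the standard-library `json` module to show the same round trip; the name `to_records` and the short column keys are hypothetical.

```python
import json


def to_records(columns):
    """Turn {'name': [...], 'model': [...]} into one dict per car."""
    keys = list(columns)
    # zip(*...) walks the lists in parallel; it stops at the shortest list.
    return [dict(zip(keys, row)) for row in zip(*columns.values())]


columns = {
    "name": ["BMW", "Suzuki"],
    "model": ["X1", "TT"],
    "color": ["red", "blue"],
}

cars = to_records(columns)
text = json.dumps({"cars": cars}, indent=2)

# Reading the data back needs no hand-written parser.
assert json.loads(text) == {"cars": cars}
print(text)
```

Each element of `cars` is then a self-contained record, so one car's fields cannot drift out of step with another's.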
Upvotes: 2