Reputation: 73
I have an initial code like this:
record = "Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001
..."
I need to make it into a dictionary and its output should be something like this:
{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'}
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'}
...
Each line should have an incrementing key starting from 1.
I currently have no idea how to approach this issue, as I am still a student with no prior experience.
I have seen people use this for strings containing key-value pairs but my string does not contain those:
mydict = dict((k.strip(), v.strip()) for k,v in
(item.split('-') for item in record.split(',')))
Upvotes: 7
Views: 327
Reputation: 34046
Use split:
In [220]: ans = []
In [221]: record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
In [223]: l = record.split(';')
In [227]: for i in l:
     ...:     l1 = i.split(',')
     ...:     d = {'First Name': l1[0], 'Last Name': l1[1], 'Birthday': l1[2]}
     ...:     ans.append(d)
     ...:
In [228]: ans
Out[228]:
[{'First Name': 'Jane', 'Last Name': 'Doe', 'Birthday': '25/02/2002'},
{'First Name': 'James', 'Last Name': 'Poe', 'Birthday': '19/03/1998'},
{'First Name': 'Max', 'Last Name': 'Soe', 'Birthday': '16/12/2001'}]
Upvotes: 6
Reputation: 59
With some regex:
import re
[re.match(r'(?P<First_name>\w+),(?P<Last_name>\w+),(?P<Birthday>.+)', r).groupdict() for r in record.split(';')]
Unfortunately, the underscores in First_name and Last_name are unavoidable, because named groups must be valid Python identifiers and so cannot contain spaces.
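If the space-separated keys from the question are needed, one possible workaround is to rename the keys after matching. This is only a sketch: it assumes the same record string used in the other answers, and that replacing each underscore with a space is the renaming you want.

```python
import re

record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"

pattern = re.compile(r"(?P<First_name>\w+),(?P<Last_name>\w+),(?P<Birthday>.+)")

rows = []
for chunk in record.split(";"):
    m = pattern.match(chunk.strip())
    if m is None:
        continue  # skip chunks that do not look like a record
    # Assumption: turning "_" into " " gives the desired key names.
    rows.append({k.replace("_", " "): v for k, v in m.groupdict().items()})

print(rows)
```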
Upvotes: 0
Reputation: 2117
List comprehensions make it easy.
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
list_of_records = [item.split(',') for item in record.split(';')]
dict_of_records = [{'first_name':line[0], 'last_name':line[1], 'Birthday':line[2]} for line in list_of_records]
print(dict_of_records)
Output:
[{'first_name': 'Jane', 'last_name': 'Doe', 'Birthday': '25/02/2002'}, {'first_name': 'James', 'last_name': 'Poe', 'Birthday': '19/03/1998'}, {'first_name': 'Max', 'last_name': 'Soe', 'Birthday': '16/12/2001'}]
Upvotes: 1
Reputation: 136
You can do it without writing any loops by using re.sub() together with json:
import re
import json
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
sub_record = re.sub(r'\b;?([a-zA-Z]+),([a-zA-Z]+),(\d\d/\d\d/\d\d\d\d)',r',{"First name": "\1", "Last name": "\2", "Birthday": "\3"}',record)
mydict = json.loads('['+sub_record[1:]+']')
print(mydict)
Output:
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'}, {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'}, {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
Upvotes: 0
Reputation: 2398
To make the required dictionary for a single line, you can use split to chop up the line where there are commas (','), to get the values for the dictionary, and hard-code the keys. E.g.
line = "Jane,Doe,25/02/2002"
values = line.split(",")
d = {"First Name": values[0], "Last Name": values[1], "Birthday": values[2]}
Now, to repeat that for each line in the record, a list of all the lines is needed. Again, you can use split, in this case to chop up the input where there are semicolons (';'). E.g.
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
lines = record.split(";")
Now you can iterate the solution for one line over this lines list, collecting the results into another list.
results = []
for line in lines:
    values = line.split(",")
    results.append({"First Name": values[0], "Last Name": values[1], "Birthday": values[2]})
The incremental key requirement you mention seems strange, because you could just keep them in a list, where the index in the list is effectively the key. But of course, if you really need the indexed-dictionary thing, you can use a dictionary comprehension to do that.
results = {i + 1: results[i] for i in range(len(results))}
Finally, the whole thing might be made more concise (and nicer IMO) by using a combination of list and dictionary comprehensions, as well as a list of your expected keys.
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
keys = ["First Name", "Last Name", "Birthday"]
results = [dict(zip(keys, line.split(","))) for line in record.split(";")]
With the optional indexed-dictionary thingy:
results = {i + 1: results[i] for i in range(len(results))}
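As a possible alternative, the indexed dictionary can be built in one pass with enumerate, which avoids indexing back into the list. This is a minimal sketch assuming the same record string and keys list as above; the name indexed is only illustrative.

```python
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
keys = ["First Name", "Last Name", "Birthday"]

# enumerate(..., start=1) pairs each line with 1, 2, 3, ...
indexed = {
    i: dict(zip(keys, line.split(",")))
    for i, line in enumerate(record.split(";"), start=1)
}

print(indexed[1])
```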
Upvotes: 4
Reputation: 5372
Here is a step by step Pythonic way to achieve that:
>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = '''Jane,Doe,25/02/2002
... James,Poe,19/03/1998
... Max,Soe,16/12/2001'''
>>> records = records.split()
>>> pprint(records)
['Jane,Doe,25/02/2002',
'James,Poe,19/03/1998',
'Max,Soe,16/12/2001']
>>> records = [_.split(',') for _ in records]
>>> pprint(records)
[['Jane', 'Doe', '25/02/2002'],
['James', 'Poe', '19/03/1998'],
['Max', 'Soe', '16/12/2001']]
>>> records = [dict(zip(columns, _)) for _ in records]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
If you have all records in one line, delimited by a ; character, then you can do this:
>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
>>> records = records.split(';')
>>> pprint(records)
['Jane,Doe,25/02/2002',
'James,Poe,19/03/1998',
'Max,Soe,16/12/2001']
>>> records = [_.split(',') for _ in records]
>>> pprint(records)
[['Jane', 'Doe', '25/02/2002'],
['James', 'Poe', '19/03/1998'],
['Max', 'Soe', '16/12/2001']]
>>> records = [dict(zip(columns, _)) for _ in records]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
And finally you can put it all together in one line:
>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
>>> # All tasks in one line now
>>> records = [dict(zip(columns, _)) for _ in [_.split(',') for _ in records.split(';')]]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
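One caveat with dict(zip(columns, ...)): zip stops at the shorter of its inputs, so a record with a missing field produces a dict with fewer keys instead of raising an error. Below is a hedged sketch of a length check. The function name to_dicts and its sep parameter are hypothetical, and raising ValueError is just one way to handle a malformed record.

```python
columns = ['First name', 'Last name', 'Birthday']

def to_dicts(records, sep=';'):
    """Convert 'a,b,c;d,e,f' into a list of dicts, rejecting malformed records."""
    rows = []
    for number, chunk in enumerate(records.split(sep), start=1):
        fields = chunk.strip().split(',')
        if len(fields) != len(columns):
            raise ValueError(
                f"record {number} has {len(fields)} fields, expected {len(columns)}"
            )
        rows.append(dict(zip(columns, fields)))
    return rows

print(to_dicts('Jane,Doe,25/02/2002;James,Poe,19/03/1998'))
```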
Upvotes: 2
Reputation: 43
# raw string data
record = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
# list of lists
list_of_lists = [x.split(',') for x in record.split(';')]
# list of dicts
list_of_dicts = []
for x in list_of_lists:
    # assemble into dict
    d = {'First name': x[0],
         'Last name': x[1],
         'Birthday': x[2]}
    # append to list
    list_of_dicts.append(d)
output:
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
Upvotes: 3
Reputation: 14228
I think you are looking for:
record = """Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001"""
num = 0
out = dict()
for v in record.split(";"):
    v = v.strip().split(",")
    num += 1
    out[num] = {'First name':v[0],'Last name':v[1], 'Birthday':v[2]}
print(out)
prints:
{1: {'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
2: {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
3: {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}}
Upvotes: 3
Reputation: 651
The other answers are already quite clear; I just want to add that you can do it in one line (which is much less readable and not recommended, but arguably fancier). It also accounts for possible stray spaces with strip(), which you can remove if you don't need it. This gives you the list of dicts you need:
record_dict = [{'First name': val[0].strip(), 'Last name': val[1].strip(), 'Birthday': val[2].strip()} for val in (rec.strip().split(',') for rec in record.strip().split(';'))]
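For comparison, the same logic unrolled into a small function may be easier to read. This is a sketch, not a drop-in replacement: parse_records is a hypothetical name, and the sample string with extra spaces is made up to show the effect of strip().

```python
def parse_records(record):
    """Split a 'first,last,birthday;...' string into a list of dicts."""
    keys = ("First name", "Last name", "Birthday")
    result = []
    for rec in record.strip().split(";"):
        # strip() on each field tolerates spaces and newlines around values
        fields = [field.strip() for field in rec.split(",")]
        result.append(dict(zip(keys, fields)))
    return result

# Made-up sample with stray whitespace around some fields
record = """Jane, Doe ,25/02/2002;
James,Poe,19/03/1998"""
print(parse_records(record))
```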
Upvotes: 3
Reputation: 2189
The .split() method is useful. First split the string on ; and then split each of the resulting strings on ,.
record = """Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001"""
out = []
for rec in record.split(';'):
    lst = rec.strip().split(',')
    dict_new = {}
    dict_new['First Name'] = lst[0]
    dict_new['Last Name'] = lst[1]
    dict_new['Birthday'] = lst[2]
    out.append(dict_new)
print(out)
Upvotes: 3
Reputation: 782
This should work for your case:
lines = [line.replace('\n','').replace('.','').strip() for line in record.split(';')]
desired_dict = {}
for i, line in enumerate(lines, start=1):  # keys start at 1, as the question asks
    words = line.split(',')
    desired_dict[i] = {
        'First name': words[0],
        'Last name': words[1],
        'Birthday': words[2]
    }
Upvotes: 3