Reputation: 73

Convert lines of string to a dictionary

I have some initial code like this:

record = """Jane,Doe,25/02/2002;
          James,Poe,19/03/1998;
          Max,Soe,16/12/2001
          ..."""

I need to make it into a dictionary and its output should be something like this:

{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'}
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'}
...

Each line should have an incrementing key starting from 1.

I currently have no idea how to approach this, as I am still a student with no prior experience.

I have seen people use this for strings containing key-value pairs but my string does not contain those:

mydict = dict((k.strip(), v.strip()) for k,v in 
              (item.split('-') for item in record.split(',')))

Upvotes: 7

Answers (11)

Mayank Porwal

Reputation: 34046

Use split:

In [220]: ans = []

In [221]: record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
In [223]: l = record.split(';')

In [227]: for i in l:
     ...:     l1 = i.split(',')
     ...:     d = {'First Name': l1[0], 'Last Name': l1[1], 'Birthday': l1[2]}
     ...:     ans.append(d)
     ...: 

In [228]: ans
Out[228]: 
[{'First Name': 'Jane', 'Last Name': 'Doe', 'Birthday': '25/02/2002'},
 {'First Name': 'James', 'Last Name': 'Poe', 'Birthday': '19/03/1998'},
 {'First Name': 'Max', 'Last Name': 'Soe', 'Birthday': '16/12/2001'}]
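
If the incrementing key from the question is needed, the list built above can be turned into a numbered dictionary afterwards. A minimal sketch, assuming the keys should start at 1; the name `numbered` is only illustrative:

```python
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"

# Build the same list of dicts as the loop above produces.
ans = []
for chunk in record.split(';'):
    first, last, birthday = chunk.split(',')
    ans.append({'First Name': first, 'Last Name': last, 'Birthday': birthday})

# enumerate(..., start=1) pairs each dict with 1, 2, 3, ...
numbered = dict(enumerate(ans, start=1))
print(numbered[1])
```

Keeping the list and deriving the numbered dictionary from it means either form stays available.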

Upvotes: 6

zozo128

Reputation: 59

With some regex:

import re

[re.match(r'(?P<First_name>\w+),(?P<Last_name>\w+),(?P<Birthday>.+)', r).groupdict() for r in record.split(';')]

Unfortunately, the underscores in First_name and Last_name are unavoidable, because regex group names must be valid Python identifiers and so cannot contain spaces.
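
If the exact keys from the question are needed, the group names can be mapped to display keys after matching. A minimal sketch; the `pattern` and `rows` names and the underscore-to-space replacement are assumptions for illustration, not part of the answer above:

```python
import re

record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
pattern = re.compile(r'(?P<First_name>\w+),(?P<Last_name>\w+),(?P<Birthday>.+)')

rows = []
for chunk in record.split(';'):
    match = pattern.match(chunk.strip())
    if match is None:
        continue  # skip chunks that do not look like "first,last,date"
    # Group names cannot contain spaces, so swap the underscores afterwards.
    rows.append({name.replace('_', ' '): value
                 for name, value in match.groupdict().items()})

print(rows[0])
```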

Upvotes: 0

Chandella07

Reputation: 2117

List comprehensions make it easy.

record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"

list_of_records = [item.split(',') for item in record.split(';')]
dict_of_records = [{'first_name':line[0], 'last_name':line[1], 'Birthday':line[2]} for line in list_of_records]

print(dict_of_records)

Output:

[{'first_name': 'Jane', 'last_name': 'Doe', 'Birthday': '25/02/2002'}, {'first_name': 'James', 'last_name': 'Poe', 'Birthday': '19/03/1998'}, {'first_name': 'Max', 'last_name': 'Soe', 'Birthday': '16/12/2001'}]

Upvotes: 1

avasuilia

Reputation: 136

You can do it without writing any loops, using re.sub() together with the json module:

import re
import json
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
sub_record = re.sub(r'\b;?([a-zA-Z]+),([a-zA-Z]+),(\d\d/\d\d/\d\d\d\d)',r',{"First name": "\1", "Last name": "\2", "Birthday": "\3"}',record)
mydict = json.loads('['+sub_record[1:]+']')
print(mydict)

Output:

[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'}, {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'}, {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]

Upvotes: 0

L.Grozinger

Reputation: 2398

To make the required dictionary for a single line, you can use split to chop up the line where there are commas (','), to get the values for the dictionary, and hard-code the keys. E.g.

line   = "Jane,Doe,25/02/2002"
values = line.split(",")
d = {"First Name": values[0], "Last Name": values[1], "Birthday": values[2]}

Now to repeat that for each line in the record, a list of all the lines is needed. Again, you can use split in this case to chop up the input where there are semicolons (';'). E.g.

record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
lines = record.split(";")

Now you can apply the single-line solution to each entry in the lines list, collecting the results into another list.

results = []
for line in lines:
  values = line.split(",")
  results.append({"First Name": values[0], "Last Name": values[1], "Birthday": values[2]})

The incremental key requirement you mention seems strange, because you could just keep them in a list, where the index in the list is effectively the key. But of course, if you really need the indexed-dictionary thing, you can use a dictionary comprehension to do that.

results = {i + 1: results[i] for i in range(len(results))}

Finally, the whole thing might be made more concise (and nicer IMO) by using a combination of list and dictionary comprehensions, as well as a list of your expected keys.

record  = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
keys    = ["First Name", "Last Name", "Birthday"]
results = [dict(zip(keys, line.split(","))) for line in record.split(";")]

With the optional indexed-dictionary thingy:

results = {i + 1: results[i] for i in range(len(results))}
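
One caveat with the zip version: zip stops at the shortest input, so a line with a missing field would quietly produce a dictionary with fewer keys. A hedged sketch of a stricter variant; the `parse` helper and its error message are hypothetical, not part of the answer above:

```python
keys = ["First Name", "Last Name", "Birthday"]

def parse(record):
    """Return {1: {...}, 2: {...}, ...}, rejecting lines with the wrong field count."""
    results = {}
    for number, line in enumerate(record.split(";"), start=1):
        values = [value.strip() for value in line.split(",")]
        if len(values) != len(keys):
            raise ValueError(
                f"line {number}: expected {len(keys)} fields, got {len(values)}"
            )
        results[number] = dict(zip(keys, values))
    return results

print(parse("Jane,Doe,25/02/2002;James,Poe,19/03/1998"))
```

Whether to raise or to skip a malformed line depends on the data; raising is chosen here only so that bad input is noticed early.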

Upvotes: 4

accdias

Reputation: 5372

Here is a step-by-step Pythonic way to achieve that:

>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = '''Jane,Doe,25/02/2002
...           James,Poe,19/03/1998
...           Max,Soe,16/12/2001'''

>>> records = records.split()
>>> pprint(records)
['Jane,Doe,25/02/2002',
 'James,Poe,19/03/1998',
 'Max,Soe,16/12/2001']

>>> records = [_.split(',') for _ in records]
>>> pprint(records)
[['Jane', 'Doe', '25/02/2002'],
 ['James', 'Poe', '19/03/1998'],
 ['Max', 'Soe', '16/12/2001']]

>>> records = [dict(zip(columns, _)) for _ in records]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
 {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
 {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]

If you have all records in one line, delimited by a ; character, then you can do this:

>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'

>>> records = records.split(';')
>>> pprint(records)
['Jane,Doe,25/02/2002',
 'James,Poe,19/03/1998',
 'Max,Soe,16/12/2001']

>>> records = [_.split(',') for _ in records]
>>> pprint(records)
[['Jane', 'Doe', '25/02/2002'],
 ['James', 'Poe', '19/03/1998'],
 ['Max', 'Soe', '16/12/2001']]

>>> records = [dict(zip(columns, _)) for _ in records]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
 {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
 {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]

And finally you can put it all together in one line:

>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'

>>> # All tasks in one line now
>>> records = [dict(zip(columns, _)) for _ in [_.split(',') for _ in records.split(';')]]

>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
 {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
 {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
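
The standard library's csv module can do the comma splitting and the key pairing in one step. A sketch, assuming the records are still separated by `;` and may carry surrounding whitespace; this is an alternative to the zip approach above, not what the answer itself uses:

```python
import csv

columns = ['First name', 'Last name', 'Birthday']
records = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'

# csv.DictReader accepts any iterable of strings, one string per row.
rows = (row.strip() for row in records.split(';'))
reader = csv.DictReader(rows, fieldnames=columns)

# dict(...) keeps the result uniform on older Python versions,
# where DictReader may return OrderedDict rows.
parsed = [dict(row) for row in reader]
print(parsed)
```

The csv module would also cope with quoted fields that contain commas, which plain split(',') does not.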

Upvotes: 2

jackt247

Reputation: 43

# raw string data
record = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
            
# list of lists
list_of_lists = [x.split(',') for x in record.split(';')]

# list of dicts
list_of_dicts = []
for x in list_of_lists:
    # assemble into dict
    d = {'First name': x[0],
         'Last name': x[1],
         'Birthday': x[2]}

    # append to list
    list_of_dicts.append(d)

Contents of list_of_dicts:

[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
 {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
 {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]

Upvotes: 3

SomeDude

Reputation: 14228

I think you are looking for:

record = """Jane,Doe,25/02/2002;
          James,Poe,19/03/1998;
          Max,Soe,16/12/2001"""

num = 0
out = dict()
for v in record.split(";"):
  v = v.strip().split(",")
  num += 1
  out[num] = {'First name':v[0],'Last name':v[1], 'Birthday':v[2]}
print(out)

prints:

{1: {'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
 2: {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'}, 
 3: {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}}
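
If the real input ends with a trailing semicolon or contains blank chunks, the loop above would fail with an IndexError on the empty piece. A minimal sketch that skips such pieces, assuming blank chunks carry no data; the sample string here is made up for illustration:

```python
# Hypothetical input with a trailing semicolon and stray whitespace.
record = """Jane,Doe,25/02/2002;
          James,Poe,19/03/1998;
          """

out = {}
for chunk in record.split(";"):
    chunk = chunk.strip()
    if not chunk:
        continue  # ignore empty pieces left by a trailing ';'
    first, last, birthday = chunk.split(",")
    # len(out) + 1 keeps the keys consecutive even when chunks are skipped.
    out[len(out) + 1] = {'First name': first, 'Last name': last, 'Birthday': birthday}

print(out)
```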

Upvotes: 3

seermer

Reputation: 651

The other answers are already quite clear; I just want to add that you can do it in one line (which is much less readable and not recommended, but arguably fancier). It also accounts for possible surrounding spaces with strip(); you can remove those calls if you don't need them. This gives you the list of dicts you need:

record_dict = [{'First name': val[0].strip(), 'Last name': val[1].strip(), 'Birthday': val[2].strip()} for val in (rec.strip().split(',') for rec in record.strip().split(';'))]

Upvotes: 3

Sid

Reputation: 2189

The .split() method is useful. First split the strings separated by ; and split each of the new strings by ,.

record = """Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001"""
out = []
for rec in record.split(';'):
    lst = rec.strip().split(',')
    dict_new = {}
    dict_new['First Name'] = lst[0]
    dict_new['Last Name'] = lst[1]
    dict_new['Birthday'] = lst[2]
    out.append(dict_new)
print(out)

Upvotes: 3

GabrielP

Reputation: 782

This should work for your case:

lines = [line.replace('\n','').replace('.','').strip() for line in record.split(';')]
desired_dict = {}
for i, line in enumerate(lines, start=1):  # keys start at 1, as the question asks
  words = line.split(',')
  desired_dict[i] = {
      'First name':words[0],
      'Last name':words[1],
      'Birthday':words[2]
  }

Upvotes: 3
