user1356863

Reputation: 143

Convert file to Python dict

Here is the file that I want to convert to a Python dict:

#
# DATABASE 
#
Database name   FooFileName
Database file   FooDBFile
Info file       FooInfoFile
Database ID     3
Total entries   8888

I have tried several things and I can't get it to convert to a dict. I ultimately want to be able to pick off the 'Database file' as a string. Thanks in advance.

Here is what I have tried already and the errors:

    # ValueError: need more than 1 value to unpack
    #d = {}
    #for line in json_dump:
        #for k,v in [line.strip().split('\n')]:
    #    for k,v in [line.strip().split(None, 1)]:
    #        d[k] = v.strip()
    #print d
    #print d['Database file']


    # IndexError: list index out of range
    #d = {}
    #for line in json_dump:
    #    line = line.strip()
    #    parts = [p.strip() for p in line.split('/n')]
    #    d[parts[0]] = (parts[1], parts[2])
    #print d

Upvotes: 0

Views: 130

Answers (3)

rchang

Reputation: 5236

EDITED to reflect a line-wise regular expression approach.

Since it appears your file is not tab-delimited, you could use a regular expression to isolate the columns:

import re

#
# The rest of your code that loads up json_dump
#

d = {}
for line in json_dump:
    if line.startswith('#'):
        continue  # Filter out comment lines
    line = line.strip()

    try:
        # Split the line on 2 or more consecutive whitespace characters
        (key, value) = re.split(r'\s\s+', line)
    except ValueError:
        continue  # Skip malformed lines (including blank ones)

    d[key] = value

print(d)

This yields this dictionary:

{'Database name': 'FooFileName', 'Total entries': '8888', 'Database ID': '3', 'Database file': 'FooDBFile', 'Info file': 'FooInfoFile'}

This should allow you to isolate the individual values, for example d['Database file'].
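
As a minimal sketch of how the same idea could be packaged for reuse, the loop can be moved into a function that accepts any iterable of lines (an open file or a list). The name parse_info and the sample list are illustrative only and not from the original code. The sketch assumes every key is separated from its value by two or more whitespace characters.

```python
import re

def parse_info(lines):
    """Build a dict from 'key   value' lines, skipping comments and blanks.

    Hypothetical helper: assumes two or more whitespace characters
    separate each key from its value.
    """
    result = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comment lines
        # Split once on the first run of 2+ whitespace characters
        parts = re.split(r'\s{2,}', line, maxsplit=1)
        if len(parts) != 2:
            continue  # skip lines without a clear key/value pair
        key, value = parts
        result[key] = value
    return result

# Sample input mirroring the file in the question
sample = [
    "#",
    "# DATABASE ",
    "#",
    "Database name   FooFileName",
    "Database file   FooDBFile",
    "Info file       FooInfoFile",
    "Database ID     3",
    "Total entries   8888",
]
info = parse_info(sample)
db_file = info['Database file']
```

Passing an open file object instead of the list should work the same way, since iterating over a file yields its lines.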

Upvotes: 0

ZdaR

Reputation: 22954

Splitting each line on whitespace returns a list of three values, so we need three variables to hold them. We join the first and second values with a space to form the key, and the third value becomes the value stored under that key. This may be the simplest approach; it assumes every data line has exactly three whitespace-separated tokens, but it should get the job done and it is easy to understand.

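d = {}
for line in json_dump:
    if line.startswith('#'): continue
    parts = line.split()           # e.g. ['Database', 'file', 'FooDBFile']
    if len(parts) != 3: continue   # skip blank or unexpected lines
    u, k, v = parts
    d[u + " " + k] = v
print(d)
print(d['Database file'])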
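
The three-token assumption breaks if a key ever has one word or three. A possible variation, sketched here under the different assumption that the value is always the last whitespace-separated token on the line, is to split once from the right with rsplit. The function name parse_by_last_token is made up for illustration.

```python
def parse_by_last_token(lines):
    """Treat the last token of each line as the value and the rest as the key.

    Hypothetical helper: assumes values never contain whitespace.
    """
    result = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comment lines
        # Split once from the right on any run of whitespace
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue  # a single token has no separate key and value
        key, value = parts
        result[key] = value
    return result

d = parse_by_last_token([
    "# DATABASE ",
    "Database file   FooDBFile",
    "Total entries   8888",
])
```

Here d['Database file'] is the string 'FooDBFile', and the key keeps its internal single space.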

Upvotes: 0

Kasravnd

Reputation: 107287

First, you need to isolate the text after the last #. You can do that with a regular expression; re.search will do it:

>>> import re
>>> s="""#
... # DATABASE 
... #
... Database name   FooFileName
... Database file   FooDBFile
... Info file       FooInfoFile
... Database ID     3
... Total entries   8888"""

>>> re.search(r'#\n([^#]+)',s).group(1)
'Database name   FooFileName\nDatabase file   FooDBFile\nInfo file       FooInfoFile\nDatabase ID     3\nTotal entries   8888'

In this case you can also just use split: split the text on # and take the last element:

>>> s2=s.split('#')[-1]

Then you can use a dictionary comprehension over a list comprehension. Note that re.split is a good choice here, because the pattern r' {2,}' matches two or more spaces, so the single space inside a key such as 'Database name' is left alone:

>>> {k:v for k,v in [re.split(r' {2,}',i) for i in s2.split('\n') if i]}
{'Database name': 'FooFileName', 'Total entries': '8888', 'Database ID': '3', 'Database file': 'FooDBFile', 'Info file': 'FooInfoFile'}
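
Splitting on # relies on no key or value containing that character. As a rough sketch of an alternative for text held in one string, the comment lines can be filtered out line by line before splitting. The function name parse_text is hypothetical, and the sketch still assumes two or more spaces between key and value.

```python
import re

def parse_text(text):
    """Parse a whole string of 'key   value' lines into a dict.

    Hypothetical helper: drops comment lines individually instead of
    splitting the text on '#'.
    """
    # Strip each line, then keep only non-empty, non-comment lines
    rows = (ln.strip() for ln in text.splitlines())
    rows = (ln for ln in rows if ln and not ln.startswith('#'))
    # Split each remaining line once on a run of two or more spaces
    pairs = (re.split(r' {2,}', ln, maxsplit=1) for ln in rows)
    return {p[0]: p[1] for p in pairs if len(p) == 2}

s = (
    "#\n"
    "# DATABASE \n"
    "#\n"
    "Database name   FooFileName\n"
    "Database file   FooDBFile\n"
    "Info file       FooInfoFile\n"
    "Database ID     3\n"
    "Total entries   8888"
)
parsed = parse_text(s)
```

With this input, parsed['Database file'] is 'FooDBFile'.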

Upvotes: 1
