Reputation: 143
Here is my file, which I want to convert to a Python dict:
#
# DATABASE
#
Database name   FooFileName
Database file   FooDBFile
Info file       FooInfoFile
Database ID     3
Total entries   8888
I have tried several things, but I can't get it to convert to a dict. Ultimately, I want to be able to pull out the 'Database file' value as a string. Thanks in advance.
Here is what I have tried already and the errors:
# ValueError: need more than 1 value to unpack
#d = {}
#for line in json_dump:
#    #for k,v in [line.strip().split('\n')]:
#    for k,v in [line.strip().split(None, 1)]:
#        d[k] = v.strip()
#print d
#print d['Database file']
# IndexError: list index out of range
#d = {}
#for line in json_dump:
#    line = line.strip()
#    parts = [p.strip() for p in line.split('/n')]
#    d[parts[0]] = (parts[1], parts[2])
#print d
Upvotes: 0
Views: 130
Reputation: 5236
EDITED to reflect a line-wise regular expression approach.
Since it appears your file is not tab-delimited, you could use a regular expression to isolate the columns:
import re
#
# The rest of your code that loads up json_dump
#
d = {}
for line in json_dump:
    if line.startswith('#'):
        continue  # Skip comment lines
    line = line.strip()
    try:
        # Split the line on runs of 2 or more consecutive whitespace characters
        (key, value) = re.split(r'\s\s+', line)
    except ValueError:
        continue  # Skip malformed lines (anything that is not exactly key + value)
    d[key] = value
print d
This yields the following dictionary:
{'Database name': 'FooFileName', 'Total entries': '8888', 'Database ID': '3', 'Database file': 'FooDBFile', 'Info file': 'FooInfoFile'}
This should let you pick out individual values, for example d['Database file'].
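For readers on Python 3, the same idea can be wrapped in a small function. This is only a minimal sketch, not the exact code above: the name parse_db_info and the SAMPLE text are made up for illustration, and it assumes keys and values are separated by at least two whitespace characters, as the regular expression above requires.

```python
import re

# Hypothetical sample mirroring the file in the question; columns are
# assumed to be separated by two or more spaces.
SAMPLE = """\
#
# DATABASE
#
Database name   FooFileName
Database file   FooDBFile
Info file       FooInfoFile
Database ID     3
Total entries   8888
"""


def parse_db_info(lines):
    """Build a dict from 'key  value' lines, skipping comments and blanks."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comment lines
        # maxsplit=1 keeps any later wide gaps inside the value
        parts = re.split(r'\s{2,}', line, maxsplit=1)
        if len(parts) != 2:
            continue  # skip lines without a key/value gap
        key, value = parts
        result[key] = value
    return result


info = parse_db_info(SAMPLE.splitlines())
print(info['Database file'])
```

With the sample text above, this prints FooDBFile. Passing maxsplit=1 is a design choice for this sketch: a value that itself contains a wide gap is kept whole instead of causing the line to be skipped.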
Upvotes: 0
Reputation: 22954
Splitting each line on whitespace returns a list of three values, so you need three variables to hold the result. Join the first two with a space to form the key; the third is the value.
This is probably the simplest approach and it is easy to understand, but it only works if every key is exactly two words and every value contains no whitespace.
d = {}
for line in json_dump:
    if line.startswith('#'):
        continue  # Skip comment lines
    # Assumes exactly three whitespace-separated tokens per line
    u, k, v = line.strip().split()
    d[u + " " + k] = v
print d
print d['Database file']
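If the keys are not always exactly two words, one variation is to split off only the last whitespace-separated token as the value. This is a sketch under the assumption that values never contain whitespace; parse_last_token and sample_lines are hypothetical names, and str.rsplit is swapped in for the plain split used above.

```python
def parse_last_token(lines):
    """Treat the last whitespace-separated token of each line as the value."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comment lines
        # rsplit(None, 1) splits once, from the right, on any whitespace run
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue  # a single token has no key/value pair
        key, value = parts
        result[key] = value
    return result


# Hypothetical input mirroring the lines from the question
sample_lines = [
    "#",
    "# DATABASE",
    "#",
    "Database name FooFileName",
    "Database file FooDBFile",
    "Info file FooInfoFile",
    "Database ID 3",
    "Total entries 8888",
]
d = parse_last_token(sample_lines)
print(d['Database file'])
```

This works whether the columns are separated by one space or many, since the whole whitespace run before the last token is consumed by the split.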
Upvotes: 0
Reputation: 107287
First, you need to extract the text that follows the last # line. You can do that with a regular expression; re.search will do it:
>>> import re
>>> s="""#
... # DATABASE
... #
... Database name   FooFileName
... Database file   FooDBFile
... Info file       FooInfoFile
... Database ID     3
... Total entries   8888"""
>>> re.search(r'#\n([^#]+)',s).group(1)
'Database name   FooFileName\nDatabase file   FooDBFile\nInfo file       FooInfoFile\nDatabase ID     3\nTotal entries   8888'
In this case you can also just use split: split the text on # and take the last element:
>>> s2=s.split('#')[-1]
Then you can use a dictionary comprehension over a list comprehension. Note that re.split is a good choice here, because the pattern r' {2,}' splits on runs of 2 or more spaces:
>>> {k:v for k,v in [re.split(r' {2,}',i) for i in s2.split('\n') if i]}
{'Database name': 'FooFileName', 'Total entries': '8888', 'Database ID': '3', 'Database file': 'FooDBFile', 'Info file': 'FooInfoFile'}
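The two steps can also be folded into a single pass over the whole string. The following is a hedged sketch rather than part of the approach above: it swaps in re.findall with re.MULTILINE, and it assumes the columns are separated by two or more spaces and that comment lines start with #.

```python
import re

# Sample text as in the question, assuming a gap of two or more spaces
s = """#
# DATABASE
#
Database name   FooFileName
Database file   FooDBFile
Info file       FooInfoFile
Database ID     3
Total entries   8888"""

# ^(?!#)  : start of a line that does not begin with '#'
# (.+?)   : the key, as short as possible
#  {2,}   : the gap of two or more spaces
# (\S.*)  : the value, up to the end of the line
pairs = re.findall(r'^(?!#)(.+?) {2,}(\S.*)$', s, flags=re.MULTILINE)
d = dict(pairs)
print(d['Database file'])
```

Because findall returns a list of (key, value) tuples when the pattern has two groups, the result can be passed straight to dict.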
Upvotes: 1