user9279273

Reputation: 105

Converting text file to list of dictionaries

I have written a script to convert a text file into a dictionary.

script.py

l=[]
d={}
count=0

f=open('/home/asha/Desktop/test.txt','r')
for row in f:
    rowcount+=1
    if row[0] == ' ' in row:
        l.append(row)
    else:
        if count == 0:
            temp = row
            count+=1
        else:
            d[temp]=l
            l=[]
            count=0              
print d

textfile.txt

Time
 NtGetTickCount
 NtQueryPerformanceCounter
 NtQuerySystemTime
 NtQueryTimerResolution
 NtSetSystemTime
 NtSetTimerResolution
 RtlTimeFieldsToTime
 RtlTimeToTime
System informations
 NtQuerySystemInformation
 NtSetSystemInformation
 Enumerations
 Structures

The output I got is:

{'Time\n': [' NtGetTickCount\n', ' NtQueryPerformanceCounter\n', ' NtQuerySystemTime\n', ' NtQueryTimerResolution\n', ' NtSetSystemTime\n', ' NtSetTimerResolution\n', ' RtlTimeFieldsToTime\n', ' RtlTimeToTime\n']}

It only converts up to the 9th line of the text file. Please suggest where I am going wrong.

Upvotes: 0

Views: 179

Answers (6)

Aaditya Ura

Reputation: 12689

Just keep track of the lines that start with ' ' and you can do it in a single loop:

keys = []
final = []
with open('new_text.txt', 'r') as f:
    data = []

    for line in f:
        if not line.startswith(' '):
            # Header (or blank) line: record the key, then flush the previous section
            if line.strip():
                keys.append(line.strip())
            if data:
                final.append(data)
            data = []
        else:
            # Indented line: belongs to the current section
            data.append(line.strip())

# Flush the last section, which no later header triggers
final.append(data)
print(dict(zip(keys, final)))

output:

{'Example': ['data1', 'data2'], 'Time': ['NtGetTickCount', 'NtQueryPerformanceCounter', 'NtQuerySystemTime', 'NtQueryTimerResolution', 'NtSetSystemTime', 'NtSetTimerResolution', 'RtlTimeFieldsToTime', 'RtlTimeToTime'], 'System informations': ['NtQuerySystemInformation', 'NtSetSystemInformation', 'Enumerations', 'Structures']}

Upvotes: 0

JahKnows

Reputation: 2706

Just for the sake of adding in my 2 cents.

This problem is easier to tackle backwards. Consider iterating through your file backwards and then storing the values into a dictionary whenever a header is reached.

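d = {}
l = []
with open('test.txt', 'r') as f:
    for row in reversed(f.read().splitlines()):
        if not row:
            continue  # skip blank lines; row[0] would raise IndexError
        if row.startswith(' '):
            l.append(row)
        else:
            # Header reached: rows were collected bottom-up, so flip them back
            d[row] = l[::-1]
            l = []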

Upvotes: 0

Hamatti

Reputation: 1220

So you need to know two things at any given time while looping over the file:

1) Are we on a title level or content level (by indentation) and

2) What is the current title

In the following code, we first check whether the current line is a title (i.e. it does not start with a space). If it is, we set currentTitle to it and insert it into our dictionary as a key with an empty list as its value.

If it is not a title, we just append it to the corresponding title's list.

with open('49359186.txt', 'r') as infile:

    topics = {}
    currentTitle = ''

    for line in infile:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines; line[0] would raise IndexError
        if line[0] != ' ':
            currentTitle = line
            topics[currentTitle] = []
        else:
            topics[currentTitle].append(line)

print(topics)

Upvotes: 0

sciroccorics

Reputation: 2427

Try this:

d = {}
key = None

with open('/home/asha/Desktop/test.txt','r') as file:
    for line in file:
        if line.startswith(' '):
            d[key].append(line.strip())
        else:
            key = line.strip()
            d[key] = []

print(d)

Upvotes: 0

Keyur Potdar

Reputation: 7248

Using dict.setdefault to create a dictionary with lists as values will make your job easier.

d = {}

with open('input.txt') as f:
    key = ''
    for row in f:
        if row.startswith(' '):
            d.setdefault(key, []).append(row.strip())
        else:
            key = row

print(d)

Output:

{'Time\n': ['NtGetTickCount', 'NtQueryPerformanceCounter', 'NtQuerySystemTime', 'NtQueryTimerResolution', 'NtSetSystemTime', 'NtSetTimerResolution', 'RtlTimeFieldsToTime', 'RtlTimeToTime'], 'System informations\n': ['NtQuerySystemInformation', 'NtSetSystemInformation', 'Enumerations', 'Structures']}

A few things to note here:

  1. Always use with open(...) for file operations.
  2. If you want to check the first character, or the first few characters, use str.startswith()

The same can be done using collections.defaultdict:

from collections import defaultdict

d = defaultdict(list)

with open('input.txt') as f:
    key = ''
    for row in f:
        if row.startswith(' '):
            d[key].append(row.strip())
        else:
            key = row
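
As a quick illustration of the second note above: indexing an empty string raises IndexError, while str.startswith() just returns False. Here is a minimal sketch; the sample rows are made up for illustration and are not taken from the file in the question.

```python
# Made-up sample rows; the last one is an empty string
rows = ['Time', ' NtGetTickCount', '']

# startswith() is safe on every row, including the empty one
flags = [row.startswith(' ') for row in rows]
print(flags)

# Indexing the empty string fails instead
try:
    rows[2][0] == ' '
    index_failed = False
except IndexError:
    index_failed = True
print(index_failed)

# startswith() also accepts a tuple of prefixes, e.g. space or tab
tab_indented = '\tNtSetSystemTime'.startswith((' ', '\t'))
print(tab_indented)
```

This matters mostly when lines are stripped or split before the check, since that is when empty strings can show up.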

Upvotes: 1

phihag

Reputation: 288290

You never commit (i.e. run d[row] = []) the final list to the dictionary.

You can simply commit the list as soon as you encounter a new header row:

d = {}
cur = []

with open('/home/asha/Desktop/test.txt') as f:
    for row in f:
        if row[0] == ' ':  # line in section
            cur.append(row)
        else:  # new section header
            d[row] = cur = []

print(d)
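
Alternatively, if you would rather keep the "collect, then commit" structure of the original script, you can commit the pending list once more after the loop ends. The following is only a sketch of that idea: the function name parse_sections and the sample lines are made up for illustration, and it takes any iterable of lines rather than opening the file itself.

```python
def parse_sections(lines):
    """Group indented lines under the preceding unindented header line."""
    d = {}
    key = None   # current header, None until the first one is seen
    items = []   # lines collected for the current header
    for row in lines:
        if not row.strip():
            continue  # ignore blank lines
        if row.startswith(' '):
            items.append(row.strip())
        else:
            if key is not None:
                d[key] = items  # commit the previous section
            key = row.strip()
            items = []
    if key is not None:
        d[key] = items  # commit the final section, which no later header triggers
    return d


# Hypothetical sample input, shaped like the file in the question
sample = ['Time\n', ' NtGetTickCount\n', 'System informations\n',
          ' Enumerations\n', ' Structures\n']
result = parse_sections(sample)
print(result)
```

With a real file you could pass the open file object directly, since iterating over it yields lines.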

Upvotes: 1
