imfromsweden

Reputation: 169

Python - Importing strings into a list, into another list :)

Basically, I want to read strings from a text file, group them into lists of three, and then put all of those three-item lists into another list. Let me explain with an example :)

Text file (just an example, I can structure it however I want):

party    
sleep  
study    
--------   
party   
sleep  
sleep    
-----   
study  
sleep  
party   
---------

etc

From this, I want Python to create a list that looks like this:

List1 = [['party','sleep','study'],['party','sleep','sleep'],['study','sleep','party'], ...]

But it's super hard. I was experimenting with something like:

test2 = open('test2.txt','r')
List=[]

for line in 'test2.txt':
    a = test2.readline()
    a = a.replace("\n","")
    List.append(a)
    print(List)

But this just does horrible horrible things. How to achieve this?

Upvotes: 1

Views: 171

Answers (3)

smeso

Reputation: 4295

You can try with this:

res = []
tmp = []

for i, line in enumerate(open('file.txt'), 1):
    tmp.append(line.strip())
    if i % 3 == 0:
        res.append(tmp)
        tmp = []

print(res)

I've assumed that you don't have the dashes.

Edit:

Here is an example for when you have dashes:

res = []
tmp = []

for i, line in enumerate(open('file.txt'), 1):  # count lines from 1
    if i % 4 == 0:  # every fourth line is a dash separator
        res.append(tmp)
        tmp = []
        continue
    tmp.append(line.strip())

print(res)

Upvotes: 3

Abhijit

Reputation: 63727

If you want to group the data in chunks of 3, and the data in the text file is not grouped by any separator:

You need to read the file sequentially and create a list. To group it, you can use any of the known grouper algorithms:

import pprint
from itertools import izip, imap  # Python 2; on Python 3 use the built-in zip and map
with open("test.txt") as fin:
    data = list(imap(list, izip(*[imap(str.strip, fin)]*3)))

pprint.pprint(data)
[['party', 'sleep', 'study'],
 ['party', 'sleep', 'sleep'],
 ['study', 'sleep', 'party']]

Steps of Execution

  1. Create a Context Manager with the file object.
  2. Strip each line. (Remove newline)
  3. Pass the same iterator to izip three times, so that each tuple it builds takes three consecutive lines.
  4. Convert tuples to list
  5. Convert the resulting iterator to a list.

Because imap and izip are lazy iterators, everything is done in a single pass over the file.
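For anyone on Python 3, here is a minimal sketch of the same grouper idea using the built-in zip and map. The function name group_in_threes and the StringIO sample are only illustrative stand-ins for the real file. Note that zip silently drops a trailing group with fewer than three lines.

```python
from io import StringIO

def group_in_threes(lines):
    """Group an iterable of lines into lists of three stripped strings."""
    stripped = map(str.strip, lines)  # lazy: strips one line at a time
    # The same iterator object appears three times, so zip pulls three
    # consecutive items from it for every tuple it builds.
    return [list(chunk) for chunk in zip(*[stripped] * 3)]

# StringIO stands in for the opened text file (made-up sample data).
sample = StringIO("party\nsleep\nstudy\nparty\nsleep\nsleep\nstudy\nsleep\nparty\n")
data = group_in_threes(sample)
print(data)
```

With a real file you would pass the file object from a with open(...) block instead of the StringIO.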

If your data is instead grouped by a fixed delimiter line such as ---------- (the key below matches it exactly), you can use the itertools.groupby solution:

import pprint
from itertools import imap, groupby  # Python 2; on Python 3 use the built-in map
class Key(object):
    def __init__(self, sep):
        self.sep = sep
        self.count = 0
    def __call__(self, line):
        if line == self.sep:    self.count += 1
        return self.count


with open("test.txt") as fin:
    data = [[e for e in v if "----------" not in e]
        for k, v in groupby(imap(str.strip, fin), key = Key("----------"))]


pprint.pprint(data)
[['party', 'sleep', 'study'],
 ['party', 'sleep', 'sleep'],
 ['study', 'sleep', 'party']]

Steps of Execution

  1. Create a Key class that increments a counter whenever the separator is encountered. Every call returns the current counter, whether or not it was incremented.
  2. Create a Context Manager with the file object.
  3. Strip each line. (Remove newline)
  4. Group the data using itertools.groupby and using your custom key
  5. Remove the separator from the grouped data and create a list of the groups.
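A variation on the groupby steps above, sketched for Python 3: instead of a counting Key class it uses a boolean key, so separator lines of any length form their own runs and are skipped. The helper names is_separator and split_on_separators are made up for this sketch, and it assumes a separator is any non-empty line consisting only of dashes.

```python
from itertools import groupby

def is_separator(line):
    """True for a non-empty line made up only of dashes (assumed separator format)."""
    return bool(line) and set(line) == {"-"}

def split_on_separators(lines):
    """Return the runs of non-separator lines as a list of lists."""
    stripped = (line.strip() for line in lines)
    # groupby yields consecutive runs of lines that share the same key value;
    # keep only the runs whose key is False (i.e. not separators).
    return [list(run) for sep, run in groupby(stripped, key=is_separator) if not sep]

# A list of lines stands in for the opened file (sample taken from the question).
sample = [
    "party\n", "sleep\n", "study\n", "--------\n",
    "party\n", "sleep\n", "sleep\n", "-----\n",
    "study\n", "sleep\n", "party\n", "---------\n",
]
data = split_on_separators(sample)
print(data)
```

Blank lines are not treated as separators here, so they would end up in the groups as empty strings unless filtered out first.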

Upvotes: 4

jonrsharpe

Reputation: 122036

First big problem:

for line in 'test2.txt':

gives you

't', 'e', 's', 't', '2', '.', 't', 'x', 't'

You need to loop through the file you open:

for line in test2:

Or, better:

with open("test2.txt", 'r') as f:
    for line in f:

Next, starting from a list that holds one empty sub-list (myList = [[]]), you need to do one of two things for each line:

  1. If the line contains "-----", create a new sub-list (myList.append([]))
  2. Otherwise, append the line to the last sub-list in your list (myList[-1].append(line))
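The two steps above could be sketched roughly like this. The function name read_groups is just illustrative, and the sketch assumes every separator contains at least five dashes, as in the question's sample. It also strips the newline from each line and drops the empty sub-list that a trailing separator leaves behind.

```python
def read_groups(lines):
    """Build a list of sub-lists, starting a new sub-list at each dash line."""
    groups = [[]]  # start with one empty sub-list to append into
    for line in lines:
        line = line.strip()  # remove the newline and stray spaces
        if not line:
            continue  # skip blank lines
        if "-----" in line:
            groups.append([])  # separator: start a new sub-list
        else:
            groups[-1].append(line)  # add to the most recent sub-list
    # A trailing separator leaves an empty sub-list at the end; drop empties.
    return [group for group in groups if group]

# A list of lines stands in for the opened file (sample taken from the question).
sample = [
    "party\n", "sleep\n", "study\n", "--------\n",
    "party\n", "sleep\n", "sleep\n", "-----\n",
    "study\n", "sleep\n", "party\n", "---------\n",
]
my_list = read_groups(sample)
print(my_list)
```

With the real file, read_groups(f) inside the with block would do the same thing.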

Finally, your print at the end should not be so far indented; currently, it prints for every line, rather than just when the processing is complete.

    List.append(a)
print(List)

Perhaps a better structure for your file would be:

party,sleep,study
party,sleep,sleep
...

Now each line is a sub-list:

for line in f:
    myList.append(line.strip().split(','))  # strip the newline before splitting
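If you do go with the comma-separated layout, the standard library's csv module is another option. This is only a sketch: the StringIO object stands in for the opened file and its contents are made-up sample data.

```python
import csv
from io import StringIO

# StringIO stands in for the opened file; the contents are made-up sample data.
f = StringIO("party,sleep,study\nparty,sleep,sleep\n")

# csv.reader yields one list of fields per line, without the trailing newline.
my_list = list(csv.reader(f))
print(my_list)
```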

Upvotes: 0
