How to extract certain data on different lines from a file in Python

Question

The code for events was this, and it worked. But I tried to modify it to extract the others and it's not working; obviously it's not correct.

with open('GroupEvent/G0.txt') as f:
    lines = f.readlines()
for i in range(0, len(lines)):
    if lines[i] == '\n':
        nlines = 0
    else:
        line = lines[i]
        entry=line.split()
        for x in entry:
            first_char=x
            EventToMatch = ('E')
            if first_char.startswith(EventToMatch) and nlines == 1 :
              Events.append(first_char)
              nlines = 2
              break
            elif nlines==2:
              Org.append(first_char)
              nlines= 3
              
            elif nlines == 3:
              Yes.append(first_char)
              nlines =4
              

            elif nlines == 4:
              No.append(first_char)
              nlines == 0
              
            else:
             break

I have a file of records, each four lines long. The first line starts with the id of an event (the one beginning with E), the second line holds the id of the person organizing it, the third line has the id of the person who accepted the invite, and the fourth has the id of the person who rejected it. The file has dozens of records like this, separated by one empty line. How can I collect the organizer ids and the ids of the people who said yes and no? Capturing the event ids was easy because they start with E, so I already have an array of event ids. I am having trouble extracting the others.

m.i.cosacak · Accepted Answer

I generally use a class if the file has a fixed structure, as in a FASTQ file, for instance. I put the following lines in input_file.txt; the class below reads the file and returns 5 lines at a time (the four lines of a record plus the blank separator). You can do whatever you want with each group.

input_file.txt

E932 4 1240153200000 #id of an event
M48462 #id of organizer
M48462 #id of accepted invite
M65542 #id of rejected invite

E932 4 1240153200000
M48462
M48462
M65542

E932 4 1240153200000
M48462
M48462
M65542

E932 4 1240153200000
M48462
M48462
M65542

The class code to handle it:

class HandleFile:
    def __init__(self, filename):
        self.input = open(filename,"r") # assuming it is a textfile
        self.currentLine = 0
    def __iter__(self):
        return self
    def __next__(self):
        mylist = []
        for i in range(5): # as it is 5 lines for each
            line = self.input.readline()
            line = str(line)
            self.currentLine += 1
            if line:
                mylist.append(line.strip("\n"))
            else:
                mylist.append(None) # add None if it is end of file
        if mylist.count(None) == 5: # all 5 reads returned nothing, so it is the end of the file
            raise StopIteration
        assert mylist[4] == "" # check if the 5th line is empty line
        assert mylist[0].startswith("E") # or put more condition
        return mylist

hf = HandleFile("input_file.txt")
for lst in hf:
    print(lst)

Here is the output:

...
['E932 4 1240153200000 #id of an event', 'M48462 #id of organizer', 'M48462 #id of accepted invite', 'M65542 #id of rejected invite', '']
['E932 4 1240153200000', 'M48462', 'M48462', 'M65542', '']
['E932 4 1240153200000', 'M48462', 'M48462', 'M65542', '']
['E932 4 1240153200000', 'M48462', 'M48462', 'M65542', '']
>>>

NOTE: this code has been modified from here
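To get from those groups to the separate lists the question asks for, something like the following might work. It is only a sketch and does not use the class above: the function name `collect_ids` is made up for illustration, and it assumes every record has exactly four non-empty lines, that anything after a `#` is a comment, and that a line may hold more than one id separated by spaces.

```python
def collect_ids(lines):
    """Group lines into blank-line-separated records and split out the ids.

    Hypothetical helper: assumes exactly four non-empty lines per record
    (event, organizer, accepted, rejected) and '#' starting a comment.
    """
    events, organizers, accepted, rejected = [], [], [], []
    record = []
    # The extra "" makes sure the last record is flushed even when the
    # input does not end with a blank line.
    for raw in list(lines) + [""]:
        line = raw.split("#", 1)[0].strip()  # drop comment and whitespace
        if line:
            record.append(line)
            continue
        if not record:
            continue  # ignore repeated blank lines
        if len(record) != 4:
            raise ValueError(f"expected 4 lines per record, got {record!r}")
        events.append(record[0].split()[0])   # first token, e.g. "E932"
        organizers.append(record[1].split())  # list of ids on line 2
        accepted.append(record[2].split())    # list of ids on line 3
        rejected.append(record[3].split())    # list of ids on line 4
        record = []
    return events, organizers, accepted, rejected


sample = """E932 4 1240153200000 #id of an event
M48462 #id of organizer
M48462 #id of accepted invite
M65542 #id of rejected invite

E932 4 1240153200000
M48462
M48462
M65542
"""

events, organizers, accepted, rejected = collect_ids(sample.splitlines())
print(events)     # ['E932', 'E932']
print(rejected)   # [['M65542'], ['M65542']]
```

For a real file, the same function can be given the open file object, for example `with open("input_file.txt") as f: events, organizers, accepted, rejected = collect_ids(f)`. Each entry in the last three lists is itself a list, so a line with several ids is kept together with its event.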
