sdgd
sdgd

Reputation: 733

Get data from a file based on certain text - Python

I have a data that comes in the below format in a file.

"Attach Listener" #7338 daemon prio=9 os_prio=0 tid=0x00007f51c0009000 nid=0x731c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"lettuce-nioEventLoop-9-155" #362 daemon prio=5 os_prio=0 tid=0x00007f515000c800 nid=0x4f7c runnable [0x00007f50da85d000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x0000000082af6f50> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x0000000082af8050> (a java.util.Collections$UnmodifiableSet)
    - locked <0x0000000082af7f78> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:753)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:409)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - None

"lettuce-nioEventLoop-9-154" #360 daemon prio=5 os_prio=0 tid=0x00007f51d00c3800 nid=0x4dd5 runnable [0x00007f50da45b000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x0000000082afa8b0> (a io.netty.channel.nio.SelectedSelectionKeySet)
    - locked <0x0000000082afb9b0> (a java.util.Collections$UnmodifiableSet)
    - locked <0x0000000082afb8d8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62)
    at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:753)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:409)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - None

"Attach Listener" #7338 daemon prio=9 os_prio=0 tid=0x00007f51c0009000 nid=0x731c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: WAITING

   Locked ownable synchronizers:
    - None

I need to store the complete stack trace based on the first line of each stack. i.e. "lettuce-nioEventLoop-9-155" #362 daemon prio=5 os_prio=0 tid=0x00007f515000c800 nid=0x4f7c runnable [0x00007f50da85d000]

I need to collect the first lines of each trace and when I compare with the data i have to the data in the file and if it matches, i need to collect the complete trace of that. There may be cases where the first line might be same for some other stack trace in the same file, and if that is the case, i need to append it to the same already collected data previously.

This is what I've done -

data_methods = []
tdfilename = r"C:\Users\hello\Desktop\trace_test.txt"
with open(tdfilename) as f:
    for line in f:
        method = re.findall(r'"(.*?)]', line)
        fmethod = ''.join(method)
        if fmethod:
            data_methods.append("\""+fmethod+"]") # Adding " and ] at the start and end of the line as per the file content
f.close()

I'm collecting the first lines of all the stack traces into a list. My idea here is to compare this list data with the data in the file and if it matches, i need to collect the complete trace. I'm stuck on getting the logic for this.

Should i be using dict to save the first line as keys and the content as values as the firsts lines can occur multiple times with the same data?

How can i achieve this. I'm doing this to ease some of our work in our daily activities.

Upvotes: 0

Views: 73

Answers (2)

Serge Ballesta
Serge Ballesta

Reputation: 148975

A defaultdict can be handy when you want to create a new thing in a mapping and add to it if it already exists. Here I would just do:

data_methods = collections.defaultdict(list)
tdfilename = r"C:\Users\hello\Desktop\trace_test.txt"
firstpattern = re.compile(r'".*]\s*$')
with open(tdfilename) as f:
    for line in f:
        if firstpattern.match(line)
        cur = data_methods[line.strip()]
    else:
        cur.append(line)

You have then just to join the values, for example to dump the result:

for k, v in data_methods.items():
    print(k)
    print(''.join(v))

Upvotes: 1

laenkeio
laenkeio

Reputation: 512

It seems you can rely on the indentation in the trace format. So this is a basic version:

td_filename = 'trace.txt'

exc_dict = {}

with open(td_filename) as f:
    cur_line = None

    for line in f:
        if line.startswith(' ') or line.startswith('\n'):
            if cur_line is not None:
                exc_dict[cur_line].append(line)
        else:
            if line not in exc_dict:
                exc_dict[line] = []
            cur_line = line

for k in exc_dict:
    print(k)
    print(exc_dict[k])
    print('\n')

If you want to separate the individual exceptions and join the strings, try this:

td_filename = 'trace.txt'

exc_dict = {}

with open(td_filename) as f:
    cur_line = None

    for line in f:
        if line.startswith(' ') or line.startswith('\n'):
            if cur_line is not None:
                if exc_dict[cur_line][-1] is None:
                    exc_dict[cur_line][-1] = ''
                exc_dict[cur_line][-1] += line
        else:
            if line not in exc_dict:
                exc_dict[line] = []

            exc_dict[line].append(None)
            cur_line = line

for k in exc_dict:
    print(k)
    for e in exc_dict[k]:
        print(e)
        print('\n')

Upvotes: 0

Related Questions