Reputation: 1825
I have a dataset like below:
Process: matts.exe Pid: 900 Address: 0x7f6a0000
Vad Tag: Vad Protection: PAGE_EXECUTE_READWRITE
Flags: Protection: 6
0x7f6a0000 c8 00 00 00 58 01 00 00 ff ee ff ee 08 70 00 00 ....X........p..
0x7f6a0010 08 00 00 00 00 fe 00 00 00 00 10 00 00 20 00 00 ................
0x7f6a0020 00 02 00 00 00 20 00 00 8d 01 00 00 ff ef fd 7f ................
0x7f6a0030 03 00 08 06 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x7f6a0000 c8000000 ENTER 0x0, 0x0
0x7f6a0004 58 POP EAX
0x7f6a0005 0100 ADD [EAX], EAX
0x7f6a0007 00ff ADD BH, BH
Process: matts2.exe Pid: 910 Address: 0x7f6a0000
Vad Tag: Vad Protection: PAGE_EXECUTE_READWRITE
Flags: Protection: 6
0x7f6a0000 c8 00 00 00 58 01 00 00 ff ee ff ee 08 70 00 00 ....X........p..
0x7f6a0010 08 00 00 00 00 fe 00 00 00 00 10 00 00 20 00 00 ................
0x7f6a0020 00 02 00 00 00 20 00 00 8d 01 00 00 ff ef fd 7f ................
0x7f6a0030 03 00 08 06 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x7f6a0000 c8000000 ENTER 0x0, 0x0
0x7f6a0004 58 POP EAX
0x7f6a0005 0100 ADD [EAX], EAX
0x7f6a0007 00ff ADD BH, BH
How can I place this data into a pandas dataframe like below?
Process Pid Address Vad_Tag Protection Protection Hex_out Assembly_Out
matts.exe 900 0x7f6a0000 Vad PAGE_EXECUTE_READWRITE 6 0x7f6a0000 c8 00 00 00 58 01 00 00 ff ee ff ee 08 70 00 00 ....X........p.. 0x7f6a0000 c8000000 ENTER 0x0, 0x0
0x7f6a0010 08 00 00 00 00 fe 00 00 00 00 10 00 00 20 00 00 ................ 0x7f6a0004 58 POP EAX
0x7f6a0020 00 02 00 00 00 20 00 00 8d 01 00 00 ff ef fd 7f ................ 0x7f6a0005 0100 ADD [EAX], EAX
0x7f6a0030 03 00 08 06 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0x7f6a0007 00ff ADD BH, BH
matts2.exe 910 0x7f6a0000 Vad PAGE_EXECUTE_READWRITE 6 0x7f6a0000 c8 00 00 00 58 01 00 00 ff ee ff ee 08 70 00 00 ....X........p.. 0x7f6a0000 c8000000 ENTER 0x0, 0x0
0x7f6a0010 08 00 00 00 00 fe 00 00 00 00 10 00 00 20 00 00 ................ 0x7f6a0004 58 POP EAX
0x7f6a0020 00 02 00 00 00 20 00 00 8d 01 00 00 ff ef fd 7f ................ 0x7f6a0005 0100 ADD [EAX], EAX
0x7f6a0030 03 00 08 06 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0x7f6a0007 00ff ADD BH, BH
Currently I can read it in as a table, but it places everything on a separate line. I am using every third blank line as my delimiter, but I am still having problems shaping the data. The hex and the assembly need to be in string format; I placed them in the table for brevity's sake. Any help would be appreciated.
Upvotes: 0
Views: 51
Reputation: 249434
You should do this in two passes. The first is to call read_table(usecols=[0]) to parse the first "word" in each line. Then use that series to figure out where the sections start and end, and call read_table(skiprows=X, nrows=Y) once for each section (where a section is defined as a chunk with uniform formatting).
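Here is a minimal sketch of that two-pass idea, under some assumptions of mine. It uses plain string splitting instead of read_table, because the rows have different numbers of fields. It assumes every record starts with a "Process:" line followed by exactly two more header lines, and that every hex dump row carries exactly 16 bytes. The function name parse_sections and the column names Vad_Protection and Flags_Protection are made up for illustration, and SAMPLE is a shortened copy of the data in the question.

```python
import re

import pandas as pd

# Shortened copy of the question's data (two hex rows, two asm rows per record).
SAMPLE = """\
Process: matts.exe Pid: 900 Address: 0x7f6a0000
Vad Tag: Vad Protection: PAGE_EXECUTE_READWRITE
Flags: Protection: 6
0x7f6a0000 c8 00 00 00 58 01 00 00 ff ee ff ee 08 70 00 00 ....X........p..
0x7f6a0010 08 00 00 00 00 fe 00 00 00 00 10 00 00 20 00 00 ................
0x7f6a0000 c8000000 ENTER 0x0, 0x0
0x7f6a0004 58 POP EAX
Process: matts2.exe Pid: 910 Address: 0x7f6a0000
Vad Tag: Vad Protection: PAGE_EXECUTE_READWRITE
Flags: Protection: 6
0x7f6a0000 c8 00 00 00 58 01 00 00 ff ee ff ee 08 70 00 00 ....X........p..
0x7f6a0010 08 00 00 00 00 fe 00 00 00 00 10 00 00 20 00 00 ................
0x7f6a0000 c8000000 ENTER 0x0, 0x0
0x7f6a0004 58 POP EAX
"""

# Assumption: the three header lines, joined with spaces, follow this layout.
HEADER_RE = re.compile(
    r"Process:\s*(?P<Process>\S+)\s+Pid:\s*(?P<Pid>\d+)\s+"
    r"Address:\s*(?P<Address>\S+)\s+"
    r"Vad Tag:\s*(?P<Vad_Tag>\S+)\s+Protection:\s*(?P<Vad_Protection>\S+)\s+"
    r"Flags:.*?Protection:\s*(?P<Flags_Protection>\d+)"
)

# Assumption: a hex dump row is an address followed by 16 two-digit bytes.
HEX_ROW_RE = re.compile(r"^0x[0-9a-fA-F]+(?:\s+[0-9a-fA-F]{2}){16}\s")


def parse_sections(text):
    """Return one DataFrame row per 'Process:' record found in text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]

    # Pass 1: the first word of each line tells us where the sections start.
    first_words = pd.Series([ln.split()[0] for ln in lines])
    starts = first_words[first_words == "Process:"].index.tolist()
    ends = starts[1:] + [len(lines)]

    # Pass 2: parse each section on its own.
    records = []
    for begin, end in zip(starts, ends):
        block = lines[begin:end]
        match = HEADER_RE.search(" ".join(block[:3]))
        if match is None:
            raise ValueError(f"unexpected header near line {begin + 1}")
        record = match.groupdict()
        record["Pid"] = int(record["Pid"])
        body = block[3:]
        # Keep the hex dump and the disassembly as multi-line strings.
        record["Hex_out"] = "\n".join(ln for ln in body if HEX_ROW_RE.match(ln))
        record["Assembly_Out"] = "\n".join(
            ln for ln in body if not HEX_ROW_RE.match(ln)
        )
        records.append(record)
    return pd.DataFrame(records)


df = parse_sections(SAMPLE)
print(df[["Process", "Pid", "Address", "Vad_Tag"]])
```

Each record becomes one row, with the hex dump and the disassembly stored as newline-joined strings, which seems to be what the question asks for. To run it on a real file, read the file contents into a string and pass that to parse_sections in place of SAMPLE.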
Upvotes: 1