Reputation: 217
I am reading a text file that contains several kinds of data: a serial number, a type, and a log of CSV data:
A123>
A123>read sn
sn = 12143
A123>read cms-sn
cms-sn = 12143-00000000-0000
A123>read fw-rev
fw-rev = 1.3, 1.3
A123>read log
log =
1855228,1,0,41,-57,26183,25,22,21,22,0,0,0,89,2048,500,0,0
1855240,1,0,33,0,26319,25,22,22,23,0,0,0,89,2048,500,0,0
2612010,1,0,41,-82,26122,20,21,21,21,0,0,0,87,2048,500,0,0
2612142,1,0,49,301,27607,21,22,21,21,0,0,0,81,2048,500,0,0
Here is the code I have so far:
import pandas as pd
lines = [] # Declare an empty list named "lines"
with open('03-22-2019.txt', 'rt') as in_file: # Open file
    for line in in_file: # For each line of text in in_file, where the data is named "line",
        lines.append(line.rstrip('\n')) # add that line to our list of lines, stripping newlines.
while '' in lines: # remove empty lines
    lines.remove('')
lines = [x for x in lines if 'A123' not in x] #delete all lines with 'A123'
for element in lines: # For each element in our list,
    print(element) # print it.
split_line = lines[0].split() # create list with serial number line
Serial_Num = split_line[-1]
print(Serial_Num)
split_line = lines[1].split() # go to line with CMS SN
CMS_SN = split_line[-1]
print(CMS_SN)
split_line = lines[2].split()
Firm_Rev_1 = split_line[-1]
Firm_Rev_2 = split_line[-2].rstrip(',') # drop the trailing comma that split() leaves on "1.3,"
print(Firm_Rev_1)
print(Firm_Rev_2)
# Problem section starts here!
start_data = lines.index("log =") + 1 #<<<<<<<<<<
data = [x for x in lines[start_data:].split(",")] #<<<<<<<<<<
#dfObj = pd.DataFrame(lines[start_data:-1].split(",")) #<<<<<<<<<<
The problem comes up when I try to import the log section of the data into a DataFrame and split the CSV values into their own columns.
How do I programmatically find the start of the log data, and read the data from there to the end into a pandas DataFrame?
Upvotes: 0
Views: 109
Reputation: 2017
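It looks like you're pretty close. The slice lines[start_data:] is a list, and lists have no .split() method, so you need to split each line inside the comprehension instead.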
# this will get you a list of lists for each line.
data = [line.split(',') for line in lines[start_data:]]
# This should construct your data frame; to label the columns, also pass
# columns=[...] with one name per CSV field (18 in the sample log).
dfObj = pd.DataFrame(data=data)
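One thing to watch with the list-of-lists approach is that every value ends up as a string. A possible alternative, sketched below, is to join the log rows back into text and let pandas parse them with read_csv, which infers numeric types. The helper name read_log_frame and the raw_text sample are made up for illustration (the sample stands in for the contents of your file), and I'm assuming every row after "log =" is CSV data apart from any "A123" prompt lines.

```python
import io

import pandas as pd

# Sample text mirroring the layout in the question; stands in for the file contents.
raw_text = """A123>
A123>read sn
sn = 12143
A123>read log
log =
1855228,1,0,41,-57,26183,25,22,21,22,0,0,0,89,2048,500,0,0
1855240,1,0,33,0,26319,25,22,22,23,0,0,0,89,2048,500,0,0
"""


def read_log_frame(text):
    """Return the rows after the 'log =' marker as a DataFrame (hypothetical helper)."""
    # Drop blank lines and surrounding whitespace.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # The log data starts on the line after the 'log =' marker.
    start = lines.index("log =") + 1
    # Skip any prompt lines that might follow the log (assumption: they contain 'A123').
    rows = [ln for ln in lines[start:] if "A123" not in ln]
    # header=None because the log has no header row; columns are numbered 0..N-1.
    return pd.read_csv(io.StringIO("\n".join(rows)), header=None)


df = read_log_frame(raw_text)
print(df.dtypes)
```

If you have names for the fields, you can pass names=[...] to read_csv instead of relying on the default integer column labels.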
Upvotes: 1