Reputation: 13
I have a file that contains 9 columns and 14399 rows. I want, using Python3.3, to subtract first number in 4th column from the 99th on the same column, then subtract the 100th from the 199th and so one...all in the same column and save the corresponding numbers of the first two columns and the answer of subtracting in a new CSV file. Below are table example and my code attempt (I searched through the questions but didn't find any).
Input file named test.CAR
1/12/2009 00:00:05 01 34.51 1234.43 54.3 4321.5 55.4 3241.4
1/12/2009 00:00:10 02 34.51 1234.40 54.3 4321.52 55.4 3241.4
....
....
1/12/2009 00:10:05 99 36.51 4244.40 64.3 4421.52 85.4 4241.4
..
..
1/12/2009 00:20:10 100 44.51 1234.40 54.3 4321.52 55.4 3241.4
..
..
1/12/2009 00:30:10 199 54.51 1234.40 54.3 4321.52 55.4 3241.4
..
..
Output file named test.csv
1/12/2009 00:00:05 2.00 from (36.51-34.51)
1/12/2009 00:20:10 10.00 from (54.51-44.51)
..
..
Here is my code so far:
import csv
import math
test_filename='C:/Python33/test.CAR'
test_filename=open(test_filename,'r')
num_lines=sum(1 for line in open('test.CAR'))
with open('test.csv','w',newline='')as fp:
w=csv.writer(fp,delimiter=',')
atad=[['DATE','TIME','NUMBER']]
w.writerows(atad)
a=0 #to set the first row
d=98 ## to set the 99th row
for i in range (1,(num_lines+1)):
b=test_filename.readline()
date=(b[0:10]) ## to capture the date in 1st column
time=(b[12:19]) ## to capture the time in 2nd column
y=b[24:30] ## to capture the number I want in 4th column
number=y(d)-y(a) ## to subtract the specific number on 1st from 99th column
data=[[date,time,number]]
w.writerows(data)
a=a+98 ## counter to change 1st number to the 100th and so on
d=d+98 ## counter to change 99th number to the 199th and so on
test_filename.close()
The code is not working and I appreciate your help. Thanks!
Upvotes: 1
Views: 1637
Reputation: 1285
Without commenting/changing any of the style choices, the reason your program doesn't work is that you access values in the list y
that are not yet filled in.
Additionally, you read the numbers from the file, but at that point they are still saved as a string.
import csv
import math
test_filename='C:/Python33/test.CAR'
test_filename=open(test_filename,'r')
num_lines=sum(1 for line in open('test.CAR'))
with open('test.csv','w',newline='')as fp:
w=csv.writer(fp,delimiter=',')
atad=[['DATE','TIME','NUMBER']]
w.writerows(atad)
a=0 #to set the first row
d=98 ## to set the 99th row
for i in range (1,(num_lines+1)):
b=test_filename.readline()
date=(b[0:10]) ## to capture the date in 1st column
time=(b[12:19]) ## to capture the time in 2nd column
y=float(b[24:30]) ## to capture the number I want in 4th column
while d < len(y)-1:
number=y(d)-y(a) ## to subtract the specific number on 1st from 99th column
data=[[date,time,number]]
w.writerows(data)
a=a+98 ## counter to change 1st number to the 100th and so on
d=d+98 ## counter to change 99th number to the 199th and so on
test_filename.close()
Assuming the rest of the code works as intended (I am not at all sure it does) the introduction of y=float(b[24:30])
and the while
loop should solve some of your problems.
Upvotes: 0
Reputation: 1121436
Treat your file as an iterable and it becomes easy to skip rows; we can use itertools.islice()
to skip rows we don't need:
from itertools import islice
import csv
test_filename = 'C:/Python33/test.CAR'
with open(test_filename, 'r') as infh, open('test.csv', 'w' ,newline='') as outfh:
writer = csv.writer(outfh)
writer.writerow(['DATE', 'TIME', 'NUMBER'])
for line in infh:
date1, time1, _, num1, _ = line.split(None, 4)
num1 = float(num1)
# skip 98 lines to read line number 99 from where we are now
nextline = next(islice(infh, 98, 99), None)
if nextline is None:
break # file is done early
date2, time2, _, num2, _ = nextline.split(None, 4)
num2 = float(num2)
writer.writerow([date1, time1, num2 - num1])
This also uses float()
to turn your 4th column into a floating point value. It also uses writer.writerow()
(singular) instead of writer.writerows()
(plural) as we are only writing one row at a time here.
Upvotes: 2