Reputation: 195
I'm trying to run my script on several .csv files and output the result for each file. A snippet of my code is as follows:
import sys
import os
import logging
import subprocess
import argparse
import pandas as pd
import glob
files = glob.glob('/scratch/*/*.csv')
for file in files:
    df = pd.read_csv(file,delimiter = ',',skiprows=range(1,11))
    #do some calculation on each file
#calculate the final value
metric = (max(max(dif_r1a),max(dif_r1c),max(dif_r1g),max(dif_r1t),max(dif_r2a),max(dif_r2c),max(dif_r2g),max(dif_r2t)))
#output the final value for each csv file
print(os.path.basename(file) + ' ' + str(metric))
The output I get is for only a single csv file:
file1.csv 0.25
How do I iterate over all the csv files and output the value for each one?
Thank you
Upvotes: 0
Views: 53
Reputation: 515
From your code above, it looks like you create a dataframe for each .csv file inside the loop, but you calculate the final value and print it only after the for loop has finished, so only one file's result (the last one processed) is printed. To get a result for each dataframe, the calculation and the print need to be inside the for loop:
import sys
import os
import logging
import subprocess
import argparse
import pandas as pd
import glob
files = glob.glob('/scratch/*/*.csv')
for file in files:
    df = pd.read_csv(file,delimiter = ',',skiprows=range(1,11))
    #do some calculation on each file
#calculate the final value
metric = (max(max(dif_r1a),max(dif_r1c),max(dif_r1g),max(dif_r1t),max(dif_r2a),max(dif_r2c),max(dif_r2g),max(dif_r2t)))
#output the final value for each csv file
print(os.path.basename(file) + ' ' + str(metric))
This is what you have at the moment, but you would want to change it to:
import sys
import os
import logging
import subprocess
import argparse
import pandas as pd
import glob
files = glob.glob('/scratch/*/*.csv')
for file in files:
    df = pd.read_csv(file,delimiter = ',',skiprows=range(1,11))
    #do some calculation on each file
    #calculate the final value
    metric = max(max(dif_r1a),max(dif_r1c),max(dif_r1g),max(dif_r1t),
                 max(dif_r2a),max(dif_r2c),max(dif_r2g),max(dif_r2t))
    #output the final value for each csv file
    print(os.path.basename(file) + ' ' + str(metric))
However, the missing indentation could also just be a result of how the code was formatted when it was posted.
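To make the pattern concrete, here is a minimal self-contained sketch of doing the work once per file. The `dif_*` arrays are not defined in the posted snippet, so `metric_for_file` is a hypothetical stand-in (it just takes the largest numeric value in the file), and `metrics_per_file` is likewise a name made up for illustration. The point is only that both the calculation and the output happen inside the loop.

```python
import glob
import os
import tempfile

import pandas as pd


def metric_for_file(path):
    """Hypothetical stand-in for the real calculation:
    the largest numeric value anywhere in the file."""
    df = pd.read_csv(path)
    return df.select_dtypes("number").max().max()


def metrics_per_file(pattern):
    """Return {basename: metric}, computing the metric inside the loop."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        # The calculation and the bookkeeping both run once per file.
        results[os.path.basename(path)] = metric_for_file(path)
    return results


if __name__ == "__main__":
    # Build two small throwaway CSV files so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        pd.DataFrame({"a": [0.1, 0.25], "b": [0.2, 0.05]}).to_csv(
            os.path.join(tmp, "file1.csv"), index=False)
        pd.DataFrame({"a": [0.4, 0.3], "b": [0.75, 0.6]}).to_csv(
            os.path.join(tmp, "file2.csv"), index=False)

        # One line of output per csv file.
        for name, metric in metrics_per_file(os.path.join(tmp, "*.csv")).items():
            print(name + " " + str(metric))
```

Collecting the results in a dict instead of printing straight away is optional, but it makes it easy to check that every file was processed. `sorted()` is used because `glob.glob` does not guarantee any particular order.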
Upvotes: 1