Reputation: 11
I'm writing a Python program that reads a CSV file and, depending on user input, prints the matching information from it. I'm slowly but surely debugging it, but I'm stuck (most likely out of frustration) and would love some help catching what's wrong. I've been trying to work out where I should convert to float; if you can see it, I'd appreciate the help.
My code:
import csv
import sys
from csv import reader

def read_csv(file):
    with open(file, 'r') as read_obj:
        csv_reader = list(csv.reader(read_obj, delimiter= ','))
        return csv_reader

def filter_data(user_input, data):
    filtered_data = []
    for row in data:
        if (row[0].lower() == user_input.lower()):
            filtered_data.append(row)
        elif (row[4].lower() == user_input.lower()):
            filtered_data.append(row)
    return filtered_data

def calc_averages(filtered_data):
    lst = [0, 0, 0]
    for row in filtered_data:
        lst[0] += float(row[7])
        lst[1] += float(row[8])
        lst[2] += float(row[9])
    lst[0] = (lst[0]/len(filtered_data))
    lst[1] = (lst[1]/len(filtered_data))
    lst[2] = (lst[2]/len(filtered_data))
    return lst

def calc_minimums(filtered_data):
    min_concentration = filtered_data[0][8]
    min_longevity = filtered_data[0][9]
    min_paralyzed = filtered_data[0][11]
    for line in filtered_data:
        if line[8] < min_concentration:
            min_concentration = line[8]
        if line[9] < min_longevity:
            min_longevity = line[9]
        if line[11] < min_paralyzed:
            min_paralyzed = line[11]
    return [min_concentration, min_longevity, min_paralyzed]

def calc_maximums(filtered_data):
    max_concentration = filtered_data[0][8]
    max_longevity = filtered_data[0][9]
    max_paralyzed = filtered_data[0][11]
    for line in filtered_data:
        if line[8] > max_concentration:
            max_concentration = line[8]
        if line[9] > max_longevity:
            max_longevity = line[9]
        if line[11] > max_paralyzed:
            max_paralyzed = line[11]
    return [max_concentration, max_longevity, max_paralyzed]

def print_stats(user_input, stat_type, stats):
    print (f'{stat_type}s for {user_input} bees:')
    print (f'{stat_type}s for {user_input} bees:')
    print (f'{stat_type} Imidacloprid Concentration: {stats[0]:.2f}')
    print (f'{stat_type} Days Paralyzed: {stats[2]:.2f}\n')

def run(data):
    user_input = input('Enter the species/genus or the sociality of bee you would like information about: ')
    filtered_data = filter_data(user_input, data)
    averages = calc_averages(filtered_data)
    minimums = calc_minimums(filtered_data)
    maximums = calc_maximums(filtered_data)
    print_stats(user_input, 'Average', averages)
    print_stats(user_input, 'Minimum', minimums)
    print_stats(user_input, 'Maximum', maximums)
    more_data = input('Would you like to see more data? (Y/N) ')
    if more_data == 'Y' or more_data == 'y':
        return True
    else:
        return False

def main():
    if len(sys.argv) != 2:
        print('Invalid arguments given')
        return
    data = read_csv(sys.argv[1])
    while run(data):
        continue

if __name__ == '__main__':
    main()
The error with input "solitary \n n":
Traceback (most recent call last):
  File "main.py", line 121, in <module>
    main()
  File "main.py", line 117, in main
    while run(data):
  File "main.py", line 104, in run
    print_stats(user_input, 'Minimum', minimums)
  File "main.py", line 88, in print_stats
    print (f'{stat_type} Imidacloprid Concentration: {stats[0]:.2f}')
ValueError: Unknown format code 'f' for object of type 'str'
The output is currently:
Enter the species/genus or the sociality of bee you would like information about: Averages for solitary bees:
Averages for solitary bees:
Average Imidacloprid Concentration: 29.48
Average Days Paralyzed: 1.33
Minimums for solitary bees:
Minimums for solitary bees:
Upvotes: 1
Views: 87
Reputation: 155458
Both calc_minimums and calc_maximums are copying string data directly from the provided filtered_data argument without performing any type conversions. It's string data because the filter_data function is not performing any type conversions itself, and since you're not reading from the file with the csv.QUOTE_NONNUMERIC flag operating on the reader, no type conversion is performed (all the fields in each row are str).
It's possible you think you've performed the conversions because you called float on the fields in calc_averages, but that only read from the fields, parsed them, and returned a new float, leaving the original list unchanged (the values are still str).
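A minimal sketch of that point, using a made-up two-row sample (the names sample, rows, value, and error_message are assumptions for illustration, not part of your program): calling float on a field returns a new number but leaves the row as it was, and formatting the untouched field with .2f fails the same way your traceback does.

```python
import csv
import io

# Hypothetical two-row sample standing in for the real CSV file.
sample = io.StringIO("apis,9.5\nbombus,10.25\n")
rows = list(csv.reader(sample))

# float() parses the field and returns a NEW object; the row is not modified.
value = float(rows[0][1])
print(type(value).__name__)       # float
print(type(rows[0][1]).__name__)  # str

# Formatting the still-str field with .2f raises ValueError.
error_message = None
try:
    print(f"{rows[0][1]:.2f}")
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```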
If you want this to work, do one of:
1. Change calc_minimums and calc_maximums to perform the same type conversions that calc_averages is doing (kind of important, because right now your mins and maxes are based on str comparisons, not float comparisons, and string lexicographic sorting is almost certainly wrong much of the time, unless all your values have the exact same number of digits)
2. Convert the relevant fields to float ahead of time, before the data is used (saves repeated conversion work)
3. Add quoting=csv.QUOTE_NONNUMERIC to your csv.reader creation (assumes all non-numeric fields are in fact quoted, which is unlikely, unless the file was created using the same flag, or comes from an unusual CSV generator)
Assuming #3 isn't an option, the easiest solution (as well as fastest, and least error-prone) is #2; your existing functions don't have to change (the float conversions could even be removed from calc_averages). Just make a new function, e.g.:
# Defaulted arguments match the type and indices your code uses
def convert_data(data, totype=float, convert_indices=(7, 8, 9, 11)):
    '''Modifies data in-place, converting specified indices in each row to totype'''
    for row in data:
        for idx in convert_indices:
            row[idx] = totype(row[idx])
Then just add:
convert_data(filtered_data)
immediately after the line filtered_data = filter_data(user_input, data).
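To see what the conversion buys you for the minimums and maximums (not just the formatting error), here is a small sketch with made-up concentration values; the list raw and the variable names are assumptions for illustration only. Strings are compared character by character, so the results differ from a numeric comparison.

```python
# Hypothetical concentration values, as str, the way csv.reader delivers them.
raw = ["9.5", "10.25", "100.0"]

# Lexicographic comparison: "1..." sorts before "9...".
str_min = min(raw)
str_max = max(raw)
print(str_min, str_max)  # 10.25 9.5

# Numeric comparison after converting each value to float.
numbers = [float(x) for x in raw]
num_min = min(numbers)
num_max = max(numbers)
print(num_min, num_max)  # 9.5 100.0
```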
If you prefer not operating in-place, you can make a new list with something like this:
# Defaulted arguments match the type and indices your code uses
def convert_data(data, totype=float, convert_indices=frozenset({7, 8, 9, 11})):
    '''Returns new copy of data with specified indices in each row converted to totype'''
    convert_indices = frozenset(convert_indices)  # Optimize to reduce cost of
                                                  # checking if index should be converted
    return [[totype(x) if i in convert_indices else x for i, x in enumerate(row)]
            for row in data]
and make the inserted line:
filtered_data = convert_data(filtered_data)
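For completeness, option #3 from the list above could look roughly like the sketch below. It assumes a file whose text fields are all quoted and whose numeric fields are bare, which may well not match your actual CSV; the sample contents and variable names are invented for illustration.

```python
import csv
import io

# Hypothetical contents: every text field quoted, every numeric field unquoted.
quoted = io.StringIO('"apis","social",9.5\n"osmia","solitary",10.25\n')

# With QUOTE_NONNUMERIC the reader converts each unquoted field to float.
rows = list(csv.reader(quoted, quoting=csv.QUOTE_NONNUMERIC))
print(rows)

# If a text field is NOT quoted, the reader tries float() on it and fails.
unquoted = io.StringIO('apis,social,9.5\n')
failed = False
try:
    list(csv.reader(unquoted, quoting=csv.QUOTE_NONNUMERIC))
except ValueError:
    failed = True
print(failed)
```

This is why the flag only helps when the file was written with matching quoting; otherwise converting the fields yourself is the more dependable route.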
Upvotes: 1