Deo

Reputation: 11

Code reports :.2f as invalid for a str; not sure where I'm missing the float conversion

I'm working on a Python script that reads a CSV file and, depending on user input, prints the matching information from that file. I've been debugging it slowly but surely, but I'm now stuck (most likely out of frustration) and would love some help catching what's wrong. I've been trying to find where I should be converting to float; if you can see it, I'd appreciate the help.

My code:

import csv
import sys

from csv import reader

def read_csv(file):
    with open(file, 'r') as read_obj:
        csv_reader = list(csv.reader(read_obj, delimiter= ','))
        return csv_reader
  
def filter_data(user_input, data):
    filtered_data = []
    for row in data:
        if (row[0].lower() == user_input.lower()):
            filtered_data.append(row)
        elif (row[4].lower() == user_input.lower()):
            filtered_data.append(row)
    return filtered_data

def calc_averages(filtered_data):
    lst = [0, 0, 0]
    for row in filtered_data:
        lst[0] += float(row[7])
        lst[1] += float(row[8])
        lst[2] += float(row[9])
    lst[0] = (lst[0]/len(filtered_data))
    lst[1] = (lst[1]/len(filtered_data))
    lst[2] = (lst[2]/len(filtered_data))
    return lst

def calc_minimums(filtered_data):
    min_concentration = filtered_data[0][8]
    min_longevity = filtered_data[0][9]
    min_paralyzed = filtered_data[0][11]
    for line in filtered_data:
        if line[8] < min_concentration:
            min_concentration = line[8]
        if line[9] < min_longevity:
            min_longevity = line[9]
        if line[11] < min_paralyzed:
            min_paralyzed = line[11]
    return [min_concentration, min_longevity, min_paralyzed]

def calc_maximums(filtered_data):
    max_concentration = filtered_data[0][8]
    max_longevity = filtered_data[0][9]
    max_paralyzed = filtered_data[0][11]
    for line in filtered_data:
        if line[8] > max_concentration:
            max_concentration = line[8]
        if line[9] > max_longevity:
            max_longevity = line[9]
        if line[11] > max_paralyzed:
            max_paralyzed = line[11]
    return [max_concentration, max_longevity, max_paralyzed]

def print_stats(user_input, stat_type, stats):
    print (f'{stat_type}s for {user_input} bees:')
    print (f'{stat_type}s for {user_input} bees:')
    print (f'{stat_type} Imidacloprid Concentration: {stats[0]:.2f}')
    print (f'{stat_type} Days Paralyzed: {stats[2]:.2f}\n')

def run(data):
    user_input = input('Enter the species/genus or the sociality of bee you would like information about: ')
    filtered_data = filter_data(user_input, data)
    averages = calc_averages(filtered_data)
    minimums = calc_minimums(filtered_data)
    maximums = calc_maximums(filtered_data)
    print_stats(user_input, 'Average', averages)
    print_stats(user_input, 'Minimum', minimums)
    print_stats(user_input, 'Maximum', maximums)
    more_data = input('Would you like to see more data? (Y/N) ')
    if more_data == 'Y' or more_data == 'y':
        return True
    else:
        return False

def main():
    if len(sys.argv) != 2:
        print('Invalid arguments given')
        return
    data = read_csv(sys.argv[1])
    while run(data):
        continue
    
if __name__ == '__main__':
    main()

The error with input "solitary \n n":

Traceback (most recent call last):
  File "main.py", line 121, in <module>
    main()
  File "main.py", line 117, in main
    while run(data):
  File "main.py", line 104, in run
    print_stats(user_input, 'Minimum', minimums)
  File "main.py", line 88, in print_stats
    print (f'{stat_type} Imidacloprid Concentration: {stats[0]:.2f}')
ValueError: Unknown format code 'f' for object of type 'str'

The output is currently:

Enter the species/genus or the sociality of bee you would like information about: Averages for solitary bees:
Averages for solitary bees:
Average Imidacloprid Concentration: 29.48
Average Days Paralyzed: 1.33

Minimums for solitary bees:
Minimums for solitary bees:

Upvotes: 1

Views: 87

Answers (1)

ShadowRanger

Reputation: 155458

Both calc_minimums and calc_maximums copy string data directly from the filtered_data argument without performing any type conversion. The data is still str because filter_data doesn't convert anything either, and since you aren't passing quoting=csv.QUOTE_NONNUMERIC to csv.reader, the reader doesn't convert anything on its own (every field in every row is a str).
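As a quick illustration of what the reader hands back, here is a minimal sketch. The two-column sample line is made up (your real file's columns aren't shown in the question), and io.StringIO just stands in for the open file:

```python
import csv
import io

# Hypothetical sample line: one quoted text field, one unquoted numeric field.
sample = '"solitary",29.48\n'

# Default reader: every field comes back as str.
plain_row = next(csv.reader(io.StringIO(sample)))

# With QUOTE_NONNUMERIC, the reader converts every unquoted field to float.
numeric_row = next(csv.reader(io.StringIO(sample), quoting=csv.QUOTE_NONNUMERIC))

print(plain_row)    # ['solitary', '29.48']
print(numeric_row)  # ['solitary', 29.48]
```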

It's possible you think you've performed the conversions because you called float on the fields in calc_averages, but that only read from the fields, parsed them, and returned a new float, leaving the original list unchanged (the values are still str).
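A small sketch of that point, using a made-up row rather than your data: calling float on a field produces a new value and leaves the row alone, and only assigning the result back changes what the row holds.

```python
# Hypothetical row, shaped like what csv.reader returns (all str).
row = ['solitary', '29.48']

total = 0.0
total += float(row[1])               # parses the text into a new float
type_after_sum = type(row[1])        # still str: the row was not modified

row[1] = float(row[1])               # assigning back is what changes the row
type_after_assignment = type(row[1]) # now float

print(type_after_sum, type_after_assignment)
```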

If you want this to work, do one of:

  1. Change calc_minimums and calc_maximums to perform the same type conversions that calc_averages does. This matters beyond the formatting error: right now your mins and maxes are based on str comparisons, not float comparisons, and lexicographic string ordering will give the wrong answer much of the time unless all your values have exactly the same number of digits.
  2. Apply a conversion step that converts the necessary fields to float ahead of time, before the data is used (this also saves repeated conversion work).
  3. (If the CSV format supports it) Pass quoting=csv.QUOTE_NONNUMERIC when creating your csv.reader. This assumes all non-numeric fields are in fact quoted, which is unlikely unless the file was created using the same flag or comes from an unusual CSV generator.
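To make the string-comparison problem from option 1 concrete, here is a short sketch with made-up values (not taken from your file). Strings compare character by character, so '10.25' sorts before '9.5' because '1' comes before '9':

```python
# Hypothetical field values, as str, the way csv.reader returns them.
values = ['9.5', '10.25', '100.0']

str_min = min(values)                 # lexicographic comparison
str_max = max(values)
float_min = min(values, key=float)    # numeric comparison
float_max = max(values, key=float)

print(str_min, str_max)      # 10.25 9.5
print(float_min, float_max)  # 9.5 100.0
```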

Assuming #3 isn't an option, the easiest solution (as well as fastest, and least error-prone) is #2; your existing functions don't have to change (the float conversions could even be removed from calc_averages). Just make a new function, e.g.:

# Defaulted arguments match the type and indices your code uses
def convert_data(data, totype=float, convert_indices=(7, 8, 9, 11)):
    '''Modifies data in-place, converting specified indices in each row to totype'''
    for row in data:
        for idx in convert_indices:
            row[idx] = totype(row[idx])

Then just add:

convert_data(filtered_data)

immediately after the line filtered_data = filter_data(user_input, data).

If you prefer not operating in-place, you can make a new list with something like this:

# Defaulted arguments match the type and indices your code uses
def convert_data(data, totype=float, convert_indices=frozenset({7, 8, 9, 11})):
    '''Returns new copy of data with specified indices in each row converted to totype'''
    convert_indices = frozenset(convert_indices)  # Optimize to reduce cost of
                                                  # checking if index should be converted
    return [[totype(x) if i in convert_indices else x for i, x in enumerate(row)]
            for row in data]

and make the inserted line:

filtered_data = convert_data(filtered_data)
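For a quick sanity check of the copying version, the sketch below repeats the function so it runs on its own and feeds it a single made-up 12-column row (the placeholder values are not from your file). The original row keeps its str fields, while the converted copy can be formatted with :.2f:

```python
def convert_data(data, totype=float, convert_indices=frozenset({7, 8, 9, 11})):
    '''Returns new copy of data with specified indices in each row converted to totype'''
    convert_indices = frozenset(convert_indices)
    return [[totype(x) if i in convert_indices else x for i, x in enumerate(row)]
            for row in data]

# Hypothetical row: only the positions of the numeric columns (7, 8, 9, 11)
# mirror the question's code; every value here is invented.
rows = [['Apis', 'a', 'b', 'c', 'social', 'd', 'e', '1.5', '29.484', '3', 'f', '2']]

converted = convert_data(rows)

original_field = rows[0][8]         # unchanged, still a str
converted_field = converted[0][8]   # a float in the new list
formatted = f'{converted_field:.2f}'

print(repr(original_field), repr(converted_field), formatted)
```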

Upvotes: 1
