Import csv using genfromtxt() and converters

Question

import numpy as np
import csv
filename = "a.csv"

def convert(s): 
    s = s.strip().replace(',', '.')
    return str(s)

salary_data = np.genfromtxt(filename,
                     delimiter= ',',
                     dtype=[('year','i8'),('university','U50'),('school','U250'), 
                     ('degree','U250'),('employement_rate_overall','f8'), 
                     ('basic_monthly_mean','f8'),('gross_monthly_mean','i8'), 
                     ('gross_monthly_median','i8'),('gross_mthly_25_percentile','i8'), 
                     ('gross_mthly_75_percentile','i8')], 
                     encoding=None,  # avoid the deprecation warning
                     skip_header=1,
                     missing_values=['na','-'],filling_values=[0],
                     converters={2: convert} ,
                     comments=None)
print(salary_data)

I am trying to load the CSV data, but the data is quite dirty: some of the value fields contain quotation marks and commas, which causes an error.

      Some errors were detected!
      Line #5 (got 13 columns instead of 12)

I tried to clean up the commas by using converters, but the code doesn't seem to work. I also tried:

      converters={2: lambda s: str(s.replace(',', '.'))}

This does not work for my case either. I would like to know what my mistake is, and thanks for helping! Thank you also to those who spotted my mistake. Even when I try to replace the quotation marks, the code still does not work. The text below is the CSV file that I am loading.

      year,university,school,degree,employment_rate_overall,employment_rate_ft_perm,basic_monthly_mean,basic_monthly_median,gross_monthly_mean,gross_monthly_median,gross_mthly_25_percentile,gross_mthly_75_percentile
      2013,Nanyang Technological University,College of Business (Nanyang Business School),Accountancy and Business,97.4,96.1,3701,3200,3727,3350,2900,4000
      2013,Nanyang Technological University,College of Business (Nanyang Business School),Accountancy (3-yr direct Honours Programme),97.1,95.7,2850,2700,2938,2700,2700,2900
      2013,Nanyang Technological University,College of Business (Nanyang Business School),Business (3-yr direct Honours Programme),90.9,85.7,3053,3000,3214,3000,2700,3500
      2013,Nanyang Technological University,"College of Humanities, Arts & Social Sciences",Economics,89.9,83.5,3085,3000,3148,3000,2800,3545
      2013,Nanyang Technological University,College of Sciences,Biomedical Sciences **,na,na,na,na,na,na,na,na
      2013,Nanyang Technological University,College of Sciences,Biomedical Sciences (Traditional Chinese Medicine) #,90.7,88.4,2840,2800,2883,2807,2700,3000
      2013,Nanyang Technological University,College of Sciences,Mathematics & Economics **,na,na,na,na,na,na,na,na
      2014,Nanyang Technological University,"College of Humanities, Arts & Social Sciences","Art, Design & Media",80,68,2761,2600,2791,2700,2300,3000
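
One possible way around the error, sketched under some assumptions: genfromtxt() splits each line on the delimiter before any converter runs, and that split does not understand quotation marks, so a quoted field such as "College of Humanities, Arts & Social Sciences" has already been cut in two by the time converters={2: ...} is applied. The standard csv module does honour quotes, so the rows can be parsed with csv.reader first and packed into a structured array afterwards. In the sketch below, SAMPLE, MISSING, to_float and load_salary_rows are illustrative names that do not come from the question, the dtype is extended to all 12 columns of the header (the dtype in the question lists only 10), every numeric column is stored as f8, and missing values become 0.0 to mirror filling_values=[0].

```python
import csv
import io

import numpy as np

# A few rows copied from the question (wrapped lines joined).
# A real run would read "a.csv" instead of this in-memory sample.
SAMPLE = '''year,university,school,degree,employment_rate_overall,employment_rate_ft_perm,basic_monthly_mean,basic_monthly_median,gross_monthly_mean,gross_monthly_median,gross_mthly_25_percentile,gross_mthly_75_percentile
2013,Nanyang Technological University,College of Business (Nanyang Business School),Accountancy and Business,97.4,96.1,3701,3200,3727,3350,2900,4000
2013,Nanyang Technological University,"College of Humanities, Arts & Social Sciences",Economics,89.9,83.5,3085,3000,3148,3000,2800,3545
2013,Nanyang Technological University,College of Sciences,Biomedical Sciences **,na,na,na,na,na,na,na,na
2014,Nanyang Technological University,"College of Humanities, Arts & Social Sciences","Art, Design & Media",80,68,2761,2600,2791,2700,2300,3000
'''

# One dtype entry per CSV column: 1 integer, 3 text, 8 numeric.
NUMBER_FIELDS = [
    "employment_rate_overall", "employment_rate_ft_perm",
    "basic_monthly_mean", "basic_monthly_median",
    "gross_monthly_mean", "gross_monthly_median",
    "gross_mthly_25_percentile", "gross_mthly_75_percentile",
]
DTYPE = (
    [("year", "i8"), ("university", "U50"), ("school", "U250"), ("degree", "U250")]
    + [(name, "f8") for name in NUMBER_FIELDS]
)

# Markers treated as missing (assumption: same as missing_values in the question).
MISSING = {"na", "-", ""}


def to_float(text, fill=0.0):
    """Convert one cell to float, substituting `fill` for missing markers."""
    text = text.strip()
    return fill if text in MISSING else float(text)


def load_salary_rows(handle):
    """Parse quote-aware CSV rows from an open text handle into a structured array."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    records = []
    for row in reader:
        if not row:  # ignore blank lines
            continue
        year, university, school, degree, *numbers = row
        records.append(
            (int(year), university.strip(), school.strip(), degree.strip(),
             *map(to_float, numbers))
        )
    return np.array(records, dtype=DTYPE)


# Why the converter never gets a chance: a plain split on "," sees an extra column.
quoted_line = SAMPLE.splitlines()[2]
print(len(quoted_line.split(",")))           # 13
print(len(next(csv.reader([quoted_line]))))  # 12

salary_data = load_salary_rows(io.StringIO(SAMPLE))
print(salary_data["school"])
```

To read the actual file, the same function could be called as load_salary_rows(open(filename, newline="")), ideally inside a with block; the file's encoding is not stated in the question, so it may need to be passed explicitly.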
