Frank

Reputation: 745

Python Pandas CSV "TypeError: <lambda>() takes exactly 1 argument (5 given)"

I have a CSV file that is formatted like this:

"Year","Month","Day","Hour","Minute","Direct","Diffuse","D_Global","D_IR","U_Global","U_IR","Zenith"
2001,3,1,0,1,0.28,84.53,83.53,224.93,76.67,228.31,80.031
2001,3,1,0,2,0.15,84.24,83.25,224.76,76.54,228.62,80.059
2001,3,1,0,3,0.16,84.63,83.43,225.62,76.76,229.06,80.087
2001,3,1,0,4,0.20,85.20,83.99,226.56,77.15,228.96,80.115

And my script is:

df1 = pd.read_csv(input_file,
        sep = ",",
        parse_dates = {'Date': [0,1,2,3,4]},
        date_parser = lambda x: pd.to_datetime(x, format="%Y %m %d %H %M"),
        index_col = ['Date'])

The error I get is:

Traceback (most recent call last):
  File "convertCSVtoNC.py", line 70, in <module>
    openFile(sys.argv[1:])
  File "convertCSVtoNC.py", line 30, in openFile
    df2 = createDataFrame(input_file, counter)
  File "convertCSVtoNC.py", line 43, in createDataFrame
    index_col = ['Date'])
...
TypeError: <lambda>() takes exactly 1 argument (5 given)

The script runs fine for the 302 previous inputs. An example of one that works is formatted like this:

"Year","Month","Day","Hour","Minute","Direct","Diffuse","D_Global","D_IR","U_Global","U_IR","Zenith"
1976,1,1,0,3,-999.00,-999.00,-999.00,-999.00,-999.00,-999.00,95.751
1976,1,1,0,6,-999.00,-999.00,-999.00,-999.00,-999.00,-999.00,95.839
1976,1,1,0,9,-999.00,-999.00,-999.00,-999.00,-999.00,-999.00,95.930
1976,1,1,0,12,-999.00,-999.00,-999.00,-999.00,-999.00,-999.00,96.023

Any ideas why?

Upvotes: 0

Views: 745

Answers (2)

Frank

Reputation: 745

It turns out there were a few stray newline characters at the end of my input CSV file. I suppose this makes sense, as my lambda function was taking entire lines as input. This may be something to look for in future lambda-related questions.
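One way to look for this kind of problem before handing a file to pandas is to scan it for lines that are blank or have the wrong number of fields. The following is only a minimal sketch using the standard library; the function name `find_suspect_lines` and the `expected_fields=12` default (taken from the header shown in the question) are assumptions for illustration, not part of the original script.

```python
import csv
import io


def find_suspect_lines(text, expected_fields=12):
    """Return (line_number, raw_line) pairs for lines that are blank,
    whitespace-only, or do not have the expected number of fields.

    Hypothetical helper: the name and the default field count are
    assumptions based on the 12-column header in the question.
    """
    suspects = []
    for line_number, raw_line in enumerate(io.StringIO(text), start=1):
        stripped = raw_line.strip()
        if not stripped:
            # Blank or whitespace-only line, e.g. stray newlines at EOF.
            suspects.append((line_number, raw_line))
            continue
        # Parse the single line as CSV so quoted fields are handled.
        fields = next(csv.reader([stripped]))
        if len(fields) != expected_fields:
            suspects.append((line_number, raw_line))
    return suspects


# Sample data modelled on the question, with two stray trailing lines.
sample = (
    '"Year","Month","Day","Hour","Minute","Direct","Diffuse",'
    '"D_Global","D_IR","U_Global","U_IR","Zenith"\n'
    "2001,3,1,0,1,0.28,84.53,83.53,224.93,76.67,228.31,80.031\n"
    "2001,3,1,0,2,0.15,84.24,83.25,224.76,76.54,228.62,80.059\n"
    "\n"
    " \n"
)

suspect_numbers = [number for number, _ in find_suspect_lines(sample)]
print(suspect_numbers)
```

For a real file, the text could be read with `open(input_file).read()` and passed to the same function; any line numbers it reports are candidates to clean up before calling `pd.read_csv`.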

Upvotes: 0

MaxU - stand with Ukraine

Reputation: 210852

It works fine for me:

df1 = pd.read_csv(file_name, parse_dates={'Date':[0,1,2,3,4]},
                  date_parser=lambda x: pd.to_datetime(x, format='%Y %m %d %H %M'),
                  index_col=['Date'])


In [215]: df1
Out[215]:
                     Direct  Diffuse  D_Global    D_IR  U_Global    U_IR  Zenith
Date
2001-03-01 00:01:00    0.28    84.53     83.53  224.93     76.67  228.31  80.031
2001-03-01 00:02:00    0.15    84.24     83.25  224.76     76.54  228.62  80.059
2001-03-01 00:03:00    0.16    84.63     83.43  225.62     76.76  229.06  80.087
2001-03-01 00:04:00    0.20    85.20     83.99  226.56     77.15  228.96  80.115

PS: I'm using pandas 0.19.1.

Upvotes: 1
