João Matos

Reputation: 6950

Python pandas to_csv causes OSError: [Errno 22] Invalid argument

My code is the following:

import pandas as pd
import numpy as np

df = pd.read_csv("path/to/my/infile.csv")
df = df.sort_values(['distance', 'time'])
df.to_csv("path/to/my/outfile.csv")

This code successfully reads infile.csv, a 3 GB CSV file, sorts it, and then fails when trying to write to outfile.csv with the following error:

OSError                                   Traceback (most recent call last)
<ipython-input-10-3a5c8279658d> in <module>
----> 1 df.to_csv('/Users/joaomatos/Desktop/cluster22_sorted_training.csv',index=False)

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal)
   1743                                  doublequote=doublequote,
   1744                                  escapechar=escapechar, decimal=decimal)
-> 1745         formatter.save()
   1746 
   1747         if path_or_buf is None:

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/formats/csvs.py in save(self)
    164                                          encoding=encoding,
    165                                          compression=self.compression)
--> 166                 f.write(buf)
    167                 f.close()
    168                 for _fh in handles:

OSError: [Errno 22] Invalid argument

My question is why?

Upvotes: 14

Views: 39729

Answers (8)

Dinesh vishe

Reputation: 3598

import pandas as pd
print(pd.__file__)  # show the directory where pandas is installed
loan_data = pd.read_csv(r"D:\Lib\site-packages\pandas\email-password.csv")
loan_data


Upvotes: 0

Everton Silva

Reputation: 1

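Just remove any existing file in "path/to/my/", or try:

import os

# Write only if the target file does not already exist
files_present = os.path.isfile(filename)
if not files_present:
    df.to_csv(filename)
else:
    print('WARNING: This file already exists!')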

Upvotes: 0

Alex Pasquali

Reputation: 78

One of the answers suggests replacing \ with /. This works for me as well, but it can be tedious. An alternative is to put an r in front of the string (like this: r"my\string"), which tells the interpreter to treat backslashes (\) as literal characters instead of combining them with the next character to form an escape sequence (e.g., \n, \t). Note that a raw string cannot end with a single backslash.
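A minimal sketch of the difference, using a made-up Windows path (the folder and file names are hypothetical and only there for illustration):

```python
from pathlib import PureWindowsPath

# Hypothetical path, used only for illustration.
plain = "C:\new\table.csv"      # \n and \t are turned into a newline and a tab
raw = r"C:\new\table.csv"       # raw string: backslashes are kept literally
forward = "C:/new/table.csv"    # forward slashes need no escaping

print(repr(plain))
print(repr(raw))

# pathlib treats the raw-string and forward-slash spellings as the same Windows path.
same_path = PureWindowsPath(raw) == PureWindowsPath(forward)
print(same_path)
```

The first string silently contains a newline and a tab, so it no longer names the intended file, while the other two spellings refer to the same location.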

Upvotes: 1

David Ebert

Reputation: 111

I resolved this by removing the unusual characters (such as the colons) from the timestamp that made up part of the filename.

Not working:

file_timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
df.to_csv(file_name + '_' + file_timestamp + '.csv')

Working:

file_timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
df.to_csv(file_name + '_' + file_timestamp + '.csv')
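As a rough sketch of the same idea, the timestamp can be built in one small helper so that every output name avoids colons (which Windows does not allow in file names) and spaces. The helper name timestamped_csv_name and the "report" base name are hypothetical, not something from the code above:

```python
import datetime

def timestamped_csv_name(base_name, now=None):
    """Return base_name plus a timestamp that contains no ':' or spaces."""
    if now is None:
        now = datetime.datetime.now()
    return f"{base_name}_{now.strftime('%Y%m%d_%H%M%S')}.csv"

# Fixed datetime so the result is reproducible.
name = timestamped_csv_name("report", datetime.datetime(2021, 5, 4, 13, 7, 9))
print(name)  # report_20210504_130709.csv
```

The resulting string can then be passed to df.to_csv(...) in place of the concatenation shown above.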

Upvotes: 1

Vivian Ge

Reputation: 159

I just had a similar issue. I was using backslashes (\), which usually worked in the past, but this time it turned out I had to use forward slashes (/) instead. It is extremely weird, but it worked.

Upvotes: 15

kjohnsen

Reputation: 370

In my case (working on an external hard drive) it worked once I specified the absolute, rather than the relative, path.
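For reference, a minimal sketch of turning a relative path into an absolute one before handing it to to_csv. The helper name to_absolute and the "output/outfile.csv" path are hypothetical:

```python
import os
from pathlib import Path

def to_absolute(path_str):
    """Expand '~' and make the path absolute relative to the current working directory."""
    return Path(os.path.abspath(os.path.expanduser(path_str)))

out_path = to_absolute("output/outfile.csv")  # hypothetical relative path
print(out_path)
print(out_path.is_absolute())
```

Printing the resolved path also makes it easy to see which directory a relative path actually points to, which may differ from what you expect when the script runs from another working directory.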

Upvotes: 1

Nicolás M.

Reputation: 311

After exploring a lot of options, including updating pandas to the latest version (1.2.4 as of today), changing the engine to "python" or "c", debugging, etc., I finally discovered what the issue was:

I had my CSV files stored in a folder that was constantly being synchronized in real-time with OneDrive.

YES! I noticed that the tray icon was going crazy and that OneDrive was consuming resources while I was running algorithmic trading backtests for my pet project. I paused sync and it never failed again!!

I guess you can also exclude the folder from OneDrive or simply change the location where the CSVs are stored/written/accessed.

Upvotes: 21

João Matos

Reputation: 6950

Apparently this problem is caused by a known bug, reported here, in an earlier version of pandas. All I had to do was run pip3 install --upgrade pandas and then restart the computer.

Upvotes: 4
