Reputation: 16992
In the below example contents of the file seems to become empty after performing copy_expert command and pd.read_csv is throwing an error stating it is an empty file. The file seems to be empty even when I performing this operation in reverse order (read_csv before copy_expert). Why is file empty and how can I handle this scenario?
file = //download file from S3
copy_sql = """
COPY %s FROM stdin WITH CSV HEADER
DELIMITER as ','
"""
cursor = self.connection.cursor()
cursor.copy_expert(sql=copy_sql % table_name, file=file)
cursor.close()
df = pd.read_csv(file, dtype={// dtype value})
EDIT
I was able to solve it by performing following , however it would be helpful to understand why the file object is emptied and also if there is a more efficient way of doing this.
file = //download file from S3
file_clone = copy.deepcopy(file)
copy_sql = """
COPY %s FROM stdin WITH CSV HEADER
DELIMITER as ','
"""
cursor = self.connection.cursor()
cursor.copy_expert(sql=copy_sql % table_name, file=file)
cursor.close()
df = pd.read_csv(file_clone, dtype={// dtype value})
Upvotes: 1
Views: 189
Reputation: 2789
The issue is that after first reading the file data, the file pointer
will be at the end of the file; the subsequent read call will return no data.
In order to be able to read the data again you need to move the pointer to the beginning of the file:
cursor.copy_expert(sql=copy_sql % table_name, file=file)
file.seek(0)
df = pd.read_csv(file, dtype={// dtype value})
Upvotes: 1