Arthur D. Howland
Arthur D. Howland

Reputation: 4557

Python 3 Pandas write to CSV format column as string

having an issue with handling oddly formatted excel data and writing to CSV in a string format. In my sample data, the excel table I am importing has a column ('Item_Number') and the odd data in the cell looks like: ="0001", ="00201", 2002AA, 1003B.

When I try to output to csv, the results look like: 1, 201, 2002AA, 1003B. When I try to output to excel, the results are correct: 0001, 00201, 2002AA, 1003B.

All of the dtypes are objects. Am I missing a parameter in my .to_csv() command?

df = pd.read_excel(filename,sheetname='Sheet1', converters= {'Item_Number':str})
df.to_csv('Test_csv.csv')
df.to_excel('Test_excel.xlsx')

Tried different iterations of replacing the "=" and " " " but no response.

df.Item_Number.str.replace('=','')

Currently using the excel output but curious if there is a way to preserve string formatting in CSV. Thanks :)

Upvotes: 3

Views: 8289

Answers (1)

Arthur D. Howland
Arthur D. Howland

Reputation: 4557

Opening an excel spreadsheet with Python 3 Pandas that has data that looks like ="0001" will go to the dataframe correctly. CSV will turn it back to "1". Keeping the same format to CSV is apparently a known issue (from my comment above). To keep the formatting I have to add =" " back into the data like this:

df['Item_Number'] = '="' + df['Item_Number'] + '"'

Not sure if there is a cleaner version to that will have an Excel opened CSV file show 0001 without the quotes and equals sign.

Upvotes: 3

Related Questions