pm1359

Reputation: 632

Pandas read_csv doesn't parse CSV file correctly

I am trying to read a CSV file and convert it to a DataFrame, but all the columns are shown in the first row. Whether or not I pass a separator or delimiter, I am not able to split them. How should I change my code to get the correct result? Here is one line of the file:

1330-5235-5560-xxxxx,"Jan 1, 2017",12:35:13 AM PST,,Charge,,Smart Plan (Calling & Texting),com.xxx,1,unlimited_usca_tariff_and,astar-y3,US,NC,27288,USD,4.99,0.950333,EUR,9.49

Upvotes: 1

Views: 3297

Answers (1)

jezrael

Reputation: 862406

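You need to set quoting to QUOTE_NONE in read_csv, so that the " characters are treated as ordinary text, and then clean up the columns afterwards: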

import csv

import pandas as pd

df = pd.read_csv('sample.csv', quoting=csv.QUOTE_NONE)

# rejoin the two halves of the date, which was split at its comma
df['Transaction Date'] = df['Description'] + df['Transaction Date']
# the first field ended up in the index, so copy it back into a column
df['Description'] = df.index

# remove " from values
df['Description'] = df['Description'].str.strip('"')
df['Transaction Date'] = df['Transaction Date'].str.strip('"')
df['Amount (Merchant Currency)'] = (df['Amount (Merchant Currency)'].str.strip('"')
                                                                   .astype(float))

df = df.reset_index(drop=True)
print(df.head(1))


            Description Transaction Date Transaction Time  Tax Type  \
0  8330-5235-5560-88882       Jan 8 2084  82:35:83 AM PST       NaN   

  Transaction Type  Refund Type                    Product Title Product id  \
0           Charge          NaN  Smart Plan ( Calling & Texting)  com.fight   

   Product Type              Sku Id  Hardware Buyer Country Buyer State  \
0             8  unlimited_usca_and  astar-y3            US          NC   

  Buyer Postal Code Buyer Currency  Amount (Buyer Currency)  \
0             24288            USD                     9.99   

   Currency Conversion Rate Merchant Currency  Amount (Merchant Currency)  
0                   0.95028               EUR                        9.49  
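
To see why the columns need shuffling after the read, here is a small self-contained sketch of the same steps on an in-memory sample. The three-column layout and the shortened values are made up for illustration, and it assumes each data row in the real file is wrapped in one extra pair of double quotes, which is only a guess based on the quote-stripping above.

```python
import csv
import io

import pandas as pd

# Hypothetical 3-column stand-in for the real file. Assumption: every data
# row is wrapped in one extra pair of double quotes.
raw = (
    'Description,Transaction Date,Amount\n'
    '"1330-xxxxx,"Jan 1, 2017",9.49"\n'
)

# QUOTE_NONE treats " as an ordinary character, so the row splits at every
# comma into 4 fields. The header has only 3 names, so pandas uses the
# first field as the index and the remaining fields fill the named columns.
df = pd.read_csv(io.StringIO(raw), quoting=csv.QUOTE_NONE)

# Rejoin the two halves of the date, then restore the first field from the index.
df['Transaction Date'] = df['Description'] + df['Transaction Date']
df['Description'] = df.index

# Drop the leftover quote characters and convert the amount to a number.
for col in ['Description', 'Transaction Date', 'Amount']:
    df[col] = df[col].str.strip('"')
df['Amount'] = df['Amount'].astype(float)

df = df.reset_index(drop=True)
print(df)
```

Note that the comma inside the date is lost when the two halves are rejoined, so the value comes out as Jan 1 2017 rather than Jan 1, 2017.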

Upvotes: 3
