Pandas read_csv adds unnecessary " " to each row

Question

I have a CSV file (only the first three rows are shown here):

HEIGHT,WEIGHT,AGE,GENDER,SMOKES,ALCOHOL,EXERCISE,TRT,PULSE1,PULSE2,YEAR
173,57,18,2,2,1,2,2,86,88,93
179,58,19,2,2,1,2,1,82,150,93

I am using pandas read_csv to read the file and split the values into columns.

Here is my code:

import pandas as pd
import os
path='~/Desktop/pulse.csv'

path=os.path.expanduser(path)
my_data=pd.read_csv(path, index_col=False, header=None, quoting = 3, delimiter=',')
print my_data

The problem is the first and last columns have " before and after the values.

Additionally, I can't get rid of the index.

I might be making some silly mistake, but thank you in advance for your help.

jezrael · Accepted Answer

Final solution: use replace to remove the " from the values and convert them to ints, and use strip to remove the " from the column names:

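# quoting=3 is csv.QUOTE_NONE: quote characters are kept as ordinary text
df = pd.read_csv('pulse.csv', quoting=3)

# Remove the leftover quotes from the values, then convert everything to int
df = df.replace('"','', regex=True).astype(int)
# Remove the leftover quotes from the column names
df.columns = df.columns.str.strip('"')
print(df.head())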

   HEIGHT  WEIGHT  AGE  GENDER  SMOKES  ALCOHOL  EXERCISE  TRT  PULSE1  \
0     173      57   18       2       2        1         2    2      86   
1     179      58   19       2       2        1         2    1      82   
2     167      62   18       2       2        1         1    1      96   
3     195      84   18       1       2        1         1    2      71   
4     173      64   18       2       2        1         3    2      90   

   PULSE2  YEAR  
0      88    93  
1     150    93  
2     176    93  
3      73    93  
4      88    93
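
The question does not show the raw contents of the file, so the following is only a sketch under an assumption: every line of the file is wrapped in one pair of double quotes, which would produce the stray " on the first and last columns. The in-memory string `raw` is a hypothetical stand-in for pulse.csv so the fix can be tried without the file.

```python
import csv
import io

import pandas as pd

# ASSUMPTION: each line of the real file is wrapped in double quotes.
raw = (
    '"HEIGHT,WEIGHT,AGE,GENDER,SMOKES,ALCOHOL,EXERCISE,TRT,PULSE1,PULSE2,YEAR"\n'
    '"173,57,18,2,2,1,2,2,86,88,93"\n'
    '"179,58,19,2,2,1,2,1,82,150,93"\n'
)

# csv.QUOTE_NONE equals 3: quotes are treated as ordinary characters,
# so the commas between them still split each line into fields.
df = pd.read_csv(io.StringIO(raw), quoting=csv.QUOTE_NONE)

# Only the first and last columns carry a stray quote at this point.
df = df.replace('"', '', regex=True).astype(int)
df.columns = df.columns.str.strip('"')

print(df.dtypes)
```

If the real file is laid out differently, the replace and strip steps are harmless but may not be the whole fix.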

index_col=False forces pandas not to use the first column as the index, but a DataFrame always needs some index, so a default one (0, 1, 2, ...) is added. The parameter can therefore be omitted here.
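
As a small illustration of the point about the index, the sketch below reads a shortened, hypothetical copy of the data with and without index_col=False and compares the results. The call to to_string(index=False) is an addition not mentioned in the answer: it does not remove the index, it only leaves it out of the printed text.

```python
import io

import pandas as pd

# Hypothetical shortened sample, used instead of the real pulse.csv.
raw = "HEIGHT,WEIGHT,AGE\n173,57,18\n179,58,19\n"

with_flag = pd.read_csv(io.StringIO(raw), index_col=False)
without_flag = pd.read_csv(io.StringIO(raw))

# Both frames receive the same default index: 0, 1, ...
print(with_flag.index.equals(without_flag.index))

# The index stays on the DataFrame but is omitted from the printed text.
print(without_flag.to_string(index=False))
```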

header=None should be removed because it prevents pandas from using the first row (the CSV header) as the column names of the DataFrame. The header row is then read as the first row of data, so the numeric values in those columns are converted to strings.

delimiter=',' should be removed too, because it is the same as sep=',', which is the default.
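
The effect of header=None and delimiter=',' can be checked on a shortened, hypothetical sample (not the real file). This is a minimal sketch of the behaviour described above, not a replacement for the answer's code.

```python
import io

import pandas as pd

# Hypothetical shortened sample, used instead of the real pulse.csv.
raw = "HEIGHT,WEIGHT,AGE\n173,57,18\n179,58,19\n"

# header=None: the header line is read as data, so each column mixes
# text with digits and its values are stored as strings.
no_header = pd.read_csv(io.StringIO(raw), header=None)

# Defaults: the first line becomes the column names and the values
# are parsed as integers.
default = pd.read_csv(io.StringIO(raw))

# delimiter=',' is an alias for sep=',', which is already the default.
explicit = pd.read_csv(io.StringIO(raw), delimiter=',')

print(no_header.shape, default.shape)
print(default.equals(explicit))
```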
