Reputation: 1479
I want to build a matrix from a table stored in a CSV file:
COEFFICIENT MATRIX
,0,1,2,3,4
0,0.00876623398408,0.525189723661,0.528495953628,0.94228622319,0.0379073884588
1,0.434693398364,0.77017930965,0.00847865052462,0.544319471939,0.858970329817
2,0.978091233581,0.900800004769,0.504567295427,0.65499490009,0.397203736755
3,0.671510258373,0.554713361673,0.377098128478,0.246977226206,0.535900353082
...
5000,0.791781572037,0.70262685963,0.218775600741,0.19802280762,0.68177855465
I'm using pandas to read the CSV and return a matrix. Instead of the expected shape of 5001 x 5, I get 5002 x 1.
How can I make pandas split the columns correctly on the commas, and not count the header (the line after the table title) as the first data row?
input = pd.read_csv(coeff_file, skiprows=0)
input_mat = input.as_matrix()
print input.shape
print type(input)
print input_mat.shape
print type(input_mat)
This prints:
(5002, 1)
<class 'pandas.core.frame.DataFrame'>
(5002, 1)
<type 'numpy.ndarray'>
Reputation: 862511
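The first line of the file (COEFFICIENT MATRIX) is a title, not the header, so pandas has to skip it. I think you need skiprows=1, skiprows=[0], or header=1 in read_csv, together with index_col=0 so that the first column becomes the index: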
df = pd.read_csv(coeff_file, skiprows=1, index_col=0)
print (df)
             0         1         2         3         4
0     0.008766  0.525190  0.528496  0.942286  0.037907
1     0.434693  0.770179  0.008479  0.544319  0.858970
2     0.978091  0.900800  0.504567  0.654995  0.397204
3     0.671510  0.554713  0.377098  0.246977  0.535900
5000  0.791782  0.702627  0.218776  0.198023  0.681779
df = pd.read_csv(coeff_file, header=1, index_col=0)
print (df)
             0         1         2         3         4
0     0.008766  0.525190  0.528496  0.942286  0.037907
1     0.434693  0.770179  0.008479  0.544319  0.858970
2     0.978091  0.900800  0.504567  0.654995  0.397204
3     0.671510  0.554713  0.377098  0.246977  0.535900
5000  0.791782  0.702627  0.218776  0.198023  0.681779
df = pd.read_csv(coeff_file, skiprows=[0], index_col=0)
print (df)
             0         1         2         3         4
0     0.008766  0.525190  0.528496  0.942286  0.037907
1     0.434693  0.770179  0.008479  0.544319  0.858970
2     0.978091  0.900800  0.504567  0.654995  0.397204
3     0.671510  0.554713  0.377098  0.246977  0.535900
5000  0.791782  0.702627  0.218776  0.198023  0.681779
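To go from the DataFrame to the matrix the question asks for, here is a minimal, self-contained sketch. It assumes Python 3 and a reasonably recent pandas, where to_numpy() is available (as_matrix() was deprecated and later removed). The sample string and the load_coefficients helper are made up for illustration; only the first three data rows of the question's table are used.

```python
import io

import pandas as pd

# Hypothetical in-memory sample with the same layout as the question's file:
# a title line, then the real header row, then the data rows.
sample = """COEFFICIENT MATRIX
,0,1,2,3,4
0,0.00876623398408,0.525189723661,0.528495953628,0.94228622319,0.0379073884588
1,0.434693398364,0.77017930965,0.00847865052462,0.544319471939,0.858970329817
2,0.978091233581,0.900800004769,0.504567295427,0.65499490009,0.397203736755
"""


def load_coefficients(source):
    """Skip the title line, use the first column as the index, return (DataFrame, ndarray)."""
    df = pd.read_csv(source, skiprows=1, index_col=0)
    return df, df.to_numpy()


df, mat = load_coefficients(io.StringIO(sample))
print(df.shape)   # rows x columns of the DataFrame
print(mat.shape)  # same shape as a NumPy array
```

With the real file, pass coeff_file instead of the StringIO object; the shape should then be 5001 x 5, since neither the title nor the header row is counted as data.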