Mechanician

Reputation: 545

NaN issue with pandas.read_csv

I am trying to read a data file with a header. The data file is attached and I am using the following code:

import pandas as pd
data=pd.read_csv('TestData.out', sep=' ', skiprows=1, header=None)

The issue is that my data file has 20 columns, but the variable data ends up with 32 columns. How can I resolve this? I am very new to Python and still learning.

Data_File

Upvotes: 0

Views: 70

Answers (2)

SunilG

Reputation: 345

Your data file is delimited by an inconsistent number of spaces, so you just have to skip the extra spaces that follow each delimiter. This simple code works:

data = pd.read_csv('TestData.out', sep=' ', skiprows=1, header=None, skipinitialspace=True)
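To see what skipinitialspace does without the real file, here is a minimal sketch on a made-up three-column sample. The sample text and the names sample and df are assumptions for illustration only, not the contents of TestData.out. header=None is passed so that the first data row is not consumed as column names.

```python
import io
import pandas as pd

# Hypothetical stand-in for TestData.out: one header line, then rows in which
# non-negative values are preceded by two spaces instead of one.
sample = "a b c\n1.0  2.0 -3.0\n-4.0  5.0 -6.0\n"

# skipinitialspace=True drops the spaces that directly follow a delimiter,
# so a run of two spaces is treated as a single separator.
df = pd.read_csv(io.StringIO(sample), sep=' ', skiprows=1,
                 header=None, skipinitialspace=True)

print(df.shape)  # (2, 3)
```

If the shape matches the number of columns you expect, the extra spaces were the cause.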

Upvotes: 0

NateB

Reputation: 509

Your text file has two spaces together in front of any value that does not have a minus sign. With sep=' ', pandas sees this as two delimiters with nothing (NaN) in between.

This will fix it:

data = pd.read_csv('TestData.out', sep=r'\s+', skiprows=1, header=None)

In this case sep is interpreted as a regex that matches "one or more whitespace characters" as the delimiter, so the call returns columns 0 through 19.
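The difference between the two separators can be reproduced on a small made-up sample. This is only a sketch: the sample text and the names sample, raw and fixed are assumptions for illustration, and the real file has 20 columns rather than 3.

```python
import io
import pandas as pd

# Hypothetical stand-in for TestData.out: one header line, then rows in which
# non-negative values are preceded by two spaces instead of one.
sample = "a b c\n1.0  2.0 -3.0\n-4.0  5.0 -6.0\n"

# A single-space separator turns every doubled space into an empty field,
# which pandas fills with NaN and counts as an extra column.
raw = pd.read_csv(io.StringIO(sample), sep=' ', skiprows=1, header=None)

# The regex separator collapses any run of whitespace into one delimiter.
fixed = pd.read_csv(io.StringIO(sample), sep=r'\s+', skiprows=1, header=None)

print(raw.shape)    # (2, 4), column 1 is all NaN
print(fixed.shape)  # (2, 3)
```

The raw string prefix (r'...') keeps Python from treating \s as a string escape.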

Upvotes: 1
