How do I load a text file into a pandas dataframe?

Question

I have a text file which looks something like this:

 101   the   323
 103   to    324
 104   is    325

where the delimiter is four spaces. I am trying to use the read_csv function to convert it into a pandas DataFrame.

data= pd.read_csv('file.txt', sep=" ", header = None)

However, it gives me a lot of NaN values:

    101	the	the	10115  NaN  NaN     NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
     102	to	to	5491  NaN  NaN     NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
     103	of	of	4767  NaN  NaN     NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
       104	a	a	4532  NaN  NaN     NaN  NaN  NaN  NaN  NaN  NaN  NaN  Na

Is there any way I can read the text file in a correct CSV format?

jezrael · Accepted Answer

If the separator needs to be exactly four whitespace characters:

data = pd.read_csv('file.txt', sep=r"\s{4}", header=None, engine='python')
print (data)
     0    1    2
0  101  the  323
1  103   to  324
2  104   is  325
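
To see why the original call produced NaN columns, here is a minimal sketch that reads the same kind of data from memory instead of from disk. The `sample` string is a hypothetical stand-in for file.txt and assumes exactly four spaces between fields; `naive` and `fixed` are illustrative names only.

```python
import io

import pandas as pd

# Hypothetical stand-in for file.txt: fields separated by exactly four spaces.
sample = "101    the    323\n103    to    324\n104    is    325\n"

# With sep=" " every single space is a field boundary, so each run of
# four spaces yields empty fields that pandas reads as NaN columns.
naive = pd.read_csv(io.StringIO(sample), sep=" ", header=None)
print(naive.shape)

# A regex separator matching four whitespace characters keeps only the
# three real columns (regex separators need the python engine).
fixed = pd.read_csv(io.StringIO(sample), sep=r"\s{4}", header=None, engine="python")
print(fixed)
```

The raw-string prefix (`r"..."`) keeps Python from treating `\s` as a string escape sequence.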

Or use the parameter delim_whitespace=True (thanks carthurs), or \s+ if the separator is one or more whitespace characters. Note that delim_whitespace is deprecated as of pandas 2.2 in favor of sep=r"\s+":

data = pd.read_csv('file.txt', sep=r"\s+", header=None)
data = pd.read_csv('file.txt', delim_whitespace=True, header=None)
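
As a quick check of the one-or-more-whitespace variant, the sketch below parses an in-memory sample whose columns are separated by uneven runs of spaces. The `sample` string is an assumption made up for illustration, not the asker's actual file.

```python
import io

import pandas as pd

# Hypothetical sample: the gaps between fields vary from three to four spaces.
sample = "101   the   323\n103   to    324\n104   is    325\n"

# r"\s+" treats any run of whitespace as a single separator,
# so the uneven spacing still yields three columns.
data = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None)
print(data)
```

This is usually the most forgiving choice when the exact number of spaces is not guaranteed.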

But if the separator is a tab:

data = pd.read_csv('file.txt', sep="\t", header=None)
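
For the tab case, a small self-contained sketch follows. It assumes the file is tab-delimited, which the NaN output in the question hints at but does not confirm, and the `sample` string is made up for illustration.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample standing in for file.txt.
sample = "101\tthe\t323\n103\tto\t324\n104\tis\t325\n"

# Writing the tab as the escape "\t" avoids pasting an invisible
# literal tab character into the source code.
data = pd.read_csv(io.StringIO(sample), sep="\t", header=None)
print(data)
```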
