Ying G.
Ying G.

Reputation: 31

Read tab-delimited fields with pandas when some lines have more than one tab

I am trying to read a tab-separated text file using pandas. The file looks like this:

data file sample

14.38   14.21   0.8951  5.386   3.312   2.462   4.956   1
14.69   14.49   0.8799  5.563   3.259   3.586   5.219   1
14.11   14.12   0.8911  5.422   3.302   2.723           5       1

Some lines have extra tabs. If I use read_csv or read_fwf and specify sep='\t'. I get results that look like this:

0   15.26\t14.84\t0.871\t5.763\t3.312\t2.221\t5.22\t1
1   14.88\t14.57\t0.8811\t5.554\t3.333\t1.018\t4.9

Do you have any suggestions as to what parameters I could specify to deal with this problem?

Solution:

use pd.read_csv(filename, delim_whitespace=True)

Upvotes: 3

Views: 6797

Answers (2)

Vaishali
Vaishali

Reputation: 38415

Pandas read_csv is very versatile, you can use it with delim_whitespace = True to handle variable number of whitespaces.

df = pd.read_csv(filename, delim_whitespace=True)

Option 2: Use separator argument

df = pd.read_csv(filename, sep='\t+')

Upvotes: 7

Rorschach
Rorschach

Reputation: 3802

If I use this code:

import pandas as pd
parsed_csv_txt = pd.read_csv("tabbed.txt",sep="\t")
print(parsed_csv_txt)

On this file:

a   b   c   d   e
14.69   2452    982 234 12
14.11   5435    234     12
16.63   1       12  66

I get:

       a     b      c      d   e
0  14.69  2452  982.0  234.0  12
1  14.11  5435  234.0    NaN  12
2  16.63     1    NaN   12.0  66

Are there any issues with the output that we see here?

If you would like a different output such as:

       a     b    c    d     e
0  14.69  2452  982  234  12.0
1  14.11  5435  234   12   NaN
2  16.63     1   12   66   NaN

Use this code:

import pandas as pd
parsed_csv_txt = pd.read_csv("tabbed.txt",delim_whitespace=True)
print(parsed_csv_txt)

Note

For a longer discussion around the topic of variable amounts of whitespace between values check out this discussion: Can pandas handle variable-length whitespace as column delimiters

Upvotes: 0

Related Questions