Lew Leibowitz

Reputation: 55

How to configure Pandas to read .dat file

I can't seem to load this file with pd.read_csv or pd.read_table:

http://cdsarc.u-strasbg.fr/ftp/cats/J/A+A/594/A27/psz2.dat

I've tried:

psz2_2017 = pd.read_csv('http://cdsarc.u-strasbg.fr/ftp/cats/J/A+A/594/A27/psz2.dat', sep=';', header=None)

I also tried pd.read_table with and without sep='\t' (same result), and with sep='\s+' (which fails with "Error tokenizing data").

The result is a DataFrame with a single column:

0
0   1 PSZ2 G000.04+45.13 0.0405432 45.135175...
1   2 PSZ2 G000.13+78.04 0.1380577 78.042113...
2   3 PSZ2 G000.40-41.86 0.4029953 -41.860792...

Any suggestions?

The first line looks like this:

1 PSZ2 G000.04+45.13   0.0405432  45.1351750 229.1905120  -1.0172220  4.107310  6.75319 2 111 0     1 0 0.938825   5.481591  1.899500  20 RXC J1516.5-0056           0.119800  3.962411 0.393290 0.370242 J1516.5-0056 RMJ151653.9-010506.3

Upvotes: 2

Views: 4914

Answers (2)

Luis Miguel

Reputation: 5127

Your data has an irregular number of whitespace-separated fields per line. pandas tries to infer the column count from the first few lines and gets it wrong, so later lines with more fields than expected fail to tokenize.

This works:

psz2_2017 = pd.read_csv('http://cdsarc.u-strasbg.fr/ftp/cats/J/A+A/594/A27/psz2.dat', header=None, delim_whitespace=True, error_bad_lines=False)

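The key argument here is error_bad_lines=False, which makes pandas skip any line with too many fields instead of raising an error. Note that those lines are dropped from the result.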
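On newer pandas the same idea is spelled differently: error_bad_lines was deprecated in pandas 1.3 in favour of on_bad_lines and removed in 2.0, and sep=r'\s+' does the job of delim_whitespace=True. Here is a minimal sketch of the skipping behaviour, assuming pandas 1.3 or later. The three-row sample is made up for illustration and is not taken from psz2.dat.

```python
import io

import pandas as pd

# Hypothetical sample data: the second row carries one extra
# whitespace-separated token, so it has more fields than the first row.
sample = io.StringIO(
    "1 PSZ2 G000.04+45.13 0.0405432\n"
    "2 PSZ2 G000.13+78.04 extra 0.1380577\n"
    "3 PSZ2 G000.40-41.86 0.4029953\n"
)

# on_bad_lines="skip" silently drops rows with too many fields.
df = pd.read_csv(sample, sep=r"\s+", header=None, on_bad_lines="skip")

print(df.shape)        # rows kept, columns inferred
print(df[0].tolist())  # first column of the surviving rows
```

It is worth comparing len(df) with the number of lines in the file afterwards, since every skipped line is a catalogue entry that is missing from the DataFrame.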

Upvotes: 1

dubbbdan

Reputation: 2730

How about using pd.read_fwf()?

psz2_2017 = pd.read_fwf('http://cdsarc.u-strasbg.fr/ftp/cats/J/A+A/594/A27/psz2.dat', header=None)
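Without colspecs, read_fwf guesses the column boundaries from the first rows (100 by default, see infer_nrows), so fields that contain spaces or are often blank may be split or merged in unexpected ways. If you know the byte positions (CDS catalogues usually ship a ReadMe with a byte-by-byte description), you can pass them explicitly. A rough sketch follows; the sample rows, column positions and column names below are invented for illustration and are not the real psz2.dat layout.

```python
import io

import pandas as pd

# Hypothetical fixed-width sample: 4-char index, 18-char name,
# 10-char longitude, 16-char identifier (blank in the second row).
rows = [
    (1, "PSZ2 G000.04+45.13", 0.0405432, "RXC J1516.5-0056"),
    (2, "PSZ2 G000.13+78.04", 0.1380577, ""),
]
text = "".join(
    f"{idx:>4} {name:<18} {glon:>10.7f} {ident:<16}\n"
    for idx, name, glon, ident in rows
)

# Half-open (start, end) character positions for each column.
colspecs = [(0, 4), (5, 23), (24, 34), (35, 51)]
names = ["index", "name", "glon", "ident"]  # assumed names

df = pd.read_fwf(io.StringIO(text), colspecs=colspecs, names=names, header=None)

print(df)
```

With explicit colspecs the space inside the name stays in one column, and the blank identifier comes back as NaN rather than shifting the later fields.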

Upvotes: 1
