user1234440

Reputation: 23567

Why doesn't read_fwf() output the correct contents of the file?

This is the content of the file (named sample.txt):

gvkeyx        from        thru    conm                gvkey     co_conm                      co_tic
123453    19661214    19890426    S&P 500 Comp-Ltd    010490    TEXAS EASTERN CORP           PEL4    
123453    19670101           .    S&P 500 Comp-Ltd    001078    ABBOTT LABORATORIES          ABT     
123453    19670101           .    S&P 500 Comp-Ltd    001300    HONEYWELL INTERNATIONAL INC  HON     
123453    19670101           .    S&P 500 Comp-Ltd    001356    ALCOA INC                    AA      
123453    19670101           .    S&P 500 Comp-Ltd    001408    FORTUNE BRANDS INC           FO 

The code I entered to read it:

In [16]: colspecs = [(0, 9), (10, 21), (22, 33), (34, 53), (54, 63), (64, 92), (93, 99)]

In [17]: df = read_fwf('sample.txt', colspecs = colspecs, header=None, index_col=None)

In [18]: df[:2]

Out[18]:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data Columns:
X.1    2    non-null values
X.2    2    non-null values
X.3    2    non-null values
X.4    2    non-null values
X.5    2    non-null values
X.6    2    non-null values
X.7    2    non-null values
dtypes: object(7)

I am having trouble understanding this output, as it's entirely different from the file. Any comments and advice would help. Thanks.

Upvotes: 0

Views: 3781

Answers (1)

Wes McKinney

Reputation: 105571

See: http://pandas.pydata.org/pandas-docs/stable/dsintro.html#console-display

It prints a summary because the data is too wide for your terminal; the data itself was read in. This can be configured with pandas.set_printoptions. You almost certainly want header=0 instead of header=None, since the first line of the file holds the column names. That is the default, I believe, so df = read_fwf('sample.txt', colspecs=colspecs) should be sufficient.
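
As a rough illustration of the two points above, here is a minimal sketch. It builds a small in-memory stand-in for sample.txt instead of reading the real file, and the to_fixed_width helper is hypothetical, only there to lay the fields out at the offsets given by the question's colspecs. Passing dtype=str is my own assumption, made so that values such as 010490 keep their leading zeros. Printing with to_string() is one way to see the full frame regardless of terminal width; as far as I know, newer pandas versions replaced set_printoptions with pandas.set_option (e.g. the display.width option).

```python
import io

import pandas as pd

# Column boundaries from the question: half-open [start, end) character offsets.
colspecs = [(0, 9), (10, 21), (22, 33), (34, 53), (54, 63), (64, 92), (93, 99)]

# First row is the header line, followed by two data rows from the question.
rows = [
    ("gvkeyx", "from", "thru", "conm", "gvkey", "co_conm", "co_tic"),
    ("123453", "19661214", "19890426", "S&P 500 Comp-Ltd", "010490",
     "TEXAS EASTERN CORP", "PEL4"),
    ("123453", "19670101", ".", "S&P 500 Comp-Ltd", "001078",
     "ABBOTT LABORATORIES", "ABT"),
]


def to_fixed_width(row):
    """Hypothetical helper: put each field at the start offset of its colspec."""
    line = ""
    for value, (start, _end) in zip(row, colspecs):
        line = line.ljust(start) + value
    return line


# In-memory stand-in for sample.txt (an assumption, not the asker's actual file).
sample = "\n".join(to_fixed_width(r) for r in rows) + "\n"

# header=0 takes the column names from the first line; dtype=str keeps
# identifiers such as "010490" as text instead of converting them to integers.
df = pd.read_fwf(io.StringIO(sample), colspecs=colspecs, header=0, dtype=str)

print(df.columns.tolist())
# to_string() renders every row and column instead of the summary view.
print(df.to_string())
```

With the real file, the same call would be pd.read_fwf('sample.txt', colspecs=colspecs, dtype=str), since header=0 is what you get when header is left unspecified.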

Upvotes: 3
