WKK

Reputation: 179

Read tabular file in Python

I would like to read a data file in Python using the delimiter '\t'. However, the data were not split into fields. (I tried both '\t' and ' ' as the delimiter.)

  1. What condition must be met for a separator to be recognized as a tab?
  2. How can I solve this without modifying the data file?

DATA FILE

 0       303567       3584       Write       0.000000
 1       55590       3072       Write       0.000000
 0       303574       3584       Write       0.026214
 1       240840       3072       Write       0.026214
 1       55596       3072       Read       0.078643
 0       303581       3584       Write       0.117964
 1       55596       3072       Write       0.117964
 0       303588       3584       Write       0.530841
 1       55596       3072       Write       0.530841
 0       303595       3584       Write       0.550502
 1       240840       3072       Write       0.550502
 1       55602       3072       Read       0.602931
 0       303602       3584       Write       0.648806
 1       55602       3072       Write       0.648806
 0       303609       3584       Write       0.910950
 1       55602       3072       Write       0.910950
 0       303616       3584       Write       0.930611
 1       240840       3072       Write       0.930611
 1       55608       3072       Read       0.983040
 0       303623       3584       Write       1.028915
 1       55608       3072       Write       1.028915
 0       303630       3584       Write       1.330380
 1       55608       3072       Write       1.330380

CODE

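import csv

pieces, x, count = [], [], 0  # initial values assumed; not shown in the original post
with open(datafile, 'rt') as f:
    # delimiter=' ' treats every single space as a field boundary
    data = csv.reader(f, delimiter=' ')
    for d in data:
        pieces.append(d)
        x.append(count)
        count += 1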

RESULTS of print(pieces)

['10', '', '', '', '', '', '', '700132', '', '', '', '', '', '', '512', '', '', '', '', '', '', 'Write', '', '', '', '', '', '', '4186.852539'], ['1', '', '', '', '', '', '', '272774', '', '', '', '', '', '', '1024', '', '', '', '', '', '', 'Write', '', '', '', '', '', '', '4186.852539'], ['7', '', '', '', '', '', '', '273776', '', '', '', '', '', '', '1024', '', '', '', '', '', '', 'Write', '', '', '', '', '', '', '4186.852539']

Upvotes: 1

Views: 2035

Answers (2)

DYZ

Reputation: 57033

You can also read it with pandas (which may be useful for further processing):

import pandas as pd
data = pd.read_table('foo.tab', header=None, sep=r'\s+')
#     0       1     2      3         4
#0   0  303567  3584  Write  0.000000
#1   1   55590  3072  Write  0.000000
#2   0  303574  3584  Write  0.026214

Upvotes: 2

Stephen Rauch

Reputation: 49794

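This data format is easy to handle with str.split(). Called with no arguments, it splits on any run of whitespace, so repeated spaces do not produce empty fields:

with open(datafile) as f:
    for line in f:
        line_data = line.split()
        print(line_data)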
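To see why the '\t' and ' ' delimiters both failed, here is a small sketch that compares the three ways of splitting on one line copied from the question's data file. The spacing is taken as it appears in the post (runs of spaces, no tab characters). The `parse_row` helper and its field names are hypothetical, since the question does not say what the columns mean.

```python
# One line copied from the question's data file. The fields are separated
# by runs of spaces; the line contains no tab character.
line = " 0       303567       3584       Write       0.000000\n"

# Splitting on '\t' finds no tab, so the whole line comes back as one piece.
by_tab = line.split('\t')

# Splitting on a single space gives an empty string for every extra space,
# the same kind of output the question shows for delimiter=' '.
by_space = line.split(' ')

# split() with no argument splits on any run of whitespace and ignores
# leading and trailing whitespace, including the newline.
fields = line.split()
print(fields)


def parse_row(text):
    """Hypothetical helper: turn one line into typed values.

    The names below are placeholders; the question does not name the columns.
    """
    first, second, third, operation, timestamp = text.split()
    return int(first), int(second), int(third), operation, float(timestamp)


print(parse_row(line))
```

A separator is only treated as a tab when the file really contains the '\t' character, which this data does not appear to, so a whitespace-based split is the safer choice here.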

Upvotes: 2
