CODEWITHSUNDEEP

python pandas csv dataframe

Reputation: 13

How to read .log file in python?

I have this .log file, but I don't know how to read it into a DataFrame:

 id  |        create_date         
-----+----------------------------
 318 | 2017-05-05 07:03:27.556697
 456 | 2017-07-03 01:50:07.966652
 249 | 2017-05-03 13:57:32.567373

Upvotes: 0

Views: 5099

Answers (2)

Reputation: 5294

pd.read_table("data.csv", sep="|", skiprows=[1], header=0, parse_dates=[1]).rename(columns=lambda x: x.strip())

    id                create_date
0  318 2017-05-05 07:03:27.556697
1  456 2017-07-03 01:50:07.966652
2  249 2017-05-03 13:57:32.567373

Parameters

sep="|"
Use | as the column separator.

skiprows=[1]
Ignore the second row, which is only a decorative separator and would otherwise be the hardest part to parse.

header=0
Read column names from the first row.

parse_dates=[1]
Convert the create_date column into pandas datetime64 format (may be optional).

rename(columns=lambda x: x.strip())
Remove extra whitespace from the column names.

You may want to add index_col=0 if you want to make id column your index instead of using a sequential one.
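For reference, here is a minimal, self-contained sketch of the same idea that can be run without a file on disk. It assumes the sample from the question is held in an in-memory string (the `raw` variable is hypothetical), uses `pd.read_csv` with `sep="|"` in place of `pd.read_table`, and converts the dates explicitly with `pd.to_datetime` instead of relying on `parse_dates`, so treat it as one possible variant rather than the answer's exact code.

```python
import io

import pandas as pd

# Sample copied from the question; swap the StringIO buffer for a real path.
raw = (
    " id  |        create_date         \n"
    "-----+----------------------------\n"
    " 318 | 2017-05-05 07:03:27.556697\n"
    " 456 | 2017-07-03 01:50:07.966652\n"
    " 249 | 2017-05-03 13:57:32.567373\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    sep="|",        # columns are separated by a pipe
    skiprows=[1],   # skip the "-----+-----" decoration row
    header=0,       # first row holds the column names
)

# Header cells are padded with spaces, so strip them.
df.columns = df.columns.str.strip()

# Strip the padding around each timestamp, then convert to datetime64.
df["create_date"] = pd.to_datetime(df["create_date"].str.strip())

print(df)
print(df.dtypes)
```

Calling `df.set_index("id")` afterwards should give the same effect as passing `index_col=0`.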

Upvotes: 2

Mohamed Thasin ah

Reputation: 11192

Try this:

df=pd.read_csv('file_.csv',sep='|')

Then you can remove the -----+---------------------------- separator row in several ways:

df.columns = df.columns.str.strip()  # header cells are padded with spaces, so strip them before indexing by name
df[df['id'] != '-----+----------------------------']
df[~df['id'].str.startswith('-')]
df.drop(0)  # removes only the first data row, so it misses a separator that appears elsewhere, for example in a footer
df[df['create_date'].notnull()]  # also drops genuine rows whose create_date is missing

Output:

    id           create_date         
1   318    2017-05-05 07:03:27.556697
2   456    2017-07-03 01:50:07.966652
3   249    2017-05-03 13:57:32.567373
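As a rough end-to-end sketch of the filtering approach, the snippet below reads the sample from the question out of an in-memory string (the `raw`, `mask`, and `clean` names are hypothetical), drops the separator row with `str.startswith`, and then converts both columns to proper types. It assumes the separator line contains no `|`, so the whole line lands in the first column.

```python
import io

import pandas as pd

# Sample copied from the question; swap the StringIO buffer for a real path.
raw = (
    " id  |        create_date         \n"
    "-----+----------------------------\n"
    " 318 | 2017-05-05 07:03:27.556697\n"
    " 456 | 2017-07-03 01:50:07.966652\n"
    " 249 | 2017-05-03 13:57:32.567373\n"
)

df = pd.read_csv(io.StringIO(raw), sep="|")
df.columns = df.columns.str.strip()  # remove padding from the header names

# The separator row starts with "-" once surrounding spaces are stripped.
mask = df["id"].str.strip().str.startswith("-")
clean = df[~mask].copy()

# Every value was read as text because of the separator row, so convert now.
clean["id"] = pd.to_numeric(clean["id"].str.strip())
clean["create_date"] = pd.to_datetime(clean["create_date"].str.strip())

print(clean)
```

The index keeps its original labels (starting at 1) because the filtered row held label 0; `clean.reset_index(drop=True)` renumbers it if that matters.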

Upvotes: 0
