Kamran
Kamran

Reputation: 1

How to correctly open txt timeseries files in pandas that has a comma and unseparated date in timestamp?

I have a dataset with txt files that contain timestamps with a comma. The data looks sometime like this

TimeStamp, open, high, low, close, volume
20220401,00:00:00,1.31457,1.31468,1.3141,1.31428,141
20220401,00:01:00,1.31429,1.3144,1.3139,1.31405,157
20220401,00:02:00,1.31409,1.3142,1.31369,1.31405,120

What would be the most efficient way to parse dates in pandas?

I want to merge the date and time columns and convert it into datetimeindex

Upvotes: 0

Views: 31

Answers (1)

Chrysophylaxs
Chrysophylaxs

Reputation: 6583

In your case, I would just assume that instead of a single column that contains a comma, you have two columns: one with the date, one with the time. Currently the first column is being read as the index. You can create a DatetimeIndex by using that index and the TimeStamp column (the time values):

import pandas as pd

df = pd.read_clipboard(sep=",") # Insert pd.read_csv here ;)

idx = pd.DatetimeIndex(
    pd.to_datetime(
        df.index.astype(str) + " " + df["TimeStamp"]
    )
)

out = df.set_index(idx).drop(columns="TimeStamp")
                        open     high      low    close   volume
2022-04-01 00:00:00  1.31457  1.31468  1.31410  1.31428      141
2022-04-01 00:01:00  1.31429  1.31440  1.31390  1.31405      157
2022-04-01 00:02:00  1.31409  1.31420  1.31369  1.31405      120

Upvotes: 1

Related Questions