Midrar A Adham

Reputation: 11

Get days from a long timestamp column in a CSV file with Python

I have a CSV file with a timestamp column that spans several years:

1990-05-12 14:01
.
.
1999-01-10 10:00

where the time is in hh:mm format. I'm trying to extract each day's worth of data into a new CSV file. Here's my code:

import datetime
import pandas as pd 

df = pd.read_csv("/home/parallels/Desktop/ewh_log/hpwh_log.csv",parse_dates=True)

# change timestamp column format

def extract_months_data(df):

    df = pd.to_datetime(df['timestamp'])
    print(df)


def write_o_csv(df):
    print('writing ..')

    #todo



x1 = pd.to_datetime(df['timestamp'],format='%m-%d %H:%M').notnull().all()


if (x1)==True:
    extract_months_data(df)
else:
    x2 = pd.to_datetime(df['timestamp'])

    x2 = x1.dt.strftime('%m-%d %H:%M')

    write_to_csv(df)

The issue is that when I get to the following line

def extract_months_data(df):

    df = pd.to_datetime(df['timestamp'])

I get the following error:

pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime

Is there an alternative way to do this with pandas without discarding the rest of the data? I saw posts that suggested using coerce, but that replaces the rest of the data with NaT.

Thanks

UPDATE:

This post answers half of the question, namely how to filter hours (or minutes) out of a timestamp column. The second part is how to extract a full day into another CSV file. I'll post updates here once I find a solution.
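
For the second part, one possible approach is to group the rows by calendar date and write each group to its own file. The following is only a minimal sketch, not a tested solution for the real log: the helper name split_by_day, the sample data, the "value" column, and the temporary output directory are all assumptions made for illustration, and it assumes every timestamp matches the YYYY-MM-DD hh:mm layout shown above.

```python
import tempfile
from pathlib import Path

import pandas as pd


def split_by_day(df, out_dir, column="timestamp"):
    """Write one CSV per calendar day found in `column`; return the paths written.

    Hypothetical helper for illustration only.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Parse once, with an explicit format so pandas does not have to guess.
    parsed = pd.to_datetime(df[column], format="%Y-%m-%d %H:%M")

    written = []
    # Group rows by calendar date; the time of day is dropped from the key.
    for day, day_df in df.groupby(parsed.dt.date):
        path = out_dir / f"{day}.csv"  # file is named after the date
        day_df.to_csv(path, index=False)
        written.append(path)
    return written


# Small in-memory sample standing in for the real log file (assumed data).
sample = pd.DataFrame({
    "timestamp": ["1990-05-12 14:01", "1990-05-12 14:02", "1999-01-10 10:00"],
    "value": [1.0, 2.0, 3.0],
})
out_dir = tempfile.mkdtemp()
paths = split_by_day(sample, out_dir)
```

Each output file keeps the original columns unchanged, because the parsed dates are used only as the grouping key and are never written back into the frame.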

Upvotes: 0

Views: 168

Answers (1)

tchar

Reputation: 908

You are converting to datetime twice, which is not needed.

Something like this should work:

import pandas as pd

df = pd.read_csv('data.csv')

df['month_data'] = pd.to_datetime(df['timestamp'], format='%Y-%m-%d %H:%M')
df['month_data'] = df['month_data'].dt.strftime('%m-%d %H:%M')

# If you don't want rows where month_data is NaN
df = df[df['month_data'].notna()]

print(df)
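
Regarding the concern that coerce replaces data with NaT: one way to keep the original values is to parse into a separate column, so the raw strings stay untouched and the rows that failed to parse can be inspected. This is a minimal sketch under assumptions: the sample data (including the malformed "not a date" row), the "value" column, and the "parsed" column name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: two well-formed rows and one malformed row.
df = pd.DataFrame({
    "timestamp": ["1990-05-12 14:01", "not a date", "1999-01-10 10:00"],
    "value": [1, 2, 3],
})

# Parse into a NEW column so the original strings are preserved.
# errors="coerce" turns values that do not match the format into NaT
# instead of raising an exception.
df["parsed"] = pd.to_datetime(
    df["timestamp"], format="%Y-%m-%d %H:%M", errors="coerce"
)

# Rows whose timestamp did not match the expected format.
bad_rows = df[df["parsed"].isna()]
# Rows that parsed cleanly and can be processed further.
good_rows = df[df["parsed"].notna()]

print(bad_rows[["timestamp", "value"]])
```

Looking at bad_rows first may show which values trigger the error before deciding whether to drop or repair them.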

Upvotes: 1
