Mandy

Reputation: 31

Parsing dates from two different rows in Pandas

I have dates and timestamps in two different rows of a Pandas DataFrame. Any pointers on how to combine the date and time into one row that can then be used for time-series analysis?

e.g.

Row 1 Date            2017-12-11 00:00:00   2017-12-11 00:00:00     2017-12-11 00:00:00     2017-12-11 00:00:00 

Row 2 Timestamp             01:00:00              02:00:00                03:00:00              04:00:00 

and then some more rows having more data

Can Row 1 and Row 2 be combined to give complete date/timestamp information?

I was thinking of transposing and then using parse_dates on the columns. Is there a more direct way of doing this in Python?

Upvotes: 2

Views: 523

Answers (2)

jpp

Reputation: 164843

If you are working with time series data, I strongly suggest you make the datetime component your index. You may find operations more efficient when your dataframe has an index containing non-duplicated values.

Your idea of transposing the dataframe first is good. Here's a minimal example:

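import pandas as pd

# Rows hold the fields, as in the question
df = pd.DataFrame([['2017-12-11 00:00:00', '2017-12-11 00:00:00',
                    '2017-12-11 00:00:00', '2017-12-11 00:00:00'],
                   ['01:00:00', '02:00:00', '03:00:00', '04:00:00'],
                   [1, 2, 3, 4], [5, 6, 7, 8]],
                  index=['Date', 'Timestamp', 'Data1', 'Data2'])

# Transpose so that each field becomes a column
df = df.T
# Remove the Date and Timestamp columns and use their sum as the index
df.index = pd.to_datetime(df.pop('Date')) + pd.to_timedelta(df.pop('Timestamp'))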

Resulting dataframe:

print(df)

                    Data1 Data2
2017-12-11 01:00:00     1     5
2017-12-11 02:00:00     2     6
2017-12-11 03:00:00     3     7
2017-12-11 04:00:00     4     8

You now have a DatetimeIndex:

print(df.index)

DatetimeIndex(['2017-12-11 01:00:00', '2017-12-11 02:00:00',
               '2017-12-11 03:00:00', '2017-12-11 04:00:00'],
              dtype='datetime64[ns]', freq=None)
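
One follow-up worth checking: because the original rows mixed strings and numbers, the data columns are typically left as object dtype after the transpose. Below is a minimal sketch, assuming the same example frame as above, that converts them with pd.to_numeric and then selects rows by time of day with between_time. The 02:00 to 03:00 window is only an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame([['2017-12-11 00:00:00', '2017-12-11 00:00:00',
                    '2017-12-11 00:00:00', '2017-12-11 00:00:00'],
                   ['01:00:00', '02:00:00', '03:00:00', '04:00:00'],
                   [1, 2, 3, 4], [5, 6, 7, 8]],
                  index=['Date', 'Timestamp', 'Data1', 'Data2'])

df = df.T
df.index = pd.to_datetime(df.pop('Date')) + pd.to_timedelta(df.pop('Timestamp'))

# The remaining columns came from mixed-type columns, so convert them
# to numeric dtypes before doing arithmetic or aggregation
df = df.apply(pd.to_numeric)

# Time-of-day selection relies on the DatetimeIndex (both ends inclusive)
window = df.between_time('02:00', '03:00')
```

Here window should contain only the 02:00:00 and 03:00:00 rows.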

Upvotes: 0

jezrael

Reputation: 863751

I think it is best to transpose the DataFrame first, so that the rows become columns and each column holds values of one type:

df = df.T

Then convert the Date column with to_datetime and add the Timestamp column converted with to_timedelta:

df['dates'] = pd.to_datetime(df['Date']) + pd.to_timedelta(df['Timestamp'])
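
As a rough end-to-end sketch of this approach, assuming a small frame laid out like the one in the question: the row labels Date and Timestamp come from the question, while the Value row and its numbers are made up purely for illustration. It keeps the combined datetime as a regular column and drops the two source columns.

```python
import pandas as pd

# Hypothetical frame laid out like the question; 'Value' is illustrative
df = pd.DataFrame(
    [['2017-12-11 00:00:00', '2017-12-11 00:00:00'],
     ['01:00:00', '02:00:00'],
     [10, 20]],
    index=['Date', 'Timestamp', 'Value'])

# Rows become columns
df = df.T

# Date parsed as a datetime plus the time of day parsed as a timedelta
df['dates'] = pd.to_datetime(df['Date']) + pd.to_timedelta(df['Timestamp'])

# The source columns are no longer needed once 'dates' exists
df = df.drop(columns=['Date', 'Timestamp'])
```

If the datetime is wanted as the index instead, df.set_index('dates') can be applied afterwards.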

Upvotes: 1
