M. Schmid

Reputation: 79

Changing an index to another datetime format in Python

I'm new to Python, so I hope my question isn't too silly. I want to join two pandas DataFrames (f1 and f3), and it seems that their indices are of different types.

f1:

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
           '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08',
           '2018-01-09', '2018-01-10',
           ...
           '2018-12-22', '2018-12-23', '2018-12-24', '2018-12-25',
           '2018-12-26', '2018-12-27', '2018-12-28', '2018-12-29',
           '2018-12-30', '2018-12-31'],
          dtype='datetime64[ns]', name='date', length=365, freq=None)

f3:

Index([2018-01-01, 2018-01-02, 2018-01-07, 2018-03-30, 2018-04-01, 2018-04-02,
   2018-05-01, 2018-05-10, 2018-05-20, 2018-05-21, 2018-06-04, 2018-08-01,
   2018-12-25, 2018-12-26],
  dtype='object')

If I join them in the order cat = [f1, f3] with
cat_total = pd.concat(cat, axis=1, sort=False), it seems to work, and the result looks correct:

    print(cat_total.head())
            weekday       holidays
2018-01-01        0   Neujahrestag
2018-01-02        1  Berchtoldstag
2018-01-03        2            NaN
2018-01-04        3            NaN
2018-01-05        4            NaN

If I change the order to cat = [f3, f1], it doesn't work properly. The dates from f1 are appended as separate rows instead of being aligned with the rows of f3:

print(cat_total)
                             holidays  weekday
2018-01-01               Neujahrestag        0
2018-01-02              Berchtoldstag        1
2018-01-07                  Test ZH 1        6
2018-03-30                 Karfreitag        4
2018-04-01                     Ostern        6
2018-04-02                Ostermontag        0
2018-05-01             Tag der Arbeit        1
2018-05-10                   Auffahrt        3
2018-05-20                  Pfingsten        6
2018-05-21              Pfingstmontag        0
2018-06-04                  Test ZH 2        0
2018-08-01           Nationalfeiertag        2
2018-12-25                Weihnachten        1
2018-12-26                Stephanstag        2
2018-01-01 00:00:00               NaN        0
2018-01-02 00:00:00               NaN        1
2018-01-03 00:00:00               NaN        2
2018-01-04 00:00:00               NaN        3
2018-01-05 00:00:00               NaN        4
2018-01-06 00:00:00               NaN        5
2018-01-07 00:00:00               NaN        6

Why does this happen? How can I convert one of the DataFrame indices so that both have the same format?

The f1 index comes from dates = pd.date_range(start=startdate, end=enddate, freq='D'), and the f3 index is the result of the external package 'holidays'.
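
For reference, here is a minimal sketch that reproduces the two index types. It assumes that the 'holidays' package yields plain datetime.date keys, which would explain the dtype='object' index shown above. The three holiday entries and the way the frames are built are illustrative stand-ins, not the original script.

```python
import datetime as dt

import pandas as pd

# f1: daily DatetimeIndex, built with pd.date_range as described above
dates = pd.date_range(start="2018-01-01", end="2018-12-31", freq="D")
f1 = pd.DataFrame({"weekday": dates.dayofweek.to_numpy()}, index=dates)

# f3: stand-in for the 'holidays' result (assumption: keys are datetime.date)
holiday_names = {
    dt.date(2018, 1, 1): "Neujahrestag",
    dt.date(2018, 1, 2): "Berchtoldstag",
    dt.date(2018, 12, 25): "Weihnachten",
}
f3 = pd.DataFrame(
    {"holidays": list(holiday_names.values())},
    index=pd.Index(list(holiday_names.keys()), dtype=object),
)

print(type(f1.index).__name__, f1.index.dtype)  # a DatetimeIndex
print(type(f3.index).__name__, f3.index.dtype)  # a generic Index of Python objects
```

With this setup, f1 is indexed by pandas timestamps and f3 by Python date objects, so the labels are not the same kind of value even though they print alike.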

I hope these are all infos needed. Thanks a lot in advance

Marco

Upvotes: 0

Views: 297

Answers (1)

johnashu

Reputation: 2211

You can use pd.to_datetime to convert the column to a datetime format, like so:

I assume the column is named DATE.

cat_total['DATE'] = pd.to_datetime(cat_total['DATE'],format='%Y-%m-%d', errors='ignore')

to_datetime
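
Since the dates in the question are stored in the index rather than in a column, the same conversion can also be applied to the index before concatenating. The following is only a sketch under assumptions: f1 and f3 are small stand-ins rebuilt here for illustration, and the f3 index is assumed to hold datetime.date objects.

```python
import datetime as dt

import pandas as pd

# Stand-in for f1: a few days with a DatetimeIndex
dates = pd.date_range("2018-01-01", "2018-01-05", freq="D")
f1 = pd.DataFrame({"weekday": dates.dayofweek.to_numpy()}, index=dates)

# Stand-in for f3: object index of datetime.date values (assumption)
f3 = pd.DataFrame(
    {"holidays": ["Neujahrestag", "Berchtoldstag"]},
    index=pd.Index([dt.date(2018, 1, 1), dt.date(2018, 1, 2)], dtype=object),
)

# Convert the object index to a DatetimeIndex so both frames use the same label type
f3.index = pd.to_datetime(f3.index)

# The order of the frames now only affects the column order
cat_total = pd.concat([f3, f1], axis=1, sort=False)
print(cat_total)
```

After the conversion, both frames share timestamp labels, so the rows should align on the date regardless of which frame comes first.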

Upvotes: 1
