mikemsq

Reputation: 71

Pandas Series constructor produces NaN values

The pandas Series constructor produces NaN values when passed a dictionary whose keys are tuples containing a datetime.date element. The code is below.

Strangely, it doesn't happen when the key is a single datetime, or a tuple without datetimes.

This behavior seems to have been introduced in pandas 0.15.0, since it works fine in 0.14.1, but I can't find anything about it in the release notes.

I'm running 64-bit Python 2.7 on Windows.

Any help is appreciated.

import datetime
import pandas as pd

d = {
    (datetime.date(2016, 5, 1), 'k1'): 1,
    (datetime.date(2016, 5, 2), 'k2'): 2
}

print 'Dictionary:'
print d
print

s = pd.Series(d)
print 'Series:'
print s
print

df = pd.DataFrame(d.values(), index=pd.MultiIndex.from_tuples(d.keys()))
print 'DataFrame:'
print df
print

Output:

Dictionary:
{(datetime.date(2016, 5, 1), 'k1'): 1, (datetime.date(2016, 5, 2), 'k2'): 2}

Series:
2016-05-01  k1   NaN
2016-05-02  k2   NaN
dtype: float64

DataFrame:
               0
2016-05-01 k1  1
2016-05-02 k2  2

Upvotes: 1

Views: 430

Answers (1)

piRSquared

Reputation: 294328

That is bizarre! Has to be a bug.

Here are some of my experiments:

What you did:

s = pd.Series({(datetime.date(2016, 5, 1), 'k1'): 1,
               (datetime.date(2016, 5, 2), 'k2'): 2})

s

2016-05-01  k1    NaN
2016-05-02  k2    NaN
dtype: float64

Experiment #1: use strftime. This returns a string, which is not what you want, but it works.

s = pd.Series({(datetime.date(2016, 5, 1).strftime('%Y-%m-%d'), 'k1'): 1,
               (datetime.date(2016, 5, 2).strftime('%Y-%m-%d'), 'k2'): 2})

s

2016-05-01  k1    1
2016-05-02  k2    2
dtype: int64

Experiment #2: use pd.to_datetime. This works, though note the keys end up as strings again because of the trailing strftime call.

s = pd.Series({(pd.to_datetime(datetime.date(2016, 5, 1)).strftime('%Y-%m-%d'), 'k1'): 1,
               (pd.to_datetime(datetime.date(2016, 5, 2)).strftime('%Y-%m-%d'), 'k2'): 2})

s

2016-05-01  k1    1
2016-05-02  k2    2
dtype: int64

Experiment #3: use pd.Timestamp. This also works.

s = pd.Series({(pd.Timestamp(datetime.date(2016, 5, 1)), 'k1'): 1,
               (pd.Timestamp(datetime.date(2016, 5, 2)), 'k2'): 2})

s

2016-05-01  k1    1
2016-05-02  k2    2
dtype: int64
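
Building on Experiment #3, here is a rough sketch of a workaround that converts the datetime.date elements of each tuple key to pd.Timestamp before calling the constructor. The helper name with_timestamp_keys is made up for illustration, it is not part of pandas. It is written for Python 3, and I have not checked it against pandas 0.15 specifically, so treat it as a starting point rather than a confirmed fix.

```python
import datetime

import pandas as pd


def with_timestamp_keys(mapping):
    """Return a copy of `mapping` whose tuple keys have every
    datetime.date element replaced by a pd.Timestamp.

    Hypothetical helper for illustration; not a pandas API.
    """
    def convert(element):
        # pd.Timestamp accepts datetime.date (and datetime.datetime) objects.
        if isinstance(element, datetime.date):
            return pd.Timestamp(element)
        return element

    return {
        tuple(convert(element) for element in key): value
        for key, value in mapping.items()
    }


d = {
    (datetime.date(2016, 5, 1), 'k1'): 1,
    (datetime.date(2016, 5, 2), 'k2'): 2,
}

# Same dictionary as in the question, but with Timestamp-based keys.
s = pd.Series(with_timestamp_keys(d))
print(s)
```

The first index level then holds timestamps rather than plain dates, so lookups need a pd.Timestamp (or something pandas can coerce to one). Constructing the index explicitly with pd.MultiIndex.from_tuples, as the DataFrame in the question does, may be another way around it.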

Upvotes: 1
