PeterL

Reputation: 515

Losing values when constructing a Series from a DataFrame column

I have a DataFrame td consisting of the following columns:

In [111]: td.head(5)
Out[111]:
         Date      Time    Price
0  2015-09-21  00:01:26  4303.00
1  2015-09-21  00:01:33  4303.00
2  2015-09-21  00:02:21  4303.50
3  2015-09-21  00:02:21  4303.50
4  2015-09-21  00:02:31  4303.25

My goal is to have a Series with Datetime and Price.

I tried:

s = pd.Series(td['Price'], index=pd.to_datetime(td['Date'] + ' ' + td['Time']))

But I get this result:

>>> s
2015-09-21 00:01:26   NaN
2015-09-21 00:01:33   NaN
2015-09-21 00:02:21   NaN
2015-09-21 00:02:21   NaN
                       ..
2015-09-25 16:59:58   NaN
2015-09-25 16:59:58   NaN
2015-09-25 16:59:58   NaN
2015-09-25 16:59:59   NaN
Name: Price, dtype: float64

All the values from "Price" are NaN. Any hint as to what I am doing wrong?

Upvotes: 4

Views: 388

Answers (1)

Alex Riley

Reputation: 176850

When creating a Series from a DataFrame column and passing in an index, the column will be reindexed according to the new index.

In your case, none of the labels in the newly created Datetime index were originally used to index the column td['Price'], so a Series of missing (NaN) values is returned.
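To make the reindexing visible, here is a minimal sketch. It assumes a hypothetical three-row stand-in for td, built from the first rows shown in the question; the names new_index and all_missing are only illustrative.

```python
import pandas as pd

# Hypothetical three-row stand-in for td (values taken from the question's head output).
td = pd.DataFrame({
    "Date": ["2015-09-21", "2015-09-21", "2015-09-21"],
    "Time": ["00:01:26", "00:01:33", "00:02:21"],
    "Price": [4303.00, 4303.00, 4303.50],
})

new_index = pd.to_datetime(td["Date"] + " " + td["Time"])

# td["Price"] carries the integer labels 0, 1, 2. None of them appear in
# new_index, so aligning the column to new_index finds no matching values.
s = pd.Series(td["Price"], index=new_index)

all_missing = bool(s.isna().all())
print(s)
```

Every entry of s should come out as NaN here, because the values are matched by label rather than by position.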

The easiest solution is to pass in td['Price'].values instead:

>>> pd.Series(td['Price'].values, index=pd.to_datetime(td['Date']+' '+td['Time']))
2015-09-21 00:01:26    4303.00
2015-09-21 00:01:33    4303.00
2015-09-21 00:02:21    4303.50
2015-09-21 00:02:21    4303.50
2015-09-21 00:02:31    4303.25
...

td['Price'].values returns the column's values as a NumPy array. An array has no index, so pandas has nothing to align and simply pairs the values with the new index by position.
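As a quick check, the sketch below builds the Series from the raw values and compares it with one alternative, DataFrame.set_index followed by selecting the column. The three-row td is again a hypothetical stand-in based on the question's sample, and the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical three-row stand-in for td (values taken from the question's head output).
td = pd.DataFrame({
    "Date": ["2015-09-21", "2015-09-21", "2015-09-21"],
    "Time": ["00:01:26", "00:01:33", "00:02:21"],
    "Price": [4303.00, 4303.00, 4303.50],
})

new_index = pd.to_datetime(td["Date"] + " " + td["Time"])

# Option 1: pass a plain NumPy array, so there are no old labels to align on.
s_values = pd.Series(td["Price"].values, index=new_index, name="Price")

# Option 2: replace the frame's index first, then select the column.
s_set_index = td.set_index(new_index)["Price"]

print(s_values)
```

Both options should keep the prices and attach them to the datetime index by position; which one reads better is a matter of taste.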

Upvotes: 2
