Reputation: 227
I have a pandas dataframe with date information stored as a string. I want to extract the month from each date directly, so I tried this:
import pandas as pd
df = pd.DataFrame([['2015-04-16', 5], ['2014-05-01', 6]],columns = ['date','units'])
df['month'] = df['date'].str[5,7]
print(df)
This gives the following output:
         date  units  month
0  2015-04-16      5    NaN
1  2014-05-01      6    NaN
The dtype of the NaN values is float, and I have no idea why. Why doesn't this just create another column with the substrings?
Upvotes: 2
Views: 253
Reputation: 365657
If you're trying to slice each string to get the substring from 5 to 7, you need a colon (:), not a comma (,):
>>> df = pd.DataFrame([['2015-04-16', 5], ['2014-05-01', 6]],columns = ['date','units'])
>>> df['month'] = df['date'].str[5:7]
>>> print(df)
         date  units month
0  2015-04-16      5    04
1  2014-05-01      6    05
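The bracket form above can also be written with the Series.str.slice method, which takes the start and stop positions as ordinary arguments, so the comma/colon mix-up cannot occur. A minimal sketch, assuming the same two-row frame as above; the column name month_via_slice is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame([['2015-04-16', 5], ['2014-05-01', 6]], columns=['date', 'units'])

# Bracket slicing on the .str accessor, as in the answer above.
df['month'] = df['date'].str[5:7]

# Equivalent method call; 'month_via_slice' is an illustrative column name.
df['month_via_slice'] = df['date'].str.slice(5, 7)

print(df)
```

Both columns should hold the same two-character strings; which spelling to use is a matter of taste.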
Upvotes: 3
Reputation: 393963
I think your problem is that your slicing is invalid:
In [7]:
df = pd.DataFrame([['2015-04-16', 5], ['2014-05-01', 6]],columns = ['date','units'])
df['date'].str[5,7]
Out[7]:
0 NaN
1 NaN
Name: date, dtype: float64
Compare with this:
t='2015-04-16'
t[5,7]
This raises:
TypeError: string indices must be integers
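To see why the comma form is not a slice, it can help to look at what Python actually hands to the indexing operation. The sketch below uses a throwaway class, KeyProbe (a hypothetical helper, not part of pandas or the standard library), whose __getitem__ simply returns whatever key it receives:

```python
class KeyProbe:
    """Hypothetical helper: returns the key passed to the [] operator."""

    def __getitem__(self, key):
        return key


probe = KeyProbe()
comma_key = probe[5, 7]  # the comma builds a tuple
colon_key = probe[5:7]   # the colon builds a slice object

print(type(comma_key).__name__)  # tuple
print(type(colon_key).__name__)  # slice

# A plain string accepts the slice and returns the substring.
t = '2015-04-16'
print(t[5:7])  # 04
```

A str accepts an integer or a slice as its index, which is why the tuple produced by the comma is rejected.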
I think you wanted:
In [18]:
df = pd.DataFrame([['2015-04-16', 5], ['2014-05-01', 6]],columns = ['date','units'])
df['month'] = df['date'].str[5:7]
df
Out[18]:
         date  units month
0  2015-04-16      5    04
1  2014-05-01      6    05
So, because indexing each string with 5,7 is an invalid operation, pandas returns NaN for every row instead of raising an error.
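If the month is needed as a number rather than as a two-character string, a different technique from the slicing shown above is to parse the column as dates first. This is only a sketch of that alternative, assuming every value in the column is a valid ISO-style date; the column name month_number is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame([['2015-04-16', 5], ['2014-05-01', 6]], columns=['date', 'units'])

# Parse the strings into datetimes, then read the month through the .dt accessor.
parsed = pd.to_datetime(df['date'])
df['month_number'] = parsed.dt.month  # integers, e.g. 4 rather than '04'

print(df)
```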
Upvotes: 1