Reputation: 19375
Consider this simple example
df = pd.DataFrame({'dt_one': ['2015-01-01', '2016-02-02'],
'dt_two': ['2015-01-01', '2016-02-02'],
'other_col': [1, 2]})
df
Out[30]:
dt_one dt_two other_col
0 2015-01-01 2015-01-01 1
1 2016-02-02 2016-02-02 2
I want to apply pd.to_datetime
to all the columns that contain dt_
I can easily do so with filter
df.filter(regex = 'dt_').apply(lambda x: pd.to_datetime(x))
Out[33]:
dt_one dt_two
0 2015-01-01 2015-01-01
1 2016-02-02 2016-02-02
However, how can these values be assigned in the original dataframe? Doing:
df.filter(regex = 'dt_') = df.filter(regex = 'dt_').apply(lambda x: pd.to_datetime(x))
File "<ipython-input-34-412d88939494>", line 1
df.filter(regex = 'dt_') = df.filter(regex = 'dt_').apply(lambda x: pd.to_datetime(x))
SyntaxError: can't assign to function call
does not work
Thanks!
Upvotes: 2
Views: 146
Reputation: 323226
You can assign the value like this .
df['dt_one'],df['dt_two']=df.filter(regex = 'dt_').apply(lambda x: pd.to_datetime(x)).values
df.dtypes
Out[215]:
dt_one datetime64[ns]
dt_two datetime64[ns]
other_col int64
dtype: object
Upvotes: 0
Reputation: 30605
The method din't work because df.filter(regex='dt_')
is a modified copy. For assigning data for multiple columns you need to select the columns from the actual dataframe with index based selection or assign it inplace using assign
.
So after filtering get the columns and do a boolean indexing before assigning it i.e
df[df.filter(regex = 'dt_').columns] = df.filter(regex = 'dt_').apply(lambda x: pd.to_datetime(x))
dt_one dt_two other_col 0 2015-01-01 2015-01-01 1 1 2016-02-02 2016-02-02 2
Upvotes: 4
Reputation: 153460
You can use "unpacking" with assign
:
df_out = df.assign(**df.filter(regex = 'dt_').apply(lambda x: pd.to_datetime(x)))
dt_one dt_two other_col
0 2015-01-01 2015-01-01 1
1 2016-02-02 2016-02-02 2
And info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 3 columns):
dt_one 2 non-null datetime64[ns]
dt_two 2 non-null datetime64[ns]
other_col 2 non-null int64
dtypes: datetime64[ns](2), int64(1)
memory usage: 128.0 bytes
Upvotes: 2
Reputation: 862651
You need assign to filtered columns:
cols = df.filter(regex = 'dt_').columns
df[cols] = df[cols].apply(lambda x: pd.to_datetime(x))
print (df)
dt_one dt_two other_col
0 2015-01-01 2015-01-01 1
1 2016-02-02 2016-02-02 2
Or assign to columns selected by mask
:
m = df.columns.str.contains('dt_')
df.loc[:,m] = df.loc[:,m].apply(lambda x: pd.to_datetime(x))
print (df)
dt_one dt_two other_col
0 2015-01-01 2015-01-01 1
1 2016-02-02 2016-02-02 2
Upvotes: 3