Reputation: 11460
How can I split a column into two separate ones? Would apply be the way to go about this? I want to keep the other columns in the DataFrame.
For example I have a column called "last_created" with a bunch of dates and times: "2016-07-01 09:50:09"
I want to create two new columns "date" and "time" with the split values.
This is what I tried but it's returning an error. For some reason my data was getting converted from str to float so I forced it to str.
def splitter(row):
    row = str(row)
    return row.split()
df['date'],df['time'] = df['last_created'].apply(splitter)
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-47-e5a9cf968714> in <module>()
7 return row.split()
8
----> 9 df['date'],df['time'] = df['last_created'].apply(splitter)
10 df
11 #splitter(df.iloc[1,1])
ValueError: too many values to unpack (expected 2)
Upvotes: 1
Views: 667
Reputation: 862661
You can first convert with to_datetime if the dtype is object, and then use dt.date and dt.time:
df = pd.DataFrame({'last_created':['2016-07-01 09:50:09', '2016-07-01 09:50:09']})
print (df)
          last_created
0  2016-07-01 09:50:09
1  2016-07-01 09:50:09
print (df.dtypes)
last_created object
dtype: object
df['last_created'] = pd.to_datetime(df.last_created)
print (df.dtypes)
last_created datetime64[ns]
dtype: object
df['date'], df['time'] = df.last_created.dt.date, df.last_created.dt.time
print (df)
         last_created        date      time
0 2016-07-01 09:50:09  2016-07-01  09:50:09
1 2016-07-01 09:50:09  2016-07-01  09:50:09
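As for why the original attempt raised the ValueError: apply returns a single Series holding one list per row, so unpacking it into two names iterates over the rows, not over the two pieces. The str-to-float conversion mentioned in the question may also point to missing values (NaN) in the column, though that is only a guess. Under that assumption, a minimal sketch with made-up sample data could pass errors='coerce' so that missing or unparseable entries become NaT instead of raising:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN stands in for a missing entry.
df = pd.DataFrame(
    {'last_created': ['2016-07-01 09:50:09', np.nan, '2016-07-02 10:00:00']}
)

# Missing or unparseable values are converted to NaT rather than raising.
parsed = pd.to_datetime(df['last_created'], errors='coerce')

# Each accessor returns a full column, so plain assignment is enough.
df['date'] = parsed.dt.date
df['time'] = parsed.dt.time
print(df)
```

The other columns of the DataFrame are untouched, since only the two new columns are assigned.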
Upvotes: 1
Reputation: 1727
The following should work for you. However, storing the date and time as a timestamp is much more convenient for manipulation.
df['date'] = [d.split()[0] for d in df['last_created']]
df['time'] = [d.split()[1] for d in df['last_created']]
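The same string-based idea can probably be written without the two list comprehensions by using the vectorized str.split accessor. This is only a sketch, assuming every value in the column is a string of the form "date time"; the sample data is made up for illustration:

```python
import pandas as pd

# Hypothetical sample data in the format shown in the question.
df = pd.DataFrame(
    {'last_created': ['2016-07-01 09:50:09', '2016-07-02 10:00:00']}
)

# n=1 splits once on whitespace; expand=True returns a DataFrame
# with one column per piece, which maps onto the two new columns.
df[['date', 'time']] = df['last_created'].str.split(n=1, expand=True)
print(df)
```

Both new columns hold plain strings here, so the point above about timestamps being easier to manipulate still applies.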
Upvotes: 1
Reputation: 459
In my case, I just call the function directly. The IPython session is below.
In [5]: df = dict(data="", time="", last_created="")
In [6]: df
Out[6]: {'data': '', 'last_created': '', 'time': ''}
In [7]: df["last_created"] = "2016-07-01 09:50:09"
In [8]: df
Out[8]: {'data': '', 'last_created': '2016-07-01 09:50:09', 'time': ''}
In [9]: def splitter(row):
   ...:     row = str(row)
   ...:     return row.split()
In [10]: df["data"], df["time"] = splitter(df["last_created"])
In [11]: df
Out[11]:
{'data': '2016-07-01',
'last_created': '2016-07-01 09:50:09',
'time': '09:50:09'}
Upvotes: 1