Reputation: 1568
This column has bad date/time information in it:
import pandas as pd
df = pd.DataFrame({
'date': ['1/25/201612:00:00AM','2/25/201712:00:00AM','3/25/201812:00:00AM',
'4/25/201912:00:00AM','5/25/201912:00:00AM','6/25/201912:00:00AM']})
I tried this function below, however it produces a column of NaNs:
def insert_space(string, integer):
return string[0:integer] + ' ' + string[integer:]
insert_space(df['date'], 9)
Desired Output example (any date format works!):
date
0 1/25/2016 12:00:00AM
1 2/25/2017 12:00:00AM
2 3/25/2018 12:00:00AM
3 4/25/2019 12:00:00AM
4 5/25/2019 12:00:00AM
5 6/25/2019 12:00:00AM
Or
date
0 1/25/2016
1 2/25/2017
2 3/25/2018
3 4/25/2019
4 5/25/2019
5 6/25/2019
Upvotes: 2
Views: 7784
Reputation: 2349
Applying a function to every row of a specified column works like this:
df['date'].apply(lambda x: insert_space(x, 9), axis=1)
Note that, if working with datetime objects, you'll need to modify the function accordingly. A datetime object, such as datetime.time()
, is not subscriptable and will raise a TypeError if you try and run it through your insert_space
function. str(datetime.time())
will return a string, however.
Upvotes: 3
Reputation: 642
As it stands, the function you provide just returns a value, which is immediately trashed.
Here is a solution using basic for loop (can be trivially converted to list comprehension or functionalized).
import pandas as pd
# First format
df = pd.DataFrame({
'date': ['1/25/201612:00:00AM','2/25/201712:00:00AM','3/25/201812:00:00AM',
'4/25/201912:00:00AM','5/25/201912:00:00AM','6/25/201912:00:00AM']})
for i in range(len(df)):
df['date'][i] = df['date'][i][:-10] + " " + df['date'][i][-10:]
print(df)
# date
# 0 1/25/2016 12:00:00AM
# 1 2/25/2017 12:00:00AM
# 2 3/25/2018 12:00:00AM
# 3 4/25/2019 12:00:00AM
# 4 5/25/2019 12:00:00AM
# 5 6/25/2019 12:00:00AM
# Second format
df = pd.DataFrame({
'date': ['1/25/201612:00:00AM','2/25/201712:00:00AM','3/25/201812:00:00AM',
'4/25/201912:00:00AM','5/25/201912:00:00AM','6/25/201912:00:00AM']})
for i in range(len(df)):
df['date'][i] = df['date'][i][:-10]
print(df)
# date
# 0 1/25/2016
# 1 2/25/2017
# 2 3/25/2018
# 3 4/25/2019
# 4 5/25/2019
# 5 6/25/2019
UPDATE: Below are list comprehensions for the respective calls that should be far more efficient:
df['date'] = [v[:-10] + " " + v[-10:] for v in df['date']]
df['date'] = [v[:-10] for v in df['date']]
Upvotes: 0
Reputation: 447
how about something like this:
df['date'] = pd.to_datetime(df['date'], format="%m/%d/%Y%I:%M:%S%p")
you can find the explanation for the format
here: https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior
Upvotes: 0