Reputation: 111
I am making a script to print some labels and need to import a excell file with ID numbers and DOBs. The DOBs are already formatted correctly in the spreadsheet, but when I import it to a dataframe it converts it to date time and adds a time stamp.
Spreadsheet input
IDS DOB
1 10/21/18
then the following code
df = pd.read_excel(file,sheetname = "Sheet1",dtype=str )
Gives this DataFrame output
IDS DOB
1 2018-10-21 00:00:00
Even though I told it to import as a string.
Some rows don't have DOBs or have notes so when I try to use an apply function like
df['DOB'] = df['DOB'].apply(lambda x: dt.datetime.strptime(x,'%D%M%Y'))
I get
ValueError: unconverted data remains: -10-10 00:00:00
when I try
df['DOB'] = pd.to_datetime(df['DOB'],format="%m/%d/%Y")
It still outputs as
2018-10-21 00:00:00
Upvotes: 1
Views: 1060
Reputation: 142641
If you have all dates as strings then you could convert to datetime and format it to string again
df['DOB'] = pd.to_datetime(df['DOB']).dt.strftime('%m/%d/%y')
Minimal working example
import pandas as pd
df = pd.DataFrame({
'IDS': [1],
'DOB': ['2018-10-21 00:00:00'],
})
print(df)
df['DOB'] = pd.to_datetime(df['DOB']).dt.strftime('%m/%d/%y')
print(df)
If there are other strings then you may need function which use try/except
to catch error when you try to convert these values and return original string.
import pandas as pd
df = pd.DataFrame({
'IDS': [1, 2],
'DOB': ['2018-10-21 00:00:00', 'Hello World'],
})
print('\n--- before ---\n')
print(df)
def convert(value):
try:
return pd.to_datetime(value).strftime('%m/%d/%y')
except Exception as ex:
print(ex)
return value
df['DOB'] = df['DOB'].apply(convert)
print('\n--- after ---\n')
print(df)
Result:
--- before ---
IDS DOB
0 1 2018-10-21 00:00:00
1 2 Hello World
('Unknown string format:', 'Hello World')
--- after ---
IDS DOB
0 1 10/21/18
1 2 Hello World
BTW: I don't have Excel file for test but maybe if you use this convert()
with parse_dates
in read_excel
then it convert it when you read file.
Upvotes: 1