Reputation: 995
In Python 3 with pandas, I have a DataFrame with a column of strings representing dates, the "DataFim" column:
df_lotacoes.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 52725 entries, 0 to 52724
Data columns (total 5 columns):
DataFim 48854 non-null object
DataInicio 52725 non-null object
IdUA 52725 non-null object
NomeFuncionario 52725 non-null object
NomeUA 52725 non-null object
dtypes: object(5)
memory usage: 1.0+ MB
print(df_lotacoes['DataFim'])
DataFim
0 2018-11-05T00:00:00-02:00
1 2008-08-28T00:00:00-03:00
2 2002-08-08T00:00:00-03:00
3 2007-03-14T00:00:00-03:00
4 2005-05-06T00:00:00-03:00
I tried to convert it to a date with pd.to_datetime, but the column still has dtype object:
df_lotacoes['DataFim'] = pd.to_datetime(df_lotacoes['DataFim'])
DataFim
0 2018-11-05 00:00:00-02:00
1 2008-08-28 00:00:00-03:00
2 2002-08-08 00:00:00-03:00
3 2007-03-14 00:00:00-03:00
4 2005-05-06 00:00:00-03:00
DataFim 48854 non-null object
I only need the year, month and day. I want to ignore the time and UTC offset.
Does anyone know how I can convert this format?
Upvotes: 1
Views: 778
Reputation: 38415
Option 1: Extract the date part (everything before the "T") using str.extract, then convert it to datetime:
df['DataFim'] = pd.to_datetime(df['DataFim'].str.extract('(.*)T')[0], format = '%Y-%m-%d')
DataFim
0 2018-11-05
1 2008-08-28
2 2002-08-08
3 2007-03-14
4 2005-05-06
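To check the result on a small scale, here is a minimal, self-contained sketch of the str.extract approach. The sample frame is made up to mirror the question's data (including one missing value, since DataFim has nulls); it is not the asker's real df_lotacoes. The original column probably stayed as object because the strings mix two UTC offsets (-02:00 and -03:00); dropping the offset before parsing sidesteps that.

```python
import pandas as pd

# Hypothetical sample mirroring the question's data, with one missing value
df = pd.DataFrame({
    "DataFim": [
        "2018-11-05T00:00:00-02:00",
        "2008-08-28T00:00:00-03:00",
        None,
    ]
})

# Keep only the text before the "T", then parse it as a plain date
date_part = df["DataFim"].str.extract(r"(.*)T")[0]
df["DataFim"] = pd.to_datetime(date_part, format="%Y-%m-%d")

# The column is now a real datetime column; the missing value becomes NaT
print(df["DataFim"].dtype)
print(df["DataFim"])
```

Missing values pass through as NaT, so there is no need to filter them out first.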
Option 2: You can also use str.split and keep the first piece:
df['DataFim'] = pd.to_datetime(df['DataFim'].str.split('T').str[0], format = '%Y-%m-%d')
Option 3: Having some fun with regex, strip everything from the "T" onward:
df['DataFim'] = pd.to_datetime(df['DataFim'].str.replace('T.*', '', regex = True), format = '%Y-%m-%d')
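Since the question asks for the year, month and day specifically, here is a short sketch of how those could be pulled out once the column is a datetime. The sample Series and the names `dates` and `parts` are illustrative assumptions, not part of the original data.

```python
import pandas as pd

# Hypothetical sample strings in the same format as the question
s = pd.Series(["2018-11-05T00:00:00-02:00", "2002-08-08T00:00:00-03:00"])

# Drop the time and offset, then parse the remaining date text
dates = pd.to_datetime(s.str.split("T").str[0], format="%Y-%m-%d")

# The .dt accessor exposes the individual components
parts = pd.DataFrame({
    "year": dates.dt.year,
    "month": dates.dt.month,
    "day": dates.dt.day,
})
print(parts)

# If plain Python date objects are preferred (this gives an object column)
python_dates = dates.dt.date
```

Keeping the column as datetime and using .dt when needed is usually more convenient than storing Python date objects, because datetime columns support vectorized comparisons and resampling.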
Upvotes: 3