HamidL07

Reputation: 1

How to convert a column of datetime with different format to a specific one?

Hi, I am still a beginner with Python and pandas. I have a DataFrame with a timestamp column. The values in the column come in two formats, as shown below:

2021-09-07 16:36:14 UTC 
2021-09-04 15:31:44 UTC
2021-07-15 06:49:47.320081 UTC
2021-09-07 14:55:55.353145 UTC

I would like to keep only the date and time, without the trailing UTC text and without the fractional seconds, and then save the DataFrame to a CSV file. Basically, I want the column in this format:

2021-09-07 16:36:14 
2021-09-04 15:31:44
2021-07-15 06:49:47
2021-09-07 14:55:55

I tried these two lines:

df['timestamp'] = pd.to_datetime(df['timestamp'], format='%Y-%m-%d %H:%M:%S %Z', errors='coerce')
df['timestamp'] = df['timestamp'].dt.strftime('%Y-%m-%d %H:%M:%S')

This fixes half of the problem. The timestamps without fractional seconds are converted, but the ones with fractional seconds come out empty, as in the example below:

2021-09-07 16:36:14 
2021-09-04 15:31:44

Can someone please help me with this problem?

Upvotes: 0

Views: 116

Answers (3)

Freddy Mcloughlan

Reputation: 4496

Try dateutil's parser, as it accepts different formats as input:

from dateutil import parser

# parser.parse takes a single string, so apply it element-wise:
# df['timestamp'] = df['timestamp'].apply(parser.parse)
date = parser.parse("2021-07-15 06:49:47.320081 UTC")

print(date)
2021-07-15 06:49:47.320081+00:00

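Or, formatted without the fractional seconds and the timezone:

# Which would imply
# df['timestamp'] = df['timestamp'].apply(lambda s: parser.parse(s).strftime("%Y-%m-%d %H:%M:%S"))
# ("%F %T" is a shorthand for the same format, but it is not available on every platform)
print(date.strftime("%Y-%m-%d %H:%M:%S"))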
2021-07-15 06:49:47
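
To illustrate why a single fixed format cannot match both shapes of timestamp, here is a minimal sketch that uses only the standard library. It tries a format with fractional seconds first and falls back to one without. The helper name `parse_timestamp` and the `FORMATS` tuple are hypothetical, introduced only for this example, and the sample values are the ones from the question.

```python
from datetime import datetime

# Candidate formats, tried in order (an assumption: only these two shapes occur).
FORMATS = (
    "%Y-%m-%d %H:%M:%S.%f %Z",  # with fractional seconds
    "%Y-%m-%d %H:%M:%S %Z",     # without fractional seconds
)

def parse_timestamp(text):
    """Hypothetical helper: return a datetime for the first format that matches."""
    text = text.strip()  # tolerate stray trailing spaces
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognised timestamp: {text!r}")

samples = [
    "2021-09-07 16:36:14 UTC ",
    "2021-09-04 15:31:44 UTC",
    "2021-07-15 06:49:47.320081 UTC",
    "2021-09-07 14:55:55.353145 UTC",
]

# Reformat every value to drop the fractional seconds and the UTC suffix.
cleaned = [parse_timestamp(s).strftime("%Y-%m-%d %H:%M:%S") for s in samples]
print(cleaned)
```

On a DataFrame, a helper like this could be passed to `df['timestamp'].apply(...)`, although it will likely be slower than the vectorised string approaches for large columns.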

Upvotes: 1

Corralien

Reputation: 120391

You can take the first 19 characters:

df['timestamp'] = pd.to_datetime(df['timestamp'].str[:19])
print(df)

# Output
            timestamp
0 2021-09-07 16:36:14
1 2021-09-04 15:31:44
2 2021-07-15 06:49:47
3 2021-09-07 14:55:55

If you want to keep the timezone information (UTC), you can remove only the microsecond part:

df['timestamp'] = pd.to_datetime(df['timestamp'].str.replace(r'\.\d+', '', regex=True))
print(df)

# Output
                  timestamp
0 2021-09-07 16:36:14+00:00
1 2021-09-04 15:31:44+00:00
2 2021-07-15 06:49:47+00:00
3 2021-09-07 14:55:55+00:00
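
If the values should be parsed as UTC first and only then shown without the offset, one possible sketch is to parse them as timezone-aware and drop the timezone afterwards with `dt.tz_localize(None)`. The DataFrame below is rebuilt from the question's sample values, and the intermediate names `stripped` and `aware` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"timestamp": [
    "2021-09-07 16:36:14 UTC",
    "2021-09-04 15:31:44 UTC",
    "2021-07-15 06:49:47.320081 UTC",
    "2021-09-07 14:55:55.353145 UTC",
]})

# Remove the fractional seconds so every string has the same shape.
stripped = df["timestamp"].str.replace(r"\.\d+", "", regex=True)

# Parse as timezone-aware UTC timestamps.
aware = pd.to_datetime(stripped, utc=True)

# Drop the timezone but keep the UTC wall-clock time.
df["timestamp"] = aware.dt.tz_localize(None)
print(df)
```

The column ends up as naive `datetime64` values, so it prints without the `+00:00` suffix.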

Upvotes: 1

norie

Reputation: 9857

Try extracting the part of the field you want.

df['timestamp'] = pd.to_datetime(df['timestamp'].str[:19])
print(df)
print(df.dtypes)

            timestamp
0 2021-09-07 16:36:14
1 2021-09-04 15:31:44
2 2021-07-15 06:49:47
3 2021-09-07 14:55:55


timestamp    datetime64[ns]
dtype: object
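
The question also asks to save the result to a CSV file. A minimal sketch of that last step follows, assuming the column has already been converted as above. It writes to an in-memory buffer so the example is self-contained; in practice a file path such as the hypothetical 'output.csv' would be passed to `to_csv` instead.

```python
import io

import pandas as pd

raw = pd.Series([
    "2021-09-07 16:36:14 UTC",
    "2021-09-04 15:31:44 UTC",
    "2021-07-15 06:49:47.320081 UTC",
    "2021-09-07 14:55:55.353145 UTC",
])

# Keep the first 19 characters ("YYYY-MM-DD HH:MM:SS") and parse them.
df = pd.DataFrame({"timestamp": pd.to_datetime(raw.str[:19])})

# Write without the index; date_format pins the text written for datetime columns.
buffer = io.StringIO()
df.to_csv(buffer, index=False, date_format="%Y-%m-%d %H:%M:%S")
csv_text = buffer.getvalue()
print(csv_text)
```

Passing `date_format` makes the written text explicit, so it does not depend on how pandas chooses to render datetime values by default.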

Upvotes: 1
