Reputation: 772
I have a list of strings formatted as follows: '2/24/2021 3:37:04 PM UTC-6'
How would I convert these to datetime objects?
I have tried
datetime.strptime(my_date_object, '%m/%d/%Y %I:%M:%S %p %Z')
but I get an error saying "unconverted data remains: -6"
Is this because of the UTC-6 at the end?
Upvotes: 1
Views: 275
Reputation: 25489
The approach that @MrFuppes mentioned in their comment is the easiest way to do this.
"Ok seems like you need to split the string on 'UTC' and parse the offset separately. You can then set the tzinfo from a timedelta."
import datetime

input_string = '2/24/2021 3:37:04 PM UTC-6'
try:
    dtm_string, utc_offset = input_string.split("UTC", maxsplit=1)
except ValueError:
    # Couldn't split the string, so no "UTC" in the string
    print("Warning! No timezone!")
    dtm_string = input_string
    utc_offset = "0"
dtm_string = dtm_string.strip()  # Remove leading/trailing whitespace: '2/24/2021 3:37:04 PM'
utc_offset = int(utc_offset)  # Convert the UTC offset to an integer: -6
tzinfo = datetime.timezone(datetime.timedelta(hours=utc_offset))
result_datetime = datetime.datetime.strptime(dtm_string, '%m/%d/%Y %I:%M:%S %p').replace(tzinfo=tzinfo)
print(result_datetime)
# prints 2021-02-24 15:37:04-06:00
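Since the question mentions a list of strings, the same steps could be wrapped in a function and applied to each element. This is only a minimal sketch: the name parse_utc_offset_string is made up for illustration, it assumes whole-hour offsets, and it assumes that a missing "UTC" marker means an offset of zero.

```python
import datetime


def parse_utc_offset_string(text):
    """Parse strings like '2/24/2021 3:37:04 PM UTC-6' into an aware datetime.

    Hypothetical helper: assumes whole-hour offsets and treats a missing
    "UTC" marker (or a bare "UTC") as an offset of zero.
    """
    # partition() never raises; the offset part is '' when "UTC" is absent
    dtm_string, _, utc_offset = text.partition("UTC")
    hours = int(utc_offset) if utc_offset.strip() else 0
    tz = datetime.timezone(datetime.timedelta(hours=hours))
    # %I and %p together take care of the 12-hour clock
    parsed = datetime.datetime.strptime(dtm_string.strip(), "%m/%d/%Y %I:%M:%S %p")
    return parsed.replace(tzinfo=tz)


dates = ["2/24/2021 3:37:04 PM UTC-6", "12/1/2021 12:05:00 AM UTC+5"]
results = [parse_utc_offset_string(s) for s in dates]
for r in results:
    print(r)
# prints 2021-02-24 15:37:04-06:00
# prints 2021-12-01 00:05:00+05:00
```

Using str.partition instead of split avoids the try/except, because it always returns three parts.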
Alternatively, you can avoid datetime.strptime altogether and extract the relevant components with a regular expression:
import datetime
import re

rex = r"(\d{1,2})\/(\d{1,2})\/(\d{4}) (\d{1,2}):(\d{2}):(\d{2}) (AM|PM) UTC(\+|-)(\d{1,2})"
input_string = '2/24/2021 3:37:04 PM UTC-6'
r = re.findall(rex, input_string)
# gives: [('2', '24', '2021', '3', '37', '04', 'PM', '-', '6')]
mm = int(r[0][0])
dd = int(r[0][1])
yy = int(r[0][2])
hrs = int(r[0][3]) % 12  # 12-hour clock: map 12 to 0 before applying AM/PM
mins = int(r[0][4])
secs = int(r[0][5])
if r[0][6].upper() == "PM":
    hrs = hrs + 12  # 12 PM stays 12, 3 PM becomes 15
tzoffset = int(f"{r[0][7]}{r[0][8]}")
tzinfo = datetime.timezone(datetime.timedelta(hours=tzoffset))
result_datetime = datetime.datetime(yy, mm, dd, hrs, mins, secs, tzinfo=tzinfo)
print(result_datetime)
# prints 2021-02-24 15:37:04-06:00
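One thing to watch with the manual approach is the 12-hour conversion: 12 PM has to stay 12 and 12 AM has to become 0, so simply adding 12 for "PM" is not enough. The sketch below is one way to handle that, using named groups and re.fullmatch so that a string that doesn't match raises a clear error instead of an IndexError. The names PATTERN and parse_with_regex are made up for illustration, and whole-hour offsets are assumed.

```python
import datetime
import re

# Hypothetical pattern: same structure as the one above, with named groups
PATTERN = re.compile(
    r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4}) "
    r"(?P<hour>\d{1,2}):(?P<minute>\d{2}):(?P<second>\d{2}) "
    r"(?P<meridiem>AM|PM) UTC(?P<offset>[+-]\d{1,2})"
)


def parse_with_regex(text):
    """Build an aware datetime from a string like '2/24/2021 3:37:04 PM UTC-6'."""
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"Unrecognised date string: {text!r}")
    # 12-hour to 24-hour: 12 AM -> 0, 12 PM -> 12, 3 PM -> 15
    hour = int(match["hour"]) % 12
    if match["meridiem"] == "PM":
        hour += 12
    # Assumes whole-hour offsets such as -6 or +5
    tz = datetime.timezone(datetime.timedelta(hours=int(match["offset"])))
    return datetime.datetime(
        int(match["year"]), int(match["month"]), int(match["day"]),
        hour, int(match["minute"]), int(match["second"]), tzinfo=tz,
    )


print(parse_with_regex("2/24/2021 3:37:04 PM UTC-6"))
# prints 2021-02-24 15:37:04-06:00
print(parse_with_regex("2/24/2021 12:00:00 PM UTC-6"))
# prints 2021-02-24 12:00:00-06:00
```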
The regular expression (\d{1,2})\/(\d{1,2})\/(\d{4}) (\d{1,2}):(\d{2}):(\d{2}) (AM|PM) UTC(\+|-)(\d{1,2})
Explanation:
(\d{1,2}) : One or two digits. The surrounding parentheses make this a capturing group. The same construct is used for the month, day, hours, and UTC offset.
\/ : A forward slash.
(\d{4}) : Exactly four digits. Also a capturing group. A similar construct, (\d{2}), is used for the minutes and seconds.
(AM|PM) : Either "AM" or "PM".
UTC(\+|-)(\d{1,2}) : "UTC", followed by a plus or minus sign, followed by one or two digits.
Upvotes: 1