logidelic

Reputation: 1695

chrono parse including timezone

I'm using @howard-hinnant's date library (now part of C++20) to parse a date string that includes a timezone abbreviation. The parse succeeds without errors, but the timezone appears to be ignored. For example:

        istringstream inEst{"Fri, 25 Sep 2020 13:44:43 EST"};
        std::chrono::time_point<std::chrono::system_clock, chrono::seconds> tpEst;
        inEst >> date::parse("%a, %d %b %Y %T %Z", tpEst);
        std::cout <<  chrono::duration_cast<chrono::milliseconds>( tpEst.time_since_epoch() ).count() << std::endl;

        istringstream inPst{"Fri, 25 Sep 2020 13:44:43 PST"};
        std::chrono::time_point<std::chrono::system_clock, chrono::seconds> tpPst;
        inPst >> date::parse("%a, %d %b %Y %T %Z", tpPst);
        std::cout <<  chrono::duration_cast<chrono::milliseconds>( tpPst.time_since_epoch() ).count() << std::endl;

        istringstream inGmt{"Fri, 25 Sep 2020 13:44:43 GMT"};
        std::chrono::time_point<std::chrono::system_clock, chrono::seconds> tpGmt;
        inGmt >> date::parse("%a, %d %b %Y %T %Z", tpGmt);
        std::cout <<  chrono::duration_cast<chrono::milliseconds>( tpGmt.time_since_epoch() ).count() << std::endl;

Produces the output:

1601041483000
1601041483000
1601041483000

Am I doing something wrong, or is the timezone info not used by the parser?

Upvotes: 2

Views: 1872

Answers (1)

Howard Hinnant

Reputation: 218890

Unfortunately there is no way to reliably and uniquely identify a time zone given just a time zone abbreviation. Some abbreviations are used by multiple time zones, sometimes even with different UTC offsets.

So in short, the time zone abbreviation is parsed, but does not identify a UTC offset which can be used to alter the parsed timestamp.

See "Convert a time zone abbreviation into a time zone" for code that attempts to at least narrow down which time zones are using a specific abbreviation at a given time.

Alternatively, if you parse a UTC offset ("%z" or "%Ez"), then that offset will be applied to the timestamp to convert it to a sys_time.


Fwiw, I ran each of these three examples through the find_by_abbrev overload taking local_time, described in that answer. The results are interesting in that they likely confirm the fragility of parsing time zone abbreviations:

"Fri, 25 Sep 2020 13:44:43 EST"

Could be any of these time zones:

2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC America/Atikokan 
2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC America/Cancun 
2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC America/Jamaica 
2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC America/Panama 
2020-09-25 13:44:43 EST 2020-09-25 18:44:43 UTC EST 

All of these have a UTC offset of -5h. So in that sense, the UTC equivalent is unique (2020-09-25 18:44:43 UTC as shown above). However, one has to wonder if America/Montreal was actually intended, which has a UTC offset of -4h and an abbreviation of EDT on this date.

"Fri, 25 Sep 2020 13:44:43 PST"

Has only one match!

2020-09-25 13:44:43 PST 2020-09-25 05:44:43 UTC Asia/Manila

This has a UTC offset of +8h. But I have to wonder if America/Vancouver was intended, which has a UTC offset of -7h and an abbreviation of PDT on this date.

If one knows the matching UTC offset for the abbreviations one will be parsing, one could parse into a local_time, parse the abbreviation, look up the UTC offset, and apply it to transform the local_time into a sys_time. This library makes it easy to parse the abbreviation along with the timestamp:

local_seconds tpEst;
std::string abbrev;
inEst >> date::parse("%a, %d %b %Y %T %Z", tpEst, abbrev);
sys_seconds tpUTC{tpEst - local_seconds{} - get_offset(abbrev)};

where get_offset(abbrev) is a custom map you've written to return offsets given a time zone abbreviation. Note that this wouldn't help if (for example) EDT (-4h) was intended but EST (-5h) was parsed.
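As a rough illustration of what such a map might look like, here is a minimal sketch. The table contents are an assumption made for this example (they hard-code the North American reading of each abbreviation, so "PST" means -8h here rather than the +8h of Asia/Manila), and to_sys is a hypothetical helper name, not part of the library. To keep the sketch free of any dependency beyond the standard library, the local time is passed in as a seconds count, which corresponds to tpEst - local_seconds{} in the snippet above.

```cpp
#include <chrono>
#include <string>
#include <unordered_map>

// Seconds-precision system_clock time_point (same shape as date::sys_seconds).
using sys_secs =
    std::chrono::time_point<std::chrono::system_clock, std::chrono::seconds>;

// Hypothetical lookup: abbreviation -> fixed UTC offset.
// The entries are assumptions for illustration only; pick the meanings that
// are correct for the data you actually expect to parse.
// Throws std::out_of_range for an abbreviation that is not in the table.
inline std::chrono::seconds get_offset(const std::string& abbrev)
{
    using namespace std::chrono;
    static const std::unordered_map<std::string, seconds> offsets{
        {"GMT", hours{0}},
        {"EST", -hours{5}},
        {"EDT", -hours{4}},
        {"PST", -hours{8}},
        {"PDT", -hours{7}},
    };
    return offsets.at(abbrev);
}

// Hypothetical helper: local time (as seconds since the local epoch) minus
// the UTC offset gives the UTC time point.
inline sys_secs to_sys(std::chrono::seconds local_since_epoch,
                       const std::string& abbrev)
{
    return sys_secs{local_since_epoch - get_offset(abbrev)};
}
```

With this table, the local time from the question (1601041483 seconds, i.e. 2020-09-25 13:44:43) tagged "EST" maps to 1601059483 seconds, which is 2020-09-25 18:44:43 UTC. An unknown abbreviation surfaces as an exception instead of silently being treated as UTC.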

Another possible strategy is to write a map of abbreviations to time zone names (instead of to offsets). For example, both "EST" and "EDT" could map to "America/Toronto", and then you could do:

local_seconds tpEst;
std::string abbrev;
inEst >> date::parse("%a, %d %b %Y %T %Z", tpEst, abbrev);
zoned_seconds zt{get_tz_name(abbrev), tpEst};
sys_seconds tpUTC = zt.get_sys_time();
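A sketch of what get_tz_name might look like follows. The function and its table are assumptions for illustration: the choice of zone for each abbreviation is application-specific, and the entries below simply pick one plausible North American zone per abbreviation pair.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical lookup: abbreviation -> IANA time zone name.
// Standard and daylight abbreviations deliberately map to the same zone, so
// the zone's own rules decide which offset applies on the parsed date.
// Throws std::out_of_range for an abbreviation that is not in the table.
inline const std::string& get_tz_name(const std::string& abbrev)
{
    static const std::unordered_map<std::string, std::string> names{
        {"EST", "America/Toronto"},
        {"EDT", "America/Toronto"},
        {"PST", "America/Vancouver"},
        {"PDT", "America/Vancouver"},
        {"GMT", "Etc/GMT"},
    };
    return names.at(abbrev);
}
```

Because the offset comes from the zone rather than from the abbreviation, a string that says "EST" on a date when Toronto is actually on EDT would still be converted with -4h. One thing to keep in mind is that constructing a zoned_time from a local_time throws if that local time is ambiguous or nonexistent in the zone (around daylight saving transitions), unless a choose value is passed.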

Upvotes: 5
