itscharlieb

Reputation: 313

Q: Gmail Api Returning Emails With InternalDates In the Future

I am attempting to use the Gmail API to synchronize all the emails from a user's Gmail inbox. I am using the Partial Synchronization technique described in Gmail's "Synchronizing Clients" [1] documentation. One of the listed limitations is that, in rare cases, the historyId of certain emails is unavailable. Under these circumstances, it is advised that the client fall back on "Full Synchronization", which states that the client should "retrieve and store as many of the most recent messages or threads as are necessary for your purpose".

This all makes sense. When I have issues with Partial Synchronization, I attempt to look through an inbox's messages by time range. To do this, I store a record of the ( emailAddress, historyId, internalDate ) of each email I sync, and when falling back on Full Synchronization I attempt to sync all email since the most recent internalDate that I have already synced.

My issue is that the cases that seem to cause Partial Synchronization to fail also seem to cause Full Synchronization to fail, and many of these cases are caused by emails with internalDates in the future (I can't share these examples for privacy reasons). The failure case seems to be something like the following:

  1. I sync email E with historyId H and an internalDate I some time in the future
  2. Some time passes
  3. I receive a push notification from Google indicating that there are new emails to sync
  4. I look up the most recent message that I have synced for this inboxId, finding email E
  5. I attempt a partial sync using the listHistory [2] endpoint with historyId H
  6. The listHistory request fails with a 404
  7. I attempt a full sync using the listMessages [3] endpoint using the query newer_than:{hours_since-internalDate-I}, but this request doesn't make any sense since the internalDate of this message is in the future.

I can imagine a few different solutions to this problem. Perhaps I should simply ignore these emails as spam, or perhaps I should store a timestamp of when I synced each email and then perform a Full Synchronization on the timestamp I have stored.
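A minimal sketch of the second option, assuming each stored record also carries a hypothetical syncedAt field (epoch milliseconds, written by the client at sync time). The function name and record shape are illustrative and not part of the Gmail API; the only idea shown is that the fallback anchor is never allowed to lie in the future.

```javascript
// Hypothetical record shape: { emailAddress, historyId, internalDate, syncedAt }.
// internalDate is the epoch-millisecond value returned by the Gmail API (a string);
// syncedAt is an assumed client-side timestamp, also in epoch milliseconds.
function fullSyncAnchorMs(records, nowMs) {
  if (records.length === 0) return null; // nothing synced yet

  let anchor = 0;
  for (const r of records) {
    // Trust internalDate only when it is not later than the moment we synced it.
    const effective = Math.min(Number(r.internalDate), r.syncedAt);
    anchor = Math.max(anchor, effective);
  }

  // Also guard against a stored value that is ahead of the current clock.
  return Math.min(anchor, nowMs);
}
```

With this, an email whose internalDate is a week ahead contributes its syncedAt instead, so the Full Synchronization window starts at a time that has actually passed.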

Either way, this seems like a bug in the Gmail API, as the internalDate should really be when Gmail received the email. I initially suspected that this might be caused by Gmail's new schedule feature and that the internalDate might be the future time for which the email was scheduled, but I confirmed that some of the examples I have are definitely emails that the user's inbox received, not sent. I'm really not sure what to make of this edge case in the internalDate field.

So my question is, what is the advised way to handle bogus future internalDates? And is it a bug?

  1. https://developers.google.com/gmail/api/guides/sync
  2. https://developers.google.com/gmail/api/v1/reference/users/history/list
  3. https://developers.google.com/gmail/api/v1/reference/users/messages/list

Upvotes: 2

Views: 1925

Answers (1)

Rafa Guillermo

Reputation: 15357

If you're sure this is a bug, you can head to Google's Issue Tracker (template here) and report it so their engineering team can investigate the cause. Alternatively, if this persists with other mails or users, you can open a support ticket directly with them by going to your admin dashboard and selecting 'Contact Support' in the ? menu in the top right. This way Google can look into the erroneous internalDates without you having to post any potentially sensitive data in a public forum.

In the meantime, you can work around this dynamically by making sure that you don't fetch mails with a time in the future (pseudo-code):

var now = Math.floor(new Date().getTime() / 1000) // seconds since the epoch
var q = "newer_than:1h before:" + now

GmailServiceConnect.Users.messages.list(userId = "[email protected]", q = q).execute()

Note that getTime() returns milliseconds, as does internalDate, whereas the before: and after: search operators take seconds since the epoch, hence the division by 1000.
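As a rough sketch, the query could be built by a small helper. Gmail search accepts after: and before: values as seconds since the epoch, while Date.getTime() and internalDate are in milliseconds, so the helper converts both bounds. buildFullSyncQuery and its parameters are hypothetical names chosen for illustration.

```javascript
// Builds a Gmail search string bounded on both sides, in epoch seconds.
// lastSyncedMs: last trusted timestamp in epoch milliseconds (assumed input).
// nowMs: current time in epoch milliseconds, e.g. Date.now().
function buildFullSyncQuery(lastSyncedMs, nowMs) {
  const nowSec = Math.floor(nowMs / 1000);
  // Clamp the lower bound so a future-dated message cannot push it past "now".
  const afterSec = Math.min(Math.floor(lastSyncedMs / 1000), nowSec);
  return "after:" + afterSec + " before:" + nowSec;
}

// Example: the resulting string would be passed as the q parameter of messages.list.
const q = buildFullSyncQuery(1500000, 2000999);
```

Clamping the lower bound means a stored future internalDate degrades to an empty window rather than an inverted one; whether to widen that window is a choice for the client.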

Upvotes: 3
