Product | Gmail & Outlook email reply tracking |
Expert(s) | David Zhou (CRM team) |
Slack channel | |
This article was last verified on | 05/05/2024 |
Articles in This Section
Please use the following list to see additional internal articles regarding email reply tracking:
- (Internal) Email Reply Tracking: Overview
- (Internal) Email Reply Tracking Bugs (you are here)
This is a playbook for debugging when a sequence continues to send after a candidate replies. (In other words, the sequence was incorrectly not marked as replied.)
This is one of the most serious categories of bugs, but at the same time, these bug reports are user error the vast majority of the time. It's imperative that you very quickly determine whether this is user error or a real bug, and escalate to engineering as soon as possible if you think there's even a small chance that this is a real bug.
If a Gem employee reports that a sequence isn't marked as replied, and it is still before the next stage's scheduled send time, this may not be a real issue.
See "Gem-specific reply tracking issue" below for more details.
When this playbook applies
This playbook was originally written with Gmail in mind. (That is, the sending user/the person with the ZenSourcer account is on Gmail.) If the sending user is on Outlook, refer to the Outlook section below.
Another warning: if the sender of the sequence was a ZenSourcer employee, then everything is slightly different and this playbook won't apply (because of how prod and beta interact with each other and our push cursors).
Our reply-detection logic
A received email message will mark a candidate as replied if:
- The email is not from bot@zensourcer.com
- The email is not from the user's email address, or any address they have as an alias in Gmail
- One of the following is true:
- The in-reply-to header of the reply matches the sent_message_id of a message in our sent_message table for the current user†.
- The Gmail threadId of the reply matches the sent_gmail_thread_id of a message in our sent_message table for the current user†.
- The from header of the reply matches (case-insensitive, but dots do matter) the to_email of a message in our sent_message table for the current user†.
- The x-original-sender header of the reply matches (case-insensitive) the to_email of a message in our sent_message table for the current user†.
- † "For the current user" means that sent_on_behalf_of_user_id is None and user_id matches, or that sent_on_behalf_of_user_id matches.
It's worth noting that once a message is categorized as a "reply", we mark the candidate as replied. This distinction is meaningful because we could have detected a reply based on an old sequence's message id or thread id, but even in this case, we will still mark the candidate as replied if there happens to be an active sequence from the user to the candidate. In other words, any reply from candidate C detected in the inbox of user U will stop sequences between U and C.
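The rules above can be sketched in Python as follows. This is an illustrative sketch only, not the code in PushCursorSyncer.py. The SentMessage class and the detect_reply function are hypothetical, the from header is assumed to have already been reduced to a bare address, and the X_ORIGINAL_SENDER label is invented for illustration (the other three labels are the event_payload values mentioned later in this article).

```python
from dataclasses import dataclass
from typing import Iterable, Optional

BOT_ADDRESS = "bot@zensourcer.com"


@dataclass
class SentMessage:
    # Hypothetical container; field names mirror the sent_message columns above.
    user_id: int
    sent_on_behalf_of_user_id: Optional[int]
    sent_message_id: str
    sent_gmail_thread_id: str
    to_email: str


def is_for_user(msg: SentMessage, user_id: int) -> bool:
    """The dagger rule: does this sent_message row count as 'for the current user'?"""
    if msg.sent_on_behalf_of_user_id is None:
        return msg.user_id == user_id
    return msg.sent_on_behalf_of_user_id == user_id


def detect_reply(headers: dict, thread_id: str, user_id: int,
                 user_addresses: Iterable[str],
                 sent_messages: Iterable[SentMessage]) -> Optional[str]:
    """Return the name of a matching rule, or None if the message is not a reply."""
    # Assumption: headers["from"] is a bare address with no display name.
    sender = headers.get("from", "").lower()
    if sender == BOT_ADDRESS:
        return None
    if sender in {address.lower() for address in user_addresses}:
        return None
    original_sender = headers.get("x-original-sender", "").lower()
    in_reply_to = headers.get("in-reply-to")
    for msg in sent_messages:
        if not is_for_user(msg, user_id):
            continue
        if in_reply_to and in_reply_to == msg.sent_message_id:
            return "IN_REPLY_TO"
        if thread_id == msg.sent_gmail_thread_id:
            return "THREAD_ID"
        # Case-insensitive comparison; dots are kept, so they still matter.
        if sender == msg.to_email.lower():
            return "FROM_EMAIL"
        if original_sender and original_sender == msg.to_email.lower():
            return "X_ORIGINAL_SENDER"  # hypothetical label
    return None
```

Walking a reported reply through a sketch like this by hand, using the sent_message rows from Numeracy and the reply's headers, is usually enough to tell whether any rule could have matched.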
Support playbook
Because we want to debug these issues as quickly as possible, we recommend doing most of the digging yourself, rather than waiting a long time for responses from the user.
Support should:
- Ask the user for:
- the sequence name
- the person's name and email
- the email address they replied from
- who they replied to (sometimes the user who contacts support is not the same user!)
- the date and time of the reply
- Next up, look up the following information:
- Use this to look up the sequence_id and person_id (you can get this from assuming the user, looking at the urls, and base64 decoding)
- Use this to look up the person_sequence_info_id (from the person_sequence_info table in Numeracy)
- Confirm that replied_timestamp is null. If it's not, then the user might be wrong, and maybe we did catch the reply. (See "replied_timestamp is not null," below.)
- Next, use Numeracy to find all sent_message rows matching this person_sequence_info_id. This will tell you how many messages we sent, when, and their to_email/sent_gmail_id/sent_gmail_thread_id/sent_message_id/user_id/sent_on_behalf_of_user_id (which you'll need to know when evaluating our reply-detection logic rules, above).
- [Don't block on this step!] If it would be helpful, ask the user for the full mail headers of the reply. (In Gmail, click "…" and then "Show original," and have them copy/paste that entire page.) But again, it's better to get minimal information quickly rather than complete information more slowly.
- Use the "What happened to this message" query on the Push Cursor board in Honeycomb to search the message id and determine what processing step occurred
- Use GraphQL to look up the mail headers, and Gmail's ID for this email:
- Be VERY CAREFUL before running this! This is a very sensitive GraphQL endpoint, and we want to make sure nobody abuses it.
- The GraphQL query is:
query {
messagesByUser(userId: XXX, queryString: "XXX")
}
- Alternatively, query the all_email_metadata table directly:
select * from all_email_metadata where id in (select all_email_metadata_id from email_address_to_email_mapping where email='XXX' and team_id=XXX and role='SENDER') order by id asc;
- Warning: this table returns base 10-encoded gmail_id/gmail_thread_id, while everything else expects base 16-encoding, so either convert to hex before querying Honeycomb/etc, or use the message_id column of the results instead.
- Once you have the message, take note of its history_id and check it against this user's gmail_push_cursor information. If the message's history_id is less than last_push_history_id, then our processes wouldn't have ever attempted to sync this message. So the problem is that we are not getting Gmail push notifications for this user. There is currently no definitive explanation for why this happens.
- Search Honeycomb for this user ID and gmail_message_id. To do this, go to https://ui.honeycomb.io/zensourcer/datasets/heroku-logs/ and set the following search parameters:
- At the very top, change the time window from "Last 2 hours" to something that will cover the date the email was received (perhaps "Last 7 days").
- In the "Filter" box, add:
- event_type = push_cursor_syncer_run
- user_id = XXX
- gmail_message_id = XXX
- Then click "Run query"
- There should generally be 3 rows of results, with push_cursor_type set to DEFAULT, GREENHOUSE, and METADATA. We care only about the DEFAULT row.
- For the DEFAULT row, look at result_category and result_description.
- The possible categories are: error if we errored, noop if we think this message is not a sequence reply, and reply if we think it is a sequence reply.
- Try to figure out why the reply wasn't detected. The vast majority of the time, this is user error. (For instance, if the candidate replies in a brand-new thread without in-reply-to headers from a different email address, we can't catch it.)
- If you can't figure it out, escalate to the eng oncall immediately.
- Because of how high-priority this type of bug is, make sure that the oncall acknowledges the issue. Simply pinging them on Slack isn't enough: make sure they see your message, and respond to it to confirm that they're investigating. (If you're having trouble getting the attention of the oncall, feel free to page them by using the /page-oncall slash-command in the #eng channel in Slack.)
- Even if you think you've discovered the reason, you should send the results of your investigation to the eng oncall for a quick confirmation before we close out the issue as user error.
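For the base-10 versus base-16 warning on the all_email_metadata results above, the conversion can be done with Python built-ins. This is a minimal sketch; the function names are hypothetical, and it assumes the hex form expected elsewhere is lowercase with no 0x prefix.

```python
def gmail_id_to_hex(decimal_id) -> str:
    """Convert a base-10 gmail_id/gmail_thread_id (int or str) to lowercase hex."""
    return format(int(decimal_id), "x")


def gmail_id_to_decimal(hex_id: str) -> int:
    """Convert a hex Gmail id back to the base-10 form stored in the table."""
    return int(hex_id, 16)


print(gmail_id_to_hex("255"))  # prints "ff"
```

Python integers are arbitrary precision, so large IDs convert without overflow.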
TODO: document some common classes of user error, and what they look like
Situation: replied_timestamp is not null
Convert the timestamp to a human-readable time to figure out when we detected a reply.
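One way to do the conversion is sketched below. It assumes replied_timestamp is a Unix epoch value; whether the column stores seconds or milliseconds is not stated here, so the sketch guesses from the magnitude. The function name is hypothetical.

```python
from datetime import datetime, timezone


def to_human_readable(timestamp) -> str:
    """Render an epoch timestamp as a UTC date and time string."""
    value = float(timestamp)
    # Heuristic (assumption): values this large are milliseconds, not seconds.
    if value > 1e11:
        value /= 1000.0
    moment = datetime.fromtimestamp(value, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


print(to_human_readable(1714867200))  # prints "2024-05-05 00:00:00 UTC"
```

The output is in UTC, so convert to the user's time zone before comparing it with the time they say the reply was sent.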
If the timestamp is after they emailed support, it's possible that they forwarded the sequence thread to support@zensourcer.com, and then we replied (or Intercom automatically replied), and our logic above (in-reply-to header) caused the sequence to be marked as replied.
You can also check the tracking_event table in Numeracy, searching by person_sequence_info_id. Look for an event_type of REPLIED; the event_payload column should say what happened (one of IN_REPLY_TO, THREAD_ID, FROM_EMAIL, or if the user manually "marked as replied" then USER_MANUAL).
Situation: sequence was sent from an alias, but reply was received by a different user
If a user has a sequence that is sending as someone else, via a Gmail alias, but they aren't receiving emails to that alias in their mailbox, we will not be able to track replies correctly. We do try to alert users to this situation when picking an alias in the sequence wizard, but our detection of this situation is not perfect. You can recognize this situation when the Honeycomb logs show that we processed a reply from the candidate but the result was a noop.
Situation: all emails in a sequence bounced
Likely what happened is the user added a CC or BCC address that bounced and we marked the replies as bounced. Check one of the sent_messages for a CC/BCC address and then query for that address using the message query above. If you see bounced replies from the CC/BCC address, ask them to remove that address and attempt to resend the sequence.
Eng playbook
TODO: expand this section
Some more debugging tools available to eng:
- The relevant code to look at is PushCursorSyncer.py
- scripts/showthread.py will query the Gmail API for the entire thread, given a person_sequence_info_id
- scripts/showmessage.py will query the Gmail API for a message, given its gmail_message_id
- You can also query the all_email_metadata table, but querying the Gmail API directly (via GraphQL, see above) is probably better because it gives all headers, and because the default push cursor syncer and the metadata push cursor syncers are different, so it's possible only one of them processed the message. (Honeycomb logs should tell you if this happened; see above.)
Gem-specific reply tracking issue
Because users who work at Gem have accounts on both the beta and production tiers, there is an issue that occasionally comes up where pushes from Gmail are received by the beta tier, and not the production tier. This results in the sequence not being marked as replied until the next time something forces that user's push cursor to be synced (perhaps due to sending another sequence). This will not result in sequences continuing to send even after a reply is received, since when we try to send the next stage we will sync the user's push cursor, even if we haven't received Gmail pushes yet, which will process the reply.
If you look up the gmail_message_id in Honeycomb and the corresponding row where push_cursor_type is DEFAULT also has an env of beta, that is an indication that this is the root cause of the reply tracking bug you are investigating.
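The check above, together with the result categories from the support playbook, can be sketched as a small triage helper. This is illustrative only: it assumes the Honeycomb rows have been exported as a list of dictionaries keyed by the field names used in this article, and the function names and returned labels are hypothetical.

```python
from typing import Iterable, Optional


def find_default_row(rows: Iterable[dict]) -> Optional[dict]:
    """Pick the DEFAULT row, ignoring the GREENHOUSE and METADATA rows."""
    for row in rows:
        if row.get("push_cursor_type") == "DEFAULT":
            return row
    return None


def triage(rows: Iterable[dict]) -> str:
    """Map exported Honeycomb rows to a hypothetical next-step label."""
    row = find_default_row(rows)
    if row is None:
        return "no-default-row"
    # A DEFAULT row processed on beta points at the Gem-specific issue.
    if row.get("env") == "beta":
        return "gem-beta-push"
    category = row.get("result_category")
    if category == "reply":
        return "reply-detected"
    if category == "noop":
        return "not-treated-as-reply"
    if category == "error":
        return "syncer-error"
    return "unknown-category"
```

For anything other than "reply-detected", read result_description on the DEFAULT row before deciding whether to escalate.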
Outlook playbook
- The details in the support playbook above generally apply to Outlook as well as Gmail. Some deviations are called out below:
- When looking at Honeycomb logs, filter by event_type = msft_push_cursor_syncer_run and narrow down results with user_id. Some helpful fields to look at:
- time_diff_between_processing_and_msg lets you know the lag between when replies are received and when they're processed by Gem
- msft_thread_id matches sent_message.sent_gmail_thread_id and can be used to query the Microsoft API for the full conversation thread (more on this below)
- msft_message_id is the message id that can be used to query the Microsoft API for details on that specific email
- Once you know the thread/message IDs, you can query the Microsoft API using scripts/debug/msft_graph_api.py. Sample queries:
- Tip: remember to escape filter inputs using urllib.parse.quote
- python scripts/debug/msft_graph_api.py --user-id gem_user_id --api-url beta/me/messages?$filter=sender/emailAddress/address+eq+'escaped_email_address' -p
- Returns all messages from the email address to the Gem user. Lets you verify if the user actually received the reply that they claim to be missing. If they did not receive the reply, then they might have an inbox misconfiguration issue
- python scripts/debug/msft_graph_api.py --user-id gem_user_id --api-url beta/me/messages?$filter=conversationId+eq+'escaped_msft_thread_id' -p
- Returns all messages in the conversation thread. Use the value from sent_message.sent_gmail_thread_id
- If you are able to fetch the missing reply via the Microsoft API, the next step may be verifying if server/MsftPushCursorSyncer.py is processing it correctly
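The escaping tip above can be sketched as follows. The helper names are hypothetical; only urllib.parse.quote comes from the standard library, and the URL shapes are copied from the sample queries. Passing safe="" is a choice made for this sketch so that characters such as / and + inside a thread ID are escaped too.

```python
from urllib.parse import quote


def messages_from_sender_url(email_address: str) -> str:
    """Build the api-url value that lists messages sent by one address."""
    escaped = quote(email_address, safe="")
    return f"beta/me/messages?$filter=sender/emailAddress/address+eq+'{escaped}'"


def conversation_url(msft_thread_id: str) -> str:
    """Build the api-url value that lists every message in one conversation."""
    escaped = quote(msft_thread_id, safe="")
    return f"beta/me/messages?$filter=conversationId+eq+'{escaped}'"


print(quote("jane+doe@example.com", safe=""))  # prints "jane%2Bdoe%40example.com"
```

When pasting the result into a shell command, quote the whole argument so the shell does not try to expand $filter.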