Making WordPress.org

#6920 closed defect (bug) (fixed)

Detect invalid plugin committer email addresses prior to sending bulk emails

Reported by: dd32 Owned by:
Milestone: Priority: low
Component: Plugin Directory Keywords:
Cc:

Description

The Plugins team sends out an email just before every WordPress release along the lines of "WordPress 6.2 is imminent! Are your plugins ready?". This email goes out to every current plugin committer.

The result of that email is a LOT of bounced emails. Usually these come from people who left a company or let a domain expire, and who never closed their WordPress.org account or were never removed as a committer of a plugin.

The bounce notification emails then go to the plugin directory helpscout instance, which quickly fills up with hundreds of emails that need to be mostly manually processed.

To combat this, it would be good to be able to validate existing committer email addresses prior to sending emails out, and when the plugin doesn't have any other committers, close the plugin automatically.

Validating email addresses is not a perfect science. Some aspects are easy to validate (Domain is valid, Domain has MX record), others are a little harder (Verify that the email server listed will accept email for the account), and some are downright impossible to detect (Email address is a forwarder, which forwards to a non-existent mailbox).
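As a rough illustration of the "easy" tier only, the sketch below classifies an address using a syntax check plus DNS results. It is not code from the plugin directory. The function name `cheap_email_check` and the two lookup callables are hypothetical, and the DNS lookups are passed in as parameters because the Python standard library has no MX resolver. When a domain has no MX record, RFC 5321 has senders fall back to the domain's address record, so that case is reported separately rather than treated as dead.

```python
from typing import Callable, Sequence


def cheap_email_check(
    address: str,
    lookup_mx: Callable[[str], Sequence[str]],
    has_address: Callable[[str], bool],
) -> str:
    """Classify an address using only inexpensive checks.

    lookup_mx(domain) returns the MX hostnames for a domain (empty if none);
    has_address(domain) says whether the domain has an A/AAAA record.
    Both are supplied by the caller (hypothetical helpers).

    Returns "malformed", "dead-domain", "no-mx", or "plausible".
    """
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or "." not in domain:
        return "malformed"

    domain = domain.lower()
    if lookup_mx(domain):
        return "plausible"

    # No MX record: mail may still be delivered to the address record,
    # so this is weaker evidence than a domain that does not resolve at all.
    if has_address(domain):
        return "no-mx"
    return "dead-domain"


# Usage with in-memory stand-ins for DNS (made-up data):
fake_mx = {"example.org": ["mx.example.org"]}
fake_hosts = {"example.org", "nomx.example"}

result = cheap_email_check(
    "someone@example.org",
    lookup_mx=lambda d: fake_mx.get(d, []),
    has_address=lambda d: d in fake_hosts,
)
```

"plausible" only means the cheap checks found nothing wrong. Whether the mailbox exists is the harder tier described above.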

A Google for 'Email validation checker' will return a multitude of online services which can do this.
We should consider either using such a service or running some of the simpler checks directly, to reduce the number of emails we ultimately send and, with it, the number of bounces the plugins team needs to manually process.

This will also ensure that when we need to contact a plugin author for some reason, we actually have a method of communication.

Change History (9)

#1 follow-up: @nosilver4u
18 months ago

As someone who has done similar things to prevent customers from entering invalid emails, I'd say email validation services are very hit and miss, and also a bit expensive.
As noted, the idea of MX validation is a good minimum and easy to implement. Testing for an SMTP connection is a little trickier, but the main issue I've run into (in 2-3 years) is having the validation server blocked by the recipient server using "clever" spam protection methods.
In order to avoid that, I've found it useful to maintain a list of "known good" email domains AND MX records that don't need further verification. For example, with folks using hosted gmail or iCloud (especially the latter) with custom domains, verifying that their MX matches the standard domains for those services is about as good as you can get.
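A minimal sketch of the "known good MX" idea, assuming the MX hostnames for a domain have already been looked up. The suffix list is illustrative only and is not a list provided in this thread. The name `mx_is_known_good` is hypothetical.

```python
# Illustrative suffixes only (an assumption, not a maintained list).
KNOWN_GOOD_MX_SUFFIXES = (
    ".google.com",       # hosted Gmail, e.g. aspmx.l.google.com
    ".googlemail.com",
    ".mail.icloud.com",  # iCloud custom domains, e.g. mx01.mail.icloud.com
)


def mx_is_known_good(mx_hosts):
    """Return True if every MX host belongs to a trusted mail provider.

    An empty list is never considered good: with no MX records there is
    nothing to match against.
    """
    hosts = [h.rstrip(".").lower() for h in mx_hosts]
    return bool(hosts) and all(h.endswith(KNOWN_GOOD_MX_SUFFIXES) for h in hosts)
```

Requiring every MX host to match, rather than any one of them, keeps a domain with a stray self-hosted fallback server out of the skip list. Whether that strictness is wanted is a judgment call.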
Happy to provide any code samples or lists if you like!

#2 @dufresnesteven
18 months ago

Does it make sense to list the strategies by effort/value? Was there any data recorded on the outcomes of a specific release? I.e., how many were because of an invalid domain, an email typo, etc.?

#3 in reply to: ↑ 1 @dd32
18 months ago

Replying to nosilver4u:

As someone who has done similar things to prevent customers from entering invalid emails, I'd say email validation services are very hit and miss, and also a bit expensive.

Thanks for the feedback @nosilver4u! That's very helpful to know.

Was there any data recorded on the outcomes of a specific release?

@dufresnesteven not exactly, although that's a very good question to ask. Searching HelpScout has revealed these error messages:

  • 87x "The email account that you tried to reach does not exist." (ie. Domain + MX records exist, user does not).
  • 16x "User unknown" (Same as above)
  • 32x "Recipient address rejected" (Same as above, but also see:)
  • 15x "550 5.4.1 Recipient address rejected: Access denied" - that's from mail.protection.outlook.com which appears to be a "Domain doesn't accept email from your domain" error or "Email service is suspended".
  • 68x "Host or domain name not found." - ie. Domain no longer exists, includes some outlook "No longer uses our services" errors.

That's ~220 bounces, there's probably another major error code in here that I haven't spotted.
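The grouping above could be automated with simple substring matching. The sketch below is an assumption about how that might look, not the matching used by any existing script. The category names and the function `classify_bounce` are made up for illustration. Pattern order matters: the Outlook "Access denied" variant has to be tested before the generic "Recipient address rejected" text it also contains.

```python
# (substring, category) pairs, checked in order; the text is lowercased first.
BOUNCE_PATTERNS = [
    ("access denied", "rejected-by-policy"),
    ("host or domain name not found", "dead-domain"),
    ("does not exist", "unknown-user"),
    ("user unknown", "unknown-user"),
    ("recipient address rejected", "unknown-user"),
]


def classify_bounce(text):
    """Map the text of a bounce notification to a coarse category."""
    lowered = text.lower()
    for needle, category in BOUNCE_PATTERNS:
        if needle in lowered:
            return category
    # Anything else (auto-responders, unfamiliar errors) needs a human.
    return "unclassified"
```

Messages that fall through to "unclassified" are the ones most likely to hide the other major error code mentioned above.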

I agree that validating that users' email inboxes are accessible might be a step too far if it's not reliable enough. But even if just the roughly 30% of bounces caused by non-existent domains (68 of ~220) didn't have to be handled manually, that would be a benefit.

It might be a case where a better solution is to just keep adding more automation to the "we got a bounce! Close it" process than pre-empting it.

#4 follow-up: @alanfuller
18 months ago

The primary issue you stated is helpscout rapidly filling up?

What about having a periodic 'check' email, e.g. every 6 months, that doesn't feed HelpScout?

"Hey, your email as committer of xyz is abc@…. Is this still correct? If not, please log in and change it."

Then you can read and process hard bounces, auto remove committers with hard bounces and close plugins with no committers.

#5 in reply to: ↑ 4 @dd32
18 months ago

Replying to alanfuller:

The primary issue you stated is helpscout rapidly filling up?

Pretty much. The emails go out a few weeks before each major WordPress release, and processing the bounces is effectively manual. So it's 3-4 times a year with 200-400 bounce emails each time.

Due to some shortcomings in our email systems, processing the bounces in the way you're suggesting isn't entirely viable, or I didn't think it was.

An alternative to pre-empting the bounces would indeed be to have a script process incoming HelpScout emails (as we already do) and perform actions based on each email at that point in time.

One of the problems we have is that quite often the bounce doesn't directly identify the destination email address (or is confused as to what it was, as it got redirected elsewhere), and due to HelpScout's shortcomings, adding additional data into the Bounce Address header isn't viable. What I mean by shortcomings is that you can't use plus addressing (ie. plugins+bounce-$userid@).
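For readers unfamiliar with the technique, this is roughly what plus addressing in the bounce address would look like on a mail system that supports it (which, per the above, HelpScout does not). The domain `example.org` and both function names are placeholders for illustration. The real mailbox domain is not given in this ticket.

```python
import re

# Matches e.g. plugins+bounce-1234@example.org
_BOUNCE_RE = re.compile(r"^[^+@]+\+bounce-(?P<user_id>\d+)@.+$")


def bounce_address(mailbox, domain, user_id):
    """Build a per-recipient bounce address that encodes the user ID."""
    return f"{mailbox}+bounce-{user_id}@{domain}"


def user_id_from_bounce_address(address):
    """Recover the user ID from a bounce address, or None if absent."""
    match = _BOUNCE_RE.match(address.strip())
    return int(match.group("user_id")) if match else None
```

Because the ID travels in the envelope address, the bounce identifies the recipient even when the bounce body names a forwarded or rewritten address.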

I was really hoping to avoid the emails ever going to HelpScout to be honest, but using it as our email-to-PHP handler might not be the worst outcome.

#6 @dd32
18 months ago

Given the above discussion, there are a few options:

  1. Detect invalid/expired domains (plus potentially those with no mail server defined - ie no MX records and the A not having port 25 open) prior to emailing
  2. ^ plus attempting to send an email via RCPT TO and see if it's accepted - This is likely too much, and likely to get us future email problems if a server feels it's spam or malicious. There's a not insignificant risk of false positives
  3. Script it, either have a Meta dev run a bin script or run it on the post-email-receive hook, let the email bounce but have the majority handled automatically.

I'm leaning towards #3 - That would mean we wouldn't need to work on this until the next bulk email is sent, but we can likely simulate bounces (or have it use the previous bounces) to check it works as expected.
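To make option 3 concrete, here is a sketch of the decision step such a script might take once bounces have been matched to users: revoke commit access for a bounced committer, and close the plugin when no reachable committer would remain. This is an assumption about the logic, not an existing implementation. The function name, action labels, and data shapes are all hypothetical.

```python
def plan_bounce_actions(plugin_committers, bounced_users):
    """Decide what to do for each plugin after a bulk email.

    plugin_committers: dict mapping plugin slug -> set of committer usernames
    bounced_users:     set of usernames whose email bounced

    Returns a list of (action, slug, user) tuples; user is None for closures.
    """
    actions = []
    for slug in sorted(plugin_committers):
        committers = plugin_committers[slug]
        bounced = committers & bounced_users
        if not bounced:
            continue
        if bounced == committers:
            # Nobody left that we can contact: close the plugin.
            actions.append(("close-plugin", slug, None))
        else:
            # At least one reachable committer remains: only revoke access.
            for user in sorted(bounced):
                actions.append(("revoke-commit", slug, user))
    return actions


# Usage with made-up data:
plugins = {"alpha": {"ann", "bob"}, "beta": {"bob"}, "gamma": {"cat"}}
planned = plan_bounce_actions(plugins, bounced_users={"bob"})
```

Returning a plan instead of acting directly makes it easy to replay previous bounces as a dry run, in line with the simulation idea above.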

Last edited 18 months ago by dd32

#7 @dd32
15 months ago

In 12760:

Plugin Directory: Add a bin script to process email bounces in HelpScout.

This bin script can be run after bulk emails to automatically revoke commit or close plugins as required when the authors emails are bouncing.

Data is logged to allow plugin reviewers to see why an action was taken, and the Author Note (visible to plugin committers) explains the next steps if their plugin was closed.

See #6920.

#8 @dd32
15 months ago

In 12762:

Plugin Directory: Fix a typo in [12760].

Props frantorres.
See #6920.

#9 @dd32
15 months ago

  • Resolution set to fixed
  • Status changed from new to closed

I'm closing this as fixed with the above.

  • Email bounces can be processed with the bin script above after bulk emails go out.
  • Email auto-responders are harder to automate. I've set up a HelpScout workflow to mark some of them as out-of-office, which makes them easier to handle. Others are often "This person no longer works for us" or "I don't respond to emails to this email address", which need to be handled by a human.