Opened 7 months ago

Last modified 3 days ago

#4126 accepted defect

"Special contributions" template leaks PII

Reported by: jonoaldersonwp
Owned by: tellyworth
Milestone: (none)
Priority: high
Component: Codex
Keywords: seo analytics privacy
Cc: (none)

Description

E.g., https://codex.wordpress.org/Special:Contributions/Jany2786@gmail.com

This template should have a meta robots value of 'noindex, follow'.

Change History (8)

#2 @Otto42
7 months ago

For reference, that isn't an email address; it's the username. Those are old spam accounts that used the same value for both email and username.

We no longer allow email addresses as usernames, and haven't for a few years. Usernames must be lowercase alphanumeric only.

7 months ago

This ticket was mentioned in Slack in #meta by tellyworth.

#4 @tellyworth
7 months ago

Can (should) we handle URLs with user=\w+@ in a special way? Force a 404 or 410, redact the address from the page, something like that? Just in case there are any ancient non-spam addresses in there.
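A minimal sketch of the detection and redaction idea above, assuming Python purely for illustration (the Codex itself is not implemented this way). The names `contribution_target`, `looks_like_email` and `redact_if_email`, the `[redacted]` placeholder, and the `user=` query form are all hypothetical. The pattern is stricter than `\w+@` so that it only matches values that look like a complete address.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, unquote, urlsplit

# Hypothetical pattern: local part, "@", then a dotted domain.
EMAIL_LIKE = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")

# Path prefix taken from the example URL in the ticket description.
CONTRIBUTIONS_PREFIX = "/Special:Contributions/"


def contribution_target(url: str) -> Optional[str]:
    """Return the username a Contributions URL refers to, if any."""
    parts = urlsplit(url)
    path = unquote(parts.path)
    if path.startswith(CONTRIBUTIONS_PREFIX):
        return path[len(CONTRIBUTIONS_PREFIX):]
    # Assumed alternative form: the name passed as a "user" query parameter.
    values = parse_qs(parts.query).get("user")
    return values[0] if values else None


def looks_like_email(name: str) -> bool:
    """True when the whole username has the shape of an email address."""
    return EMAIL_LIKE.fullmatch(name) is not None


def redact_if_email(name: str) -> str:
    """Replace email-shaped usernames with a placeholder before display."""
    return "[redacted]" if looks_like_email(name) else name
```

A request handler could call `looks_like_email(contribution_target(url))` to decide whether to return a 404 or 410, or pass the name through `redact_if_email` before rendering it.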

#5 @jonoaldersonwp
7 months ago

Hmm, we should probably avoid doing anything clever with these URLs at request time. We can definitely control indexing of these (types of) URLs, though. Separately, I plan to keep them out of Google Analytics etc. by doing some housekeeping in Google Tag Manager before the tracking scripts fire.

#6 @jonoaldersonwp
5 months ago

For clarity, this still needs noindex'ing.

#7 @tellyworth
3 days ago

  • Owner set to tellyworth
  • Status changed from new to accepted

What's the solution here? Noindex all Special: pages? Just Contributions and Log? Is it specific to those with @ in the URL?

#8 @jonoaldersonwp
3 days ago

Let's noindex anything starting with https://codex.wordpress.org/Special:Contributions/ - I don't see any useful/valuable (landing) pages in that set.
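The prefix rule above can be sketched as follows. This is an illustration in Python under stated assumptions, not the Codex's actual implementation: the function names `robots_directive_for` and `meta_robots_tag` are hypothetical, and the directive value 'noindex, follow' is the one requested in the ticket description.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit

CODEX_HOST = "codex.wordpress.org"
NOINDEX_PREFIX = "/Special:Contributions/"
ROBOTS_VALUE = "noindex, follow"  # value requested in the ticket description


def robots_directive_for(url: str) -> Optional[str]:
    """Return the robots directive for a Codex URL, or None if no rule applies."""
    parts = urlsplit(url)
    if parts.hostname != CODEX_HOST:
        return None
    # Decode percent-escapes so an encoded ":" still matches the prefix.
    if unquote(parts.path).startswith(NOINDEX_PREFIX):
        return ROBOTS_VALUE
    return None


def meta_robots_tag(url: str) -> str:
    """Render the meta tag for the page head; empty string when not needed."""
    directive = robots_directive_for(url)
    if directive is None:
        return ""
    return f'<meta name="robots" content="{directive}">'
```

Matching on the path prefix alone covers every username, with or without an @, which avoids having to inspect the name itself.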
