Making WordPress.org

Opened 21 months ago

Closed 21 months ago

Last modified 19 months ago

#4753 closed enhancement (fixed)

Obfuscate profile links

Reported by: jonoaldersonwp
Owned by:
Milestone:
Priority: high
Component: Profiles
Keywords: seo


Changing the way in which profile links are output is part of a broader approach to 'solving' spam profiles on wordpress.org. See https://www.jonoalderson.com/wordpress/wordpress-org-toxic-profile-spam/ for long-form research and rationale.

This part of the solution requires us to:

Change History (9)

#1 @Ipstenu
21 months ago

Would it be feasible to only disallow profiles that are blocked or flagged? We have the ability to mark an account as blocked, and logically that's the most common spammy profile we'll see.

Your example of profiles.wordpress.org/phathaiophunhuan/ is a banned account, so logically that should show ... nothing?

This ticket was mentioned in Slack in #forums by joyously.

21 months ago

#3 @jonoaldersonwp
21 months ago

No, crawl prevention can only be handled via robots.txt, and it'd be impractical (and technically challenging) to adapt that dynamically based on bad profiles. I'm also not sure if/how this would be advantageous for us.

RE: banned accounts; as per my post, I expected some of the examples I highlighted to have been 'fixed' since I discovered them. See also, #4632

Last edited 19 months ago by SergeyBiryukov

#4 @dd32
21 months ago

I personally disagree with using something like /out/ as a redirect; however, I acknowledge that it's an easy way to deal with the years of spam links that we've accumulated and are otherwise unable to detect.

As part of the other profiles changes, I've hidden the URL for banned accounts which helps a little bit.

Would using https://profiles.wordpress.org/out-redirect/$user suffice here though? AFAICT it doesn't need to be signed, or otherwise secret as long as it's within a directory that's not-a-user (such as the user 'out') and that can be blocked in the robots.txt?

#5 @jonoaldersonwp
21 months ago

Yep! :)

That example is fine - we're good as long as we're:

  • Obfuscating the URL so that automated spam tools can't (as easily) detect that their link is on the page.
  • Using a structure which we can disallow search engines from following in the robots.txt file

We could even simplify to something like /redirect_profile_link.php?user=$user, if that's preferable.
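To make the two requirements above concrete, here is a minimal sketch of how an obfuscated link could be built. It is only an illustration: the function names are hypothetical, the /out-redirect/ path is the one floated earlier in this thread, and none of this is the actual wordpress.org code.

```python
from html import escape
from urllib.parse import quote

# Assumptions for illustration only: the base URL is the profiles site,
# and the prefix is the "/out-redirect/" path proposed in the thread.
PROFILES_BASE = "https://profiles.wordpress.org"
REDIRECT_PREFIX = "/out-redirect/"


def redirect_href(user_slug: str) -> str:
    """Return the on-site redirect URL for a user's website link.

    The user's real website URL never appears in the result, so a tool
    scanning the page for its own domain will not find it in the href.
    """
    return f"{PROFILES_BASE}{REDIRECT_PREFIX}{quote(user_slug, safe='')}"


def profile_link_html(user_slug: str, label: str) -> str:
    """Render the anchor tag (hypothetical helper).

    No title attribute is emitted, since that would expose the target URL.
    """
    href = escape(redirect_href(user_slug), quote=True)
    return f'<a href="{href}" rel="nofollow">{escape(label)}</a>'


print(profile_link_html("dd32", "Website"))
```

Because every redirect lives under a single path prefix, one Disallow rule in robots.txt covers all of them.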

#6 @dd32
21 months ago

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in r15376-dotorg.

#7 @dd32
21 months ago

$ curl -s https://profiles.wordpress.org/dd32/ | grep 'dd32.id.au'
Website: <strong><a href="https://profiles.wordpress.org/website-redirect/dd32" title="https://dd32.id.au/" rel="nofollow">dd32.id.au</a></strong>

$ curl -Is --referer https://profiles.wordpress.org/dd32/ https://profiles.wordpress.org/website-redirect/dd32 | grep location
location: https://dd32.id.au/

$ curl https://profiles.wordpress.org/robots.txt
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /website-redirect/
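
The robots.txt rules shown above can also be checked offline. The following is a small sketch using Python's standard urllib.robotparser with the rules copied from the curl output; the may_crawl helper is a hypothetical name, and real search engines may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt output above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /website-redirect/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, agent: str = "*") -> bool:
    """Return True if the parsed rules let `agent` fetch `url` (hypothetical helper)."""
    return parser.can_fetch(agent, url)


# The profile page itself stays crawlable; the redirect URL does not.
print(may_crawl("https://profiles.wordpress.org/dd32/"))
print(may_crawl("https://profiles.wordpress.org/website-redirect/dd32"))
```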

#8 @jonoaldersonwp
21 months ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Could you remove the title attribute from the link, too, please?

#9 @dd32
21 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed