Making WordPress.org

Opened 5 years ago

Closed 5 years ago

Last modified 4 years ago

#4753 closed enhancement (fixed)

Obfuscate profile links

Reported by: jonoaldersonwp
Owned by:
Milestone:
Priority: high
Component: Profiles
Keywords: seo
Cc:

Description

Changing the way in which profile links are output is part of a broader approach to 'solving' spam profiles on wordpress.org. See https://www.jonoalderson.com/wordpress/wordpress-org-toxic-profile-spam/ for long-form research and rationale.

This part of the solution requires us to:

Change History (9)

#1 @Ipstenu
5 years ago

Would it be feasible to only disallow profiles that are blocked or flagged? We have the ability to mark an account as blocked, and logically that's the most common spammy profile we'll see.

Your example of profiles.wordpress.org/phathaiophunhuan/ is a banned account, so logically that should show ... nothing?

This ticket was mentioned in Slack in #forums by joyously.


5 years ago

#3 @jonoaldersonwp
5 years ago

No, crawl prevention can only be handled via robots.txt, and it'd be impractical (and technically challenging) to adapt that dynamically based on bad profiles. I'm also not sure if/how this would be advantageous for us.

Re: banned accounts: as per my post, I expected some of the examples I highlighted to have been 'fixed' since I discovered them. See also #4632.

Last edited 4 years ago by SergeyBiryukov

#4 @dd32
5 years ago

I personally disagree with using something like /out/ as a redirect; however, I acknowledge that it's an easy way to deal with the years of accumulated spam links that we're otherwise unable to detect.

As part of the other profiles changes, I've hidden the URL for banned accounts which helps a little bit.

Would using https://profiles.wordpress.org/out-redirect/$user suffice here though? AFAICT it doesn't need to be signed or otherwise secret, as long as it's within a directory that's not-a-user (such as the user 'out') and that can be blocked in the robots.txt?
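To make the proposal concrete, here is a minimal sketch of how such an unsigned redirect path could be resolved. It is not the code that runs on profiles.wordpress.org: the `WEBSITES` mapping, the `resolve_redirect` helper, and the 302 status code are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical data: username -> website URL saved on the profile.
WEBSITES: Dict[str, str] = {"dd32": "https://dd32.id.au/"}

# Path prefix proposed in this comment; it must not collide with a username.
REDIRECT_PREFIX = "/out-redirect/"


def resolve_redirect(path: str, websites: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (status, location) for a request path.

    The 302 status is an assumption; the ticket does not say which
    redirect status the real endpoint uses.
    """
    if not path.startswith(REDIRECT_PREFIX):
        return 404, None
    user = path[len(REDIRECT_PREFIX):].strip("/")
    target = websites.get(user)
    if not target:
        return 404, None
    return 302, target


if __name__ == "__main__":
    print(resolve_redirect("/out-redirect/dd32", WEBSITES))
    print(resolve_redirect("/out-redirect/nobody", WEBSITES))
```

Because the lookup is keyed only on the username, nothing in the path needs to be signed: the redirect can only ever point at the URL already stored on that profile.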

#5 @jonoaldersonwp
5 years ago

Yep! :)

That example is fine - we're good as long as we're:

  • Obfuscating the URL so that automated spam tools can't (as easily) detect that their link is on the page.
  • Using a structure that we can disallow search engines from following in the robots.txt file.

We could even simplify to something like /redirect_profile_link.php?user=$user, if that's preferable.
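The first requirement above could look roughly like this when the profile page is rendered. This is a sketch under assumptions: `profile_website_link` is a hypothetical helper, the `/out-redirect/` prefix is the one floated earlier in the thread, and the `rel="nofollow"` attribute and the caller-supplied label are illustrative choices rather than anything agreed here.

```python
from html import escape
from urllib.parse import quote


def profile_website_link(user: str, label: str, prefix: str = "/out-redirect/") -> str:
    """Build an anchor whose href is the redirect path, not the target URL.

    Only the username goes into the href, so the stored website URL
    never appears in the link's destination.
    """
    href = prefix + quote(user, safe="")
    return f'<a href="{escape(href, quote=True)}" rel="nofollow">{escape(label)}</a>'


if __name__ == "__main__":
    print(profile_website_link("dd32", "Website"))
```

Whether the target stays hidden from scrapers also depends on the visible label and any other attributes: if either contains the domain, a tool scanning the page text can still find it.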

#6 @dd32
5 years ago

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in r15376-dotorg.

#7 @dd32
5 years ago

$ curl -s https://profiles.wordpress.org/dd32/ | grep 'dd32.id.au'
Website: <strong><a href="https://profiles.wordpress.org/website-redirect/dd32" title="https://dd32.id.au/" rel="nofollow">dd32.id.au</a></strong>

$ curl -Is --referer https://profiles.wordpress.org/dd32/ https://profiles.wordpress.org/website-redirect/dd32 | grep location
location: https://dd32.id.au/

$ curl https://profiles.wordpress.org/robots.txt
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /website-redirect/
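The same robots.txt check can be run offline. The sketch below feeds the rules shown above into Python's standard `urllib.robotparser` and asks whether a generic crawler may fetch the redirect URL and the profile page. The variable names are illustrative, and real search engines may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt body as shown in the curl output above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /website-redirect/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://profiles.wordpress.org"

# May a generic crawler follow the redirect link or fetch the profile itself?
redirect_allowed = parser.can_fetch("*", BASE + "/website-redirect/dd32")
profile_allowed = parser.can_fetch("*", BASE + "/dd32/")

if __name__ == "__main__":
    print("redirect crawlable:", redirect_allowed)
    print("profile crawlable:", profile_allowed)
```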

#8 @jonoaldersonwp
5 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Could you remove the title attribute from the link, too, please?

#9 @dd32
5 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed