Making WordPress.org

Opened 6 months ago

Last modified 6 months ago

#7672 new defect (bug)

Alter sensitive data replacement mechanism

Reported by: jonoaldersonwp Owned by:
Milestone: Priority: normal
Component: Support Forums Keywords: seo
Cc:

Description (last modified by jonoaldersonwp)

When users submit forum threads with sensitive data (system file paths, etc.), I believe that we automatically detect those strings and replace them with 'xxx' and similar.

E.g.,

This has the unfortunate side-effect of making wordpress.org rank highly in Google for variations of 'xxx'; in the last month we got ~80,000 clicks for such terms.

This is problematic, as there's an obvious mismatch in intent. That might lead Google to believe that our site provides a poor user experience, which could negatively impact performance domain-wide.

To address this, I suggest that we:

  1. Entirely remove (rather than replace) sensitive strings in URLs
  2. Replace sensitive strings in titles and body with [REDACTED] (rather than xxx and similar)
  3. Retrospectively apply 2 (but not 1, to avoid breaking URLs) to existing forum threads.
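
As a rough illustration of points 1 and 2 only, here is a minimal Python sketch. It assumes the sensitive strings have already been identified by some other step; the function names `redact_text` and `build_slug` and the slug rules are hypothetical, and this is not how WordPress.org or bbPress actually generates titles or slugs.

```python
import re

PLACEHOLDER = "[REDACTED]"  # replacement suggested in point 2


def redact_text(text, sensitive):
    """Replace each sensitive string in a title or body with the placeholder."""
    for s in sensitive:
        text = text.replace(s, PLACEHOLDER)
    return text


def build_slug(title, sensitive):
    """Remove sensitive strings entirely (point 1), then slugify what is left.

    The slug rules here (lowercase, alphanumeric runs joined by hyphens)
    are an assumption made for this illustration.
    """
    for s in sensitive:
        title = title.replace(s, " ")
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


title = "Error in /home/alice/public_html/wp-config.php on line 3"
sensitive = ["/home/alice/public_html/wp-config.php"]  # hypothetical input

print(redact_text(title, sensitive))  # title/body: placeholder inserted
print(build_slug(title, sensitive))   # URL: sensitive part dropped
```

With this split, the visible title keeps a marker showing that something was removed, while the URL carries no placeholder term at all.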

Change History (5)

#1 @jonoaldersonwp
6 months ago

  • Description modified (diff)

#2 @jonoaldersonwp
6 months ago

  • Description modified (diff)

#3 @ocean90
6 months ago

  • Component changed from International Forums to Support Forums

The x is just a typical replacement character used by the users.

#4 @dd32
6 months ago

Yeah, this isn't something WordPress.org does.

Most forum mods use [redacted] or removed for privacy.

I'm not sure there's anything really to do here, without proactively changing all instances of xxx in all user-generated content to be abc or multiplication signs, neither of which seems appropriate to me.

Perhaps we could look at adding sensitive data redaction which replaces /home/[^/]+/ with an ellipsis for logged-out users (and emails, and maybe even some URLs). We could also look at not having -xxx- in slugs.
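
A minimal Python sketch of that idea, for illustration only. The `/home/[^/]+/` pattern is the one mentioned above; the email pattern, the choice of `…/` as the replacement, the `x{3,}` slug rule, and the function names are all assumptions, not existing WordPress.org code.

```python
import re

# Pattern from the comment above: a home directory plus the account name.
HOME_PATH = re.compile(r"/home/[^/]+/")
# Deliberately simple email pattern (an assumption, not a full RFC 5322 matcher).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Slug segments made only of three or more 'x' characters (an assumption).
X_SEGMENT = re.compile(r"x{3,}")


def redact_for_logged_out(text, logged_in):
    """Return text unchanged for logged-in users, redacted for everyone else."""
    if logged_in:
        return text
    text = HOME_PATH.sub("…/", text)  # hide the home directory and account name
    text = EMAIL.sub("…", text)       # hide email addresses
    return text


def strip_placeholder_from_slug(slug):
    """Drop hyphen-separated slug segments that are just runs of 'x'."""
    parts = [p for p in slug.split("-") if not X_SEGMENT.fullmatch(p)]
    return "-".join(parts)


print(redact_for_logged_out("Warning in /home/alice/public_html/index.php", False))
print(strip_placeholder_from_slug("my-site-xxx-is-down"))
```

Redacting at display time for logged-out visitors leaves the stored post untouched, so moderators and the original poster would still see the full text.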

#5 @jonoaldersonwp
6 months ago

Darn, that indeed makes things trickier!

Matching a few of those common patterns (/home/[^/]+/, etc) would definitely help, and would reduce the surface area considerably.
