Making WordPress.org

Opened 2 months ago

Last modified 5 weeks ago

#5168 new defect

Noindex stale plugin peripherals

Reported by: jonoaldersonwp
Owned by:
Milestone:
Priority: high
Component: Support Forums
Keywords: seo
Cc:

Description

We have rules in place to automatically apply a meta robots _noindex_ directive to 'stale' plugins (those which exceed a recency/support threshold, and those which are intentionally closed).

However, peripheral support content relating to these plugins is _not_ currently noindex'd, which leads to significant bloat. We should make the behaviour of these pages consistent with the 'parent' plugin page, using the same criteria and thresholds.

Example; in the case of this plugin - https://wordpress.org/plugins/wc-multi-tiered-shipping/ - the following (non-exhaustive list of) URLs should also be noindex'd:

Change History (13)

This ticket was mentioned in Slack in #forums by jonoaldersonwp. View the logs.


2 months ago

#2 follow-up: @Clorith
6 weeks ago

We still want support areas to be indexable here though, as users may still be seeking help with their plugins, even if said plugin is old (since the age of a plugin doesn't mean it won't work, after all), no?

So even if we're not actively promoting the plugin itself, users looking for an existing answer should be able to find it. (Especially given that we use Google search internally on w.org, it becomes doubly important not to exclude content?)

#3 in reply to: ↑ 2 @dd32
6 weeks ago

  • Keywords close added

Replying to Clorith:

We still want support areas to be indexable here though, as users may still be seeking help with their plugins, even if said plugin is old (since the age of a plugin doesn't mean it won't work, after all), no?

So even if we're not actively promoting the plugin itself, users looking for an existing answer should be able to find it. (Especially given that we use Google search internally on w.org, it becomes doubly important not to exclude content?)

I tend to agree with this, and wish to close this ticket as wontfix based purely upon that. (Although there's also a massive technical reason why this isn't simple to do.)

#4 @jonoaldersonwp
6 weeks ago

  • Keywords close removed
  • Priority changed from normal to high

Hmm. How about we remove the 'specific thread' example from this, and retain the rest (which are all index/archive pages)?

Strategically, this is quite important; it accounts for a significant amount of bloat on the site, and is impacting all other pages/performance.

#5 @dd32
6 weeks ago

  • Keywords close added

Removing the archives will impact discoverability of said threads, right?

I don't see how removing just the archives will help anything, as threads will outnumber archives by a large factor. At best, it'll remove a few plugin archives and review archives.

I could see an argument for deindexing the /unresolved and /active indexes; however, that had already been done prior to this ticket being made, so I'm not sure why you included them.

#7 @jonoaldersonwp
6 weeks ago

  • Keywords close removed

"Removing the archives will impact discoverability of said threads, right?"
Nope; we're just noindex'ing those specific URLs.

Threads already have noindex'ing logic and conditions. But at the moment, every single plugin we have creates many individual archive URLs (including paginated states), all of which bloat the site significantly.

#8 @dd32
6 weeks ago

Thanks!

I'm going to throw this into the wontfix bucket again though from my perspective.

I don't see deindexing these providing any benefit to the average user, or to the imaginary search budget.

#9 follow-up: @jonoaldersonwp
6 weeks ago

Frankly, I find it more than a little frustrating that you're dismissive of this purely because you don't like (or understand? or care about?) SEO. This is a real issue, whether you like it or not. I've invested time, energy and expertise in identifying it as such, and proposing a solution which I think is suitable.

I'm more than happy to accept that:

  • A fix might be technically complicated (perhaps prohibitively so), and/or;
  • We need to consider issues and side-effects with discoverability of valid content, and/or;
  • We need to consider the impact on internal site search.

However:

  • Our search budget problem isn't imaginary. Google currently has a queue of tens of thousands of URLs across the support forum which it hasn't even crawled, and multiples of that which it has crawled and deemed too low-value to index.
  • This is a serious problem, which affects the performance of the whole ecosystem.
  • Google being able to effectively crawl and index the site relates directly to the quality of the user experience of 'average users' - the millions of people who search around things which we have content for, as it determines how likely they are to end up on the best page for their query (or if wordpress.org shows up at all).
  • Internal site search is continually a point of contention because of how crap the results are. That's because of issues like this.

I'm asking you to take a leap of faith, and to assume that the work I've put into opening this ticket was significant, considered, and justified.

#10 in reply to: ↑ 9 @dd32
6 weeks ago

Replying to jonoaldersonwp:

Frankly, I find it more than a little frustrating that you're dismissive of this purely because you don't like (or understand? or care about?) SEO.

No, I'm not being dismissive of it because "I don't understand SEO".
I'm being dismissive of it because I believe keeping this content indexed provides more benefit to WordPress users than whatever detriment it may be causing to crawling. #5231 is an example of where I also believe that these changes have gone slightly too far.

#11 @Clorith
6 weeks ago

I'll chime in from the sidelines as a middle ground here: I think there's value in some of this.

As mentioned, I'd like to retain individual topics, and I think we may need the archive of the plugin's support forum for it to be discoverable as a place where users can post their questions (correct me if I'm wrong on that though).

But there are scenarios where removing any reference would be beneficial. Take https://wordpress.org/support/plugin/ask-question/ for example, which is very discoverable to users looking to ask a question about WordPress. The plugin itself has been closed to make it harder to land there by accident, but users still do, since the forum exists and a forum called "ask question" will obviously appeal to the end user.

So is there a middle ground we could hit which accommodates both sides of this?

#12 @jonoaldersonwp
6 weeks ago

Ok, baby steps, then. I propose that we noindex the following (for 'stale' plugins):

  • All paginated states of all archives.
  • Active, Inactive and Unresolved archives.
  • The FAQs page.

That leaves (the first page of) the support archive and the reviews archive indexed. This notably doesn't solve the above problem, but is a step in the right direction insomuch as it reduces the sheer volume of low-value and duplicate pages.
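
As a rough illustration only, the rules proposed above could be sketched like this in Python. Everything here is an assumption made for the example, not the actual wordpress.org implementation: the function name `should_noindex`, the `plugin_is_stale` flag (standing in for whatever criteria the parent plugin page already uses), the `/page/<n>/` pagination pattern, and the `inactive` and `faq` path segments are all hypothetical. Only the `/support/plugin/<slug>/` prefix and the `/active` and `/unresolved` views are taken from this ticket.

```python
# Hypothetical view segments that would be noindexed for stale plugins.
# "active" and "unresolved" appear in this ticket; "inactive" and "faq"
# are assumed names for the other pages listed in the proposal.
NOINDEX_VIEWS = {"active", "inactive", "unresolved", "faq"}


def should_noindex(path: str, plugin_is_stale: bool) -> bool:
    """Return True if a plugin support URL should carry a noindex directive.

    Assumed URL layout:
      /support/plugin/<slug>/              support archive (stays indexed)
      /support/plugin/<slug>/reviews/      reviews archive (stays indexed)
      /support/plugin/<slug>/<view>/       filtered views
      .../page/<n>/                        paginated states
    """
    if not plugin_is_stale:
        return False

    parts = [segment for segment in path.split("/") if segment]
    if len(parts) < 3 or parts[:2] != ["support", "plugin"]:
        # Not a plugin archive URL (for example, an individual thread),
        # so this sketch leaves it to the existing thread-level logic.
        return False

    rest = parts[3:]  # everything after the plugin slug

    # Any paginated state of any archive is noindexed.
    if len(rest) >= 2 and rest[-2] == "page" and rest[-1].isdigit():
        return True

    # Filtered views and the FAQ page are noindexed; the first page of
    # the support and reviews archives falls through and stays indexed.
    return bool(rest) and rest[0] in NOINDEX_VIEWS
```

With these assumptions, the first page of the support archive and of the reviews archive stays indexed, while their paginated states and the filtered views are noindexed, but only when the parent plugin is already considered stale.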

This ticket was mentioned in Slack in #forums by jonoaldersonwp. View the logs.


5 weeks ago
