Making WordPress.org

Opened 4 years ago

Closed 4 years ago

#5287 closed defect (bug) (fixed)

Disable WP core XML sitemaps

Reported by: jonoaldersonwp
Owned by: Otto42
Milestone:
Priority: normal
Component: General
Keywords: seo
Cc:

Description

WP 5.5 brings XML sitemaps to WP core. Specifically, the new feature creates sitemaps for public post types (and some secondary content types), and references them in the robots.txt file.

Unfortunately, due to the complexity of the wp.org ecosystem, the behaviour of the feature 'out of the box' is sub-optimal.

Specifically:

  • The sitemap index at https://wordpress.org/wp-sitemap.xml (and on Rosetta equivalents) is unrepresentative of wordpress.org's content; this will cause errors and issues with Google Search Console (and with Google's crawling/indexing/trust of the site).
  • The site already runs Jetpack's XML sitemaps, which do a good-enough job of listing the 'deep' parts of the site which have a particular problem with discoverability and freshness (plugins, themes, news).

Given this, I suggest that we disable the feature. Ideally, we'd extensively filter and modify the output, but that'd take considerable scoping and resource (which is needed elsewhere).

This can be managed via the wp_sitemaps_is_enabled filter, as described at https://wordpress.org/plugins/core-sitemaps/.

Change History (14)

This ticket was mentioned in Slack in #core by joyously.


4 years ago

#2 @desrosj
4 years ago

  • Keywords reporter-feedback added

Are there specific improvements that you would suggest for the current implementation? The description here is pretty vague and doesn't really identify specific problems.

Ideally, improvements should be made to the feature in core so everyone can benefit rather than just disabling it.

#3 @jonoaldersonwp
4 years ago

  • Keywords reporter-feedback removed

The feature is intentionally feature-light, and aims to act as a simple discovery mechanism for simple sites.

WP.org is far from simple, and even scoping and describing the myriad edge-cases and challenges we have would take me many dozens of hours; never mind the resource and effort required to actually resolve them.

It's been the expectation since conception that more complex sites - and in particular those with an active concern over SEO - should use an SEO plugin to handle nuances in sitemap generation. WP.org currently uses Jetpack to reasonable effect.

In the long-term, it's our expectation that some/most SEO plugins will adapt their XML sitemap generation to build on top of the new functionality.

Today, right now, we're causing errors and SEO issues by not disabling this.

#4 @ocean90
4 years ago

  • Keywords close added
  • Priority changed from high to normal

We shouldn't have to disable it. wp.org is actually a good platform for testing such new features in the wild.

I'm also pretty sure that every plugin, including Jetpack, will soon either disable the core sitemaps or migrate their features to core's implementation. Either way, no changes should be required right now.


> The sitemap index at https://wordpress.org/wp-sitemap.xml (and on Rosetta equivalents) is unrepresentative of wordpress.org's content;

I'm not seeing any differences between Jetpack's and core's sitemaps when it comes to the linked content. What issues are you seeing in detail?

#5 @jonoaldersonwp
4 years ago

  • Keywords close removed

wp.org is not by any means a suitable candidate for testing, and not representative of the types of sites which should be relying on core sitemaps.

Right now, we're experiencing errors and warnings in Google Search Console as a result of providing erroneous/incorrect sitemaps. We're providing incorrect signals to search engines about the structure of our content and sites, lowering trust in our sitemaps, and harming our SEO. This needs to be addressed as a matter of urgency.

If we're expecting this feature to be disabled in the near future (by Jetpack), then there's even _less_ reason not to disable it now.


#6 @joostdevalk
4 years ago

I agree with @jonoaldersonwp. Please just fix this.

#7 @ocean90
4 years ago

> wp.org is not by any means a suitable candidate for testing,

It is, which is why it even runs on trunk.

You still have not answered the question about the actual issues and why they can't be resolved without having to disable everything while another plugin should still be allowed to ship the same sitemaps.

> Please just fix this.

🤦‍♂️

#8 @swissspidy
4 years ago

> What issues are you seeing in detail?

The one big difference I see is that https://wordpress.org/robots.txt lists all Jetpack sitemaps for the sub sites, and the core one doesn't.

If that is a Jetpack specific feature, then Jetpack would probably disable the core feature.
If it's a custom feature on dotorg, it should probably be adapted accordingly.

#9 @kraftbj
4 years ago

The next version of Jetpack, shipping first week of July, disables the Core sitemap. We wanted to wait until it landed in Core before doing so, and since the filter to disable changed after the feature merge, it wouldn't have mattered if we were earlier. :)

#10 @jonoaldersonwp
4 years ago

wordpress.org is the marketing front of WordPress, and the effectiveness of its SEO impacts our much-lauded % market share. Running code which causes errors and compromises our SEO (on top of the myriad other issues we have) directly harms WordPress' users and potential users, and occludes the many other issues we have.

We don't have time for me to spend days listing out wp.org-specific sitemap issues. Nor does my doing so achieve anything, other than to waste my evening, and to reassure you that I'm not making up a problem that doesn't exist. Can we skip that stage, and fast-forward to implementing what should be a trivial fix, for a serious live issue?

Yes, Jetpack's sitemaps are also far from perfect, and replicate some of the same issues (largely because of how borky wp.org is), but having one set of known problems is better than having two sets of problems.

Anecdotally, the core sitemap behaviour on Rosetta sites is particularly broken (e.g., https://fr.wordpress.org/themes/wp-sitemap.xml shouldn't exist, and references URLs which trigger a redirect link), and the Rosetta sites themselves seriously harm our SEO.

It's imperative that we plug this leak and then go about testing/improving (in a non-production environment which isn't tied to the success of the whole project).

#11 @kraftbj
4 years ago

I mentioned this in Slack, but I think a mu-plugin to disable this is fine. It would happen anyhow when I update Jetpack in ~10 days.

add_filter( 'wp_sitemaps_is_enabled', '__return_false' );

I don't have commit access to mu-plugins on wp.org to add it, so would need someone else.

#12 @Otto42
4 years ago

  • Owner set to Otto42
  • Status changed from new to accepted

#13 @Otto42
4 years ago

Sitemaps disabled with a mu-plugin in dotorg:16205.

For reference, the filter actually in the core code for 5.5 is wp_sitemaps_enabled, not wp_sitemaps_is_enabled.
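
A minimal sketch of what such a mu-plugin could look like, using the corrected filter name given here. The file name and header comment are hypothetical, and this is not the actual code committed in dotorg:16205:

```php
<?php
/**
 * Plugin Name: Disable core XML sitemaps (hypothetical example)
 * Description: Turns off the XML sitemaps introduced in WordPress 5.5.
 *
 * Assumed location: wp-content/mu-plugins/disable-core-sitemaps.php
 */

// 'wp_sitemaps_enabled' is the filter name in the 5.5 core code.
// '__return_false' is a core helper that simply returns false.
add_filter( 'wp_sitemaps_enabled', '__return_false' );
```

Files in the mu-plugins directory load automatically, with no activation step. With the filter returning false, core should stop serving wp-sitemap.xml and stop adding its sitemap reference to robots.txt, while Jetpack's sitemaps are unaffected.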

#14 @Otto42
4 years ago

  • Resolution set to fixed
  • Status changed from accepted to closed