Opened 4 years ago
Closed 2 years ago
#5806 closed defect (bug) (fixed)
Remove legacy robots.txt disallow directives
Reported by: | jonoaldersonwp | Owned by: | |
---|---|---|---|
Milestone: | | Priority: | low |
Component: | General | Keywords: | seo |
Cc: | | | |
Description
The wp.org and Rosetta site robots.txt files (such as https://wordpress.org/robots.txt and https://tah.wordpress.org/robots.txt) prevent crawling of the wp-admin directory.
These rules should be removed, for the following reasons:
- Many templates (e.g., https://tah.wordpress.org) serve various critical JS resources (e.g., jQuery) from a wp-admin path (e.g., https://tah.wordpress.org/wp-admin/load-scripts.php?c=0&load%5Bchunk_0%5D=jquery-core,jquery-migrate,wp-embed&ver=5.9-alpha-51321).*
  - This causes SEO issues, as Google is unable to render the page in an equivalent manner to a human user (see https://yoast.com/dont-block-css-and-js-files/, https://moz.com/blog/why-all-seos-should-unblock-js-css).
- They're redundant, as responses to requests for (protected pages in) wp-admin paths always include a meta robots noindex directive (see https://yoast.com/wordpress-robots-txt-example/).

*A separate ticket will alter where these resources are loaded from and how they're served.
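
The blocking effect described above can be reproduced locally. The following is a minimal sketch using Python's standard-library urllib.robotparser; it is not how Googlebot evaluates rules, and the User-agent line, the can_crawl helper, and the "Googlebot" agent string are assumptions added for illustration. Only the two directives and the script URL come from this ticket.

```python
from urllib.robotparser import RobotFileParser

# The two legacy directives quoted in this ticket. The User-agent line is an
# assumption, added so the parser has a group to attach the rules to.
LEGACY_ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
"""

# Script URL taken from the ticket description.
SCRIPT_URL = (
    "https://tah.wordpress.org/wp-admin/load-scripts.php"
    "?c=0&load%5Bchunk_0%5D=jquery-core,jquery-migrate,wp-embed"
    "&ver=5.9-alpha-51321"
)


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Hypothetical helper: may `agent` fetch `url` under this robots.txt text?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The jQuery bundle lives under /wp-admin/, so the Disallow rule covers it.
    print(can_crawl(LEGACY_ROBOTS_TXT, SCRIPT_URL))
```

Note that urllib.robotparser applies the first matching rule rather than the most specific one, so its verdict on /wp-admin/admin-ajax.php may differ from Google's; the sketch only relies on the load-scripts.php case.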
Changes
- On Rosetta sites' robots.txt files (e.g., https://fr.wordpress.org/robots.txt), remove the following lines:
  Disallow: /wp-admin/
  Allow: /wp-admin/admin-ajax.php
- On https://wordpress.org/robots.txt, remove only the following lines:
  Disallow: /wp-admin/
  Allow: /wp-admin/admin-ajax.php
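
As a rough illustration of the proposed change, the sketch below strips exactly those two lines from a robots.txt body and leaves everything else untouched. The strip_legacy_rules helper and the sample file contents (including the /example-private/ rule) are hypothetical; the real files may contain other directives that are not shown in this ticket.

```python
# The two lines this ticket proposes to remove.
LEGACY_RULES = {
    "Disallow: /wp-admin/",
    "Allow: /wp-admin/admin-ajax.php",
}

# Hypothetical file contents, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /example-private/
"""


def strip_legacy_rules(robots_txt: str) -> str:
    """Return the robots.txt text without the two legacy wp-admin lines."""
    kept = [
        line
        for line in robots_txt.splitlines()
        if line.strip() not in LEGACY_RULES
    ]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    print(strip_legacy_rules(SAMPLE_ROBOTS_TXT), end="")
```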
Change History (3)
These are the current core robots.txt directives; please follow up with a Core ticket if these are not appropriate for all WordPress sites.
I'll take a middle-ground approach here and specifically allow access to /wp-admin/load-*.php, because I do know that some WordPress.org themes trigger some WordPress functionality on the front-end that other sites wouldn't.
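
To see how such a wildcard Allow rule would interact with the existing directives, here is a minimal sketch of the longest-match evaluation described in RFC 9309 (the most specific matching rule wins, and `*` matches any run of characters). The rule list is an assumption inferred from the comment above, not a copy of the deployed file, and pattern_to_regex and is_allowed are hypothetical helpers. A custom matcher is used because Python's urllib.robotparser does prefix matching only and does not expand `*`.

```python
import re

# Assumed resulting rule set: the core directives plus the new Allow rule.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("allow", "/wp-admin/load-*.php"),
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")  # a trailing '$' anchors the end of the URL
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Longest matching pattern wins; on a tie, Allow beats Disallow."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is crawlable.
    return True if best is None else best[1]


if __name__ == "__main__":
    for path in (
        "/wp-admin/load-scripts.php?c=0&load%5Bchunk_0%5D=jquery-core",
        "/wp-admin/admin-ajax.php",
        "/wp-admin/options.php",
    ):
        print(path, is_allowed(path))
```

Under these assumed rules, the load-scripts.php and admin-ajax.php paths are crawlable while the rest of wp-admin stays disallowed.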