Opened 22 months ago
Last modified 22 months ago
#6766 new enhancement
Update developer.wordpress.org/robots.txt to prevent spam attacks
| Reported by: | jonoaldersonwp | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | low |
| Component: | General | Keywords: | seo performance has-patch changes-requested |
| Cc: | | | |
Description
The developer site is the target of spam attacks that abuse its internal site search.
This consumes our crawl budget and floods our Google Search Console account (potentially blinding us to other issues).
We can reduce the impact by updating the site's robots.txt rules as follows, to block search URL patterns (and add some best practices whilst we're there).
    # Prevent crawling of WP internals
    # --------------------------------
    User-agent: *
    Disallow: /wp-admin/
    Disallow: /?rest_route=
    Disallow: /xmlrpc.php

    # Prevent crawling of search URLs
    # --------------------------------
    Disallow: /?s=
    Disallow: /search/
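As a rough sanity check, the proposed rules can be run through Python's standard-library `urllib.robotparser` to see which URLs they would block. This is only a sketch: the `ExampleBot` user agent, the helper names, and the sample paths are made up for illustration. The standard-library parser does plain prefix matching, so real crawlers (Googlebot in particular) may interpret the rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the ticket description. Blank lines are left out on
# purpose: Python's parser treats a blank line as the end of a rule group,
# which would drop the search rules from the "User-agent: *" group.
PROPOSED_RULES = """\
# Prevent crawling of WP internals
# --------------------------------
User-agent: *
Disallow: /wp-admin/
Disallow: /?rest_route=
Disallow: /xmlrpc.php
# Prevent crawling of search URLs
# --------------------------------
Disallow: /?s=
Disallow: /search/
"""

BASE = "https://developer.wordpress.org"


def build_parser(rules: str) -> RobotFileParser:
    """Parse a robots.txt body held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


def is_blocked(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the proposed rules disallow `path` for `agent`.

    `ExampleBot` is a hypothetical crawler name used only for this check.
    """
    return not build_parser(PROPOSED_RULES).can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # Hypothetical sample paths: two search-spam shapes, two WP internals,
    # and one ordinary documentation page that should stay crawlable.
    samples = (
        "/?s=cheap+pills",
        "/search/cheap-pills/",
        "/wp-admin/",
        "/?rest_route=/wp/v2/posts",
        "/reference/functions/get_posts/",
    )
    for path in samples:
        state = "blocked" if is_blocked(path) else "allowed"
        print(f"{state:8} {path}")
```

A check like this only confirms that the search patterns are matched and that ordinary pages are not. It does not settle which of the proposed rules should actually ship.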
Change History (3)
This ticket was mentioned in PR #121 on WordPress/wordpress.org by @tellyworth.
22 months ago
#1
- Keywords has-patch added
#2
@
22 months ago
- Keywords changes-requested added
@jonoaldersonwp: @dd32 pointed out in code review that part of this contradicts an earlier ticket #5806
https://github.com/WordPress/wordpress.org/pull/121#discussion_r1109379789
How should that be resolved? Can this be simplified to only include the search URLs?
See https://meta.trac.wordpress.org/ticket/6766
This adds new rules to both https://wordpress.org/robots.txt and https://developer.wordpress.org/robots.txt.
Before, main site:
After, main site:
Before, developer:
After, developer: