#5938 closed defect (bug) (fixed)
Add an x-robots-tag header to md5/sha1 etc file URLs
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | lowest |
| Component: | WordPress.org Site | Keywords: | seo |
| Cc: | | | |
Description
We expose lots of URLs like this across wordpress.org:
https://wordpress.org/wordpress-5.6.3.zip.sha1
https://wordpress.org/wordpress-4.0.5.tar.gz.md5
https://br.wordpress.org/wordpress-4.7-pt_BR.zip.md5
These consume considerable crawl resources and often trip the 'soft 404' warning in Google Search Console.
We should manage this by adding an x-robots-tag header, with a value of noindex, follow, to all responses for URLs ending in .sha1, .md5 and similar.
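To make the proposed rule concrete, here is a minimal sketch of the suffix matching in shell. It is not the actual wordpress.org server configuration, which this ticket does not show. The function name robots_tag_for is hypothetical, and only .sha1 and .md5 are covered because the "and similar" extensions are not listed here.

```shell
#!/bin/sh
# Hypothetical helper (not the real server config): prints the X-Robots-Tag
# value a checksum URL should get, and prints nothing for any other URL.
robots_tag_for() {
    # Drop any query string or fragment before matching on the suffix.
    path=${1%%[?#]*}
    case "$path" in
        *.sha1|*.md5) printf '%s\n' 'noindex, follow' ;;
        *) ;;  # .zip, .tar.gz and everything else: no header
    esac
}

# Try the rule against URLs mentioned in this ticket.
for url in \
    https://wordpress.org/wordpress-5.6.3.zip.sha1 \
    https://wordpress.org/wordpress-4.0.5.tar.gz.md5 \
    https://wordpress.org/wordpress-5.6.3.zip
do
    printf '%s -> %s\n' "$url" "$(robots_tag_for "$url")"
done
```

Matching on the final extension only means the downloadable archives themselves (.zip, .gz) stay untouched, while their checksum companions get the header.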
Change History (4)
#2
@
4 years ago
Ah, good question!
We can indeed safely ignore zip and gz links.
Yes please for downloads!
#3
@
4 years ago
- Component changed from General to WordPress.org Site
- Resolution set to fixed
- Status changed from new to closed
Added.
$ curl -Is https://wordpress.org/wordpress-5.6.3.zip.sha1 | grep -i 'x-robots-tag'
x-robots-tag: noindex, follow
$ curl -Is https://wordpress.org/wordpress-5.6.3.zip | grep -i 'x-robots-tag'
// No output
$ curl -Is https://downloads.wordpress.org/release/en_AU/latest.zip.sha1 | grep -i 'x-robots-tag'
x-robots-tag: noindex, follow
$ curl -Is https://br.wordpress.org/wordpress-4.7-pt_BR.zip.md5 | grep -i 'x-robots-tag'
x-robots-tag: noindex, follow
Plugins have a .json checksum file that I haven't added the header to, but those files are served with the proper content-type headers and aren't linked to, so I think they should be fine?
This should not affect .zip$ or .gz$ links, correct?
What about https://downloads.wordpress.org/* links? Same as the above?