#5938 closed defect (bug) (fixed)
Add an x-robots-tag header to md5/sha1 etc file URLs
Reported by:
Owned by:
Milestone:
Priority: lowest
Component: WordPress.org Site
Keywords: seo
Cc:
Description
We expose lots of URLs like this across wordpress.org:
https://wordpress.org/wordpress-5.6.3.zip.sha1
https://wordpress.org/wordpress-4.0.5.tar.gz.md5
https://br.wordpress.org/wordpress-4.7-pt_BR.zip.md5
These consume considerable crawl resources and often trip the 'soft 404' warning in Google Search Console.
We should manage this by adding an x-robots-tag header to all responses ending in .sha1, .md5 and similar, with a value of noindex, follow.
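The rule proposed above can be sketched as a small shell function. This is an illustration only, not the change that was deployed on WordPress.org (the ticket does not show it). The function name robots_tag_for_path is hypothetical, and only the two suffixes named in the ticket are covered; the "and similar" cases are left out because the ticket does not list them.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the ticket): print the X-Robots-Tag value
# that a request path should receive, or print nothing if the path should
# be served without the header.
robots_tag_for_path() {
  # Drop any query string so the suffix match only sees the path.
  local path="${1%%\?*}"

  case "$path" in
    *.sha1|*.md5)
      # Checksum files: keep them out of the index, still follow links.
      printf '%s\n' 'noindex, follow'
      ;;
    *)
      # Everything else, including .zip and .tar.gz downloads: no header.
      ;;
  esac
}
```

For example, robots_tag_for_path /wordpress-5.6.3.zip.sha1 prints noindex, follow, while robots_tag_for_path /wordpress-5.6.3.zip prints nothing.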
Change History (4)
#2
@
3 years ago
Ah, good question!
We can indeed safely ignore zip and gz links.
Yes please for downloads.
#3
@
3 years ago
- Component changed from General to WordPress.org Site
- Resolution set to fixed
- Status changed from new to closed
Added.
$ curl -Is https://wordpress.org/wordpress-5.6.3.zip.sha1 | grep -i 'x-robots-tag'
x-robots-tag: noindex, follow
$ curl -Is https://wordpress.org/wordpress-5.6.3.zip | grep -i 'x-robots-tag'
// No output
$ curl -Is https://downloads.wordpress.org/release/en_AU/latest.zip.sha1 | grep -i 'x-robots-tag'
x-robots-tag: noindex, follow
$ curl -Is https://br.wordpress.org/wordpress-4.7-pt_BR.zip.md5 | grep -i 'x-robots-tag'
x-robots-tag: noindex, follow
Plugins have a .json checksum file that I haven't added the header to, but those files are served with the proper content-type headers and aren't linked to, so I think they should be fine?
This should not affect .zip$ or .gz$ links, correct?
What about https://downloads.wordpress.org/* links? Same as the above?