#5184 closed defect (bug) (reported-upstream)
Homepage requests with a 'page' parameter should return a 404
Reported by: | jonoaldersonwp | Owned by: | |
---|---|---|---|
Milestone: | | Priority: | lowest |
Component: | General | Keywords: | seo |
Cc: |
Description (last modified by )
Requests like https://wordpress.org/page/3/ should return a 404 template and HTTP header.
Requests to paginated states of /download/, like https://wordpress.org/download/6/, should return a 404 template and HTTP header.
Requests to paginated states of pages in (and including) the 'about' section, such as https://en-gb.wordpress.org/about/features/5/, https://en-gb.wordpress.org/about/5/ and https://wordpress.org/about/license/8/, should return a 404 template and HTTP header.
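The three cases above can be sketched as a single path check. This is a minimal illustration in Python, not the code that runs on wordpress.org. The function name `is_invalid_paginated_request` and the `NON_PAGINATED_PREFIXES` list are assumptions made for the example, built only from the URLs listed in this description.

```python
import re
from urllib.parse import urlsplit

# Hypothetical list of Page sections with no paginated states,
# taken from the examples in the ticket description.
NON_PAGINATED_PREFIXES = ("/download/", "/about/")

# /page/<n>/ directly under the site root: a paginated state of the homepage.
HOMEPAGE_PAGED = re.compile(r"^/page/\d+/?$")
# Any path ending in a numeric segment, e.g. /about/features/5/.
TRAILING_NUMBER = re.compile(r"^(?P<base>/.+/)\d+/?$")


def is_invalid_paginated_request(url: str) -> bool:
    """Return True if the URL is a paginated state that should get a 404."""
    path = urlsplit(url).path  # the query string is ignored
    if HOMEPAGE_PAGED.match(path):
        return True
    match = TRAILING_NUMBER.match(path)
    return bool(match) and match.group("base").startswith(NON_PAGINATED_PREFIXES)
```

Under these assumptions, `https://wordpress.org/page/3/` and `https://wordpress.org/download/6/` are flagged, while `https://wordpress.org/about/` is not.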
Change History (16)
#2 @ 4 years ago
- Description modified (diff)
Closing the others as duplicates of this, as they're all paginated states of Pages, which is the same thing at its core.
#5 follow-up: ↓ 6 @ 4 years ago
The last two examples should be fixed by [WP47727].
#6 in reply to: ↑ 5 @ 4 years ago
Replying to ocean90:
The last two examples should be fixed by [WP47727].
Ah, so they are. Thanks @SergeyBiryukov!
#7 in reply to: ↑ 1; follow-up: ↓ 8 @ 4 years ago
Replying to dd32:
Would it returning a canonical tag of https://wordpress.org/ suffice here? (Currently it returns <link rel="canonical" href="https://wordpress.org/3/" />)
Just noting that core should really be returning a canonical of either https://wordpress.org/page/3/ or https://wordpress.org/ here - https://wordpress.org/3/ is just plain wrong. This specific canonical issue only happens on the homepage, and there is an open core ticket for this specific issue: https://core.trac.wordpress.org/ticket/49220
For wordpress.org specifically, the canonical should be equal to https://wordpress.org/
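To illustrate the behaviour suggested above, here is a minimal Python sketch that maps a paginated homepage request to the site root as its canonical URL. It is not core's implementation and not the patch on the linked core ticket. The function name `homepage_canonical` is hypothetical.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Paginated states of the homepage: /page/<n>/ directly under the root.
_HOMEPAGE_PAGED = re.compile(r"^/page/\d+/?$")


def homepage_canonical(url: str) -> Optional[str]:
    """Return the canonical URL for a homepage or paginated-homepage request.

    Returns None for any other URL; this sketch only covers the homepage case.
    """
    parts = urlsplit(url)
    if parts.path in ("", "/") or _HOMEPAGE_PAGED.match(parts.path):
        # Drop the pagination segment and any query string.
        return f"{parts.scheme}://{parts.netloc}/"
    return None
```

With this sketch, `homepage_canonical("https://wordpress.org/page/3/")` gives `https://wordpress.org/`, never `https://wordpress.org/3/`.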
#8 in reply to: ↑ 7 @ 4 years ago
Replying to bradleyt:
Replying to dd32:
Would it returning a canonical tag of https://wordpress.org/ suffice here? (Currently it returns <link rel="canonical" href="https://wordpress.org/3/" />)
...
For wordpress.org specifically, the canonical should be equal to https://wordpress.org/
Would returning that canonical tag fulfil the needs of this ticket? Specifically, can we avoid having to return a 301 or 404 here and just use the canonical tag instead?
#9 follow-up: ↓ 10 @ 4 years ago
A canonical tag would definitely help, but we'd still be in a position where we have infinite crawl traps and pages which shouldn't exist. That'd continue to impact crawl budget, discovery, etc., across the site(s).
#10 in reply to: ↑ 9 @ 4 years ago
Replying to jonoaldersonwp:
we'd still be in a position where we have infinite crawl traps and pages which shouldn't exist. That'd continue to impact crawl budget, discovery, etc., across the site(s).
As paginated states of the front page aren't ever actually linked, I'm not sure that's realistically an issue here. Third-party websites may link to one or two such pages, but on the whole it shouldn't be massive traffic.
#11 @ 4 years ago
The problem isn't traffic volume, it's that they're queryable and public. That means they'll still represent a point of leakage. That aside, they shouldn't exist / be exposed, regardless.
#12 @ 4 years ago
- Resolution set to fixed
- Status changed from new to closed
Returns a canonical tag now.
I'm not inclined to add a redirect here right now.
All other URLs mentioned redirect thanks to [WP47727].
#13 @ 4 years ago
- Priority changed from normal to lowest
- Resolution fixed deleted
- Status changed from closed to reopened
This is a huge improvement, but we still need to improve the handling of invalid requests to optimize crawl budget.
As per the brief, URLs like https://wordpress.org/page/3/ (and https://ja.wordpress.org/page/13/?field_department_tid=All&order=title&sort=asc&qt-product_tos_download_new=1&15fca476=707a3199&version=subscription&Tablet&gclid=EAIaIQobChMI_KPyi8DG3QIViOFRCh39PgwtEAAYASABEgK7OPD_BwE&type=navmenu&mkt_tok=eyJpIjoiTlRnMVpEY3lZMlV4WlRneSIsInQiOiJzUHQwMjlPVTFFU1diWHA0dW1jVVR1SG5TREpUSElWR0lzMDRoQmJvdFVHSjlmQWg4SUprXC9Cd3E4MzhGdDdMMlwvQ2F3K3J6QmUrMDBFQ3R0UzlLSm9FQ2pzZ1wvMk16VnVpdkhNTmthSE1vS2JDVEhocU55M2RUT0tQZUNQbDZwQSJ9, from https://meta.trac.wordpress.org/ticket/5169) need to return a 404 or 301.
Prefer a 404, as these URLs might feasibly be valid in the future.
#15 @ 4 years ago
- Resolution set to reported-upstream
- Status changed from reopened to closed
Opened https://core.trac.wordpress.org/ticket/50163 with a possible patch.
Marking this as reported-upstream, as it can be handled upstream.