Making WordPress.org

Opened 2 years ago

Last modified 22 months ago

#6768 new defect (bug)

Investigate broken 'jobs page' URLs

Reported by: jonoaldersonwp Owned by:
Milestone: Priority: low
Component: WordPress.org Site Keywords: seo
Cc:

Description

I'm seeing a ton of errors in Google Search Console where URLs like https://fr-ca.wordpress.org/?p=job/oV6V8fwO/apply&nl=1 and https://eu.wordpress.org/?p=job%2Fo67D8fwI&nl=1 are returning a 400 status (and, as you'll see, an obviously broken template).

Do we know what these are / where/why they're being exposed?

The best way to handle this may depend on whether these are (or were) legitimate pages/requests, or whether this is just some back-end weirdness being accidentally exposed.

Insight appreciated!

Change History (7)

#1 @dd32
2 years ago

These are not legitimate URLs.

A request for ?p=... where p is not a numeric value (the only kind of value acceptable in a WordPress context) returns 400 Bad Request.

Nothing on WordPress.org, or any WordPress.org-related domain should be linking to those.
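For illustration only, the check described above can be sketched roughly as follows. This is not the code running on WordPress.org; the function name `is_rejected_with_400` is hypothetical, and treating "numeric" as "ASCII digits only" is an assumption.

```python
from urllib.parse import parse_qs, urlsplit


def is_rejected_with_400(url: str) -> bool:
    """Return True when the URL carries a `p` parameter that is not numeric.

    Hypothetical helper: it only mirrors the rule stated in this comment
    (a non-numeric `p` gets 400 Bad Request), not the real server code.
    """
    # keep_blank_values=True so that an empty "?p=" is still seen as present.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    values = query.get("p")
    if values is None:
        return False  # no `p` parameter, so this rule does not apply
    # Assumption: "numeric" means one or more ASCII digits and nothing else.
    return not all(v.isascii() and v.isdigit() for v in values)


# The two URLs from the ticket description both fail the numeric check.
print(is_rejected_with_400("https://fr-ca.wordpress.org/?p=job/oV6V8fwO/apply&nl=1"))
print(is_rejected_with_400("https://eu.wordpress.org/?p=job%2Fo67D8fwI&nl=1"))
```

Note that the percent-encoded `%2F` in the second URL decodes to `/`, so both reported URL shapes hit the same rule.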

#2 @jonoaldersonwp
2 years ago

Super, in which case, can we just make these 404 (template and HTTP header)?

#3 @dd32
2 years ago

What difference does 400 vs 404 make, in real terms, for URLs that should never be linked to and are not expected to work?

Unfortunately the 404 template (and the 400 template as you saw) is pretty broken, as it doesn't load the WordPress themes to allow for it.

If someone would like to look into it, public_html/403.php & public_html/404.php are static files that probably should be updated with the new templates.

#4 @jonoaldersonwp
2 years ago

400s get reported as site errors in GSC, and thus gum up our reporting. It's also bad practice for errors to 'exist' indefinitely in a non-fixed state.

Those URLs are getting linked to from 'somewhere', and once they're known to exist, they'll get polled (and explored) forever. We can't ever assume or rely on a URL being 'hidden' as a solution.

#5 @dd32
2 years ago

  • Component changed from General to WordPress.org Site
  • Priority changed from low to lowest

> ton of errors in Google Search Console

Worth noting that these account for ~400 of the ~600 "Other 4xx errors" recorded in Google Search Console.

That is a tiny amount compared to the number of pages indexed (~1.1 million) or excluded (~3.5 million by norobots, 250k by redirects, etc.).

I'll look into changing that status, probably replacing it with a redirect.

#6 @jonoaldersonwp
2 years ago

A redirect to where? If we don't have a suitable destination, we shouldn't bulk redirect invalid URLs to the homepage; they should just return a 404 status+template.
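As a rough sketch of that rule (redirect only when a suitable destination is known, otherwise 404), assuming a hypothetical lookup table of destinations. The function name `response_for_invalid_p`, the `destinations` mapping, and the choice of 301 for the redirect are all assumptions made for illustration, not anything that exists on WordPress.org.

```python
from typing import Mapping, Optional, Tuple
from urllib.parse import parse_qs, urlsplit


def response_for_invalid_p(
    url: str, destinations: Mapping[str, str]
) -> Tuple[int, Optional[str]]:
    """Return (status, redirect_target) for a URL with a non-numeric `p`.

    `destinations` is a hypothetical map from a `p` value to a page that
    genuinely replaces it. There is no fallback to the homepage.
    """
    values = parse_qs(urlsplit(url).query).get("p", [])
    p_value = values[0] if values else ""
    target = destinations.get(p_value)
    if target is not None:
        return 301, target  # a real replacement exists, so redirect to it
    return 404, None  # nothing suitable: plain 404 status (and template)


# With no known destinations, the URL from the description gets a 404.
print(response_for_invalid_p("https://eu.wordpress.org/?p=job%2Fo67D8fwI&nl=1", {}))
```

The point of the design is that an empty lookup table degrades to "always 404", which avoids bulk redirects to the homepage.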

#7 @jonoaldersonwp
22 months ago

  • Priority changed from lowest to low