Making WordPress.org

Opened 7 months ago

Last modified 5 months ago

#6768 new defect (bug)

Investigate broken 'jobs page' URLs

Reported by: jonoaldersonwp
Owned by:
Milestone:
Priority: low
Component: WordPress.org Site
Keywords: seo
Cc:

Description

I'm seeing a ton of errors in Google Search Console where URLs like https://fr-ca.wordpress.org/?p=job/oV6V8fwO/apply&nl=1 and https://eu.wordpress.org/?p=job%2Fo67D8fwI&nl=1 are returning a 400 status (and, as you'll see, an obviously broken template).

Do we know what these are, and where or why they're being exposed?

The best way to handle this may depend on whether these are (or were) legitimate pages/requests, or whether this is just some back-end weirdness being accidentally exposed.

Insight appreciated!

Change History (7)

#1 @dd32
7 months ago

These are not legitimate URLs.

A request for ?p=... where p is not a numeric value (the only kind of value acceptable in a WordPress context) returns 400 Bad Request.
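To illustrate the rule described here, a minimal sketch in Python (not the actual WordPress.org code, which is not shown in this ticket). The function name `status_for_request` is hypothetical, and the check assumes "numeric" means a string of ASCII digits:

```python
from urllib.parse import urlsplit, parse_qs


def status_for_request(url: str) -> int:
    """Return 400 when ?p= is present but not numeric, else 200.

    Hypothetical helper: it only models the rule described in this
    comment, assuming "numeric" means one or more ASCII digits.
    """
    query = parse_qs(urlsplit(url).query)
    for value in query.get("p", []):
        # parse_qs has already percent-decoded the value (%2F becomes "/").
        if not (value.isascii() and value.isdigit()):
            return 400
    return 200


# The two URLs from the ticket description both fail the numeric check.
print(status_for_request("https://fr-ca.wordpress.org/?p=job/oV6V8fwO/apply&nl=1"))
print(status_for_request("https://eu.wordpress.org/?p=job%2Fo67D8fwI&nl=1"))
```

Both reported URLs carry a `p` value that starts with `job/`, so under this assumption they would be rejected regardless of whether the slash is percent-encoded.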

Nothing on WordPress.org, or any WordPress.org-related domain should be linking to those.

#2 @jonoaldersonwp
7 months ago

Super, in which case, can we just make these 404 (template and HTTP header)?

#3 @dd32
7 months ago

What difference does 400 vs 404 make in real terms, for URLs that should never be linked to and are not expected to work?

Unfortunately the 404 template (and the 400 template as you saw) is pretty broken, as it doesn't load the WordPress themes to allow for it.

If someone would like to look into it, public_html/403.php & public_html/404.php are static files that probably should be updated with the new templates.

#4 @jonoaldersonwp
7 months ago

400s get reported as site errors in GSC, and thus gum up our reporting. It's also bad practice for errors to 'exist' indefinitely in a non-fixed state.

Those URLs are getting linked to from 'somewhere', and once they're known to exist, they'll get polled (and explored) forever. We can't ever assume or rely on a URL being 'hidden' as a solution.

#5 @dd32
7 months ago

  • Component changed from General to WordPress.org Site
  • Priority changed from low to lowest

> ton of errors in Google Search Console

Worth noting that these account for ~400 of the ~600 "Other 4xx errors" recorded in Google Search Console.

A tiny amount compared to the number of pages indexed (~1.1 million) or excluded (~3.5 million by norobots, 250k by redirects, etc.).
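For scale, the proportions implied by those approximate figures can be worked out as below. This is only arithmetic on the rounded numbers quoted in this comment; the variable names are illustrative:

```python
# Approximate figures quoted in this comment.
job_url_errors = 400
other_4xx_errors = 600
indexed_pages = 1_100_000

# Share of the "Other 4xx errors" bucket, and size relative to indexed pages.
share_of_4xx = job_url_errors / other_4xx_errors * 100
share_of_indexed = job_url_errors / indexed_pages * 100

print(f"{share_of_4xx:.1f}% of 'Other 4xx errors'")
print(f"{share_of_indexed:.3f}% relative to indexed pages")
```

So the job URLs make up roughly two thirds of that error bucket, while amounting to well under a tenth of a percent of the indexed page count.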

I'll take a look into changing that status - probably replacing it with a redirect.

#6 @jonoaldersonwp
7 months ago

A redirect to where? If we don't have a suitable destination, we shouldn't bulk redirect invalid URLs to the homepage; they should just return a 404 status+template.
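A rough sketch of the behaviour argued for here, in Python rather than the site's own code: redirect only when a genuinely equivalent destination exists, otherwise answer with a plain 404. The function `response_for_invalid_url` and its `redirect_target` parameter are hypothetical names used for illustration:

```python
from http import HTTPStatus


def response_for_invalid_url(redirect_target=None):
    """Return (status_line, headers) for a URL already known to be invalid.

    redirect_target is hypothetical: pass a URL only when a suitable
    destination exists. With no target, the result is a plain 404
    rather than a bulk redirect to the homepage.
    """
    if redirect_target:
        status = HTTPStatus.MOVED_PERMANENTLY
        headers = {"Location": redirect_target}
    else:
        status = HTTPStatus.NOT_FOUND
        headers = {}
    return f"{status.value} {status.phrase}", headers


# No suitable destination for the broken job URLs, so no Location header.
print(response_for_invalid_url())
```

The 404 template itself would still need to be served alongside this status, which is outside what this sketch covers.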

#7 @jonoaldersonwp
5 months ago

  • Priority changed from lowest to low