Making WordPress.org

Opened 17 months ago

Closed 17 months ago

Last modified 17 months ago

#6413 closed defect (bug) (fixed)

Robots.txt blocking Search Engines from getting plugin description

Reported by: digamberpradhan
Owned by: dd32
Milestone:
Priority: high
Component: Plugin Directory
Keywords: needs-patch
Cc:

Description (last modified by dd32)

At the time of opening this ticket (July 19th, 2022), our plugin Search with Typesense (http://wordpress.org/plugins/search-with-typesense/) is blocked from search engines.
If you search Google for "Search with Typesense", the result shows "No information is available for this page" in place of the description. This is because the wordpress.org robots.txt blocks every URL whose path begins with plugins/search.

https://wordpress.org/robots.txt

Other pages/plugins suffering the same fate:
https://wordpress.org/plugins/search/search/
https://wordpress.org/plugins/search-exclude/
https://wordpress.org/plugins/search-in-place/
To name a few

Attachments (1)

Screen Shot 2022-07-19 at 21.22.11.png (67.4 KB) - added by digamberpradhan 17 months ago.
Screenshot of Google Search Listing showing No information available

Change History (5)

@digamberpradhan
17 months ago

Screenshot of Google Search Listing showing No information available

This ticket was mentioned in Slack in #meta by digamber.
17 months ago

#2 @dd32
17 months ago

  • Description modified (diff)
  • Owner set to dd32
  • Status changed from new to accepted

#3 @dd32
17 months ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 11979:

WordPress.org Robots.txt: Correctly limit /plugins/search/* crawl limiting.

In [11722] I added blocking for the plugins/search endpoint, however, I did so by not including a trailing slash.
The result of this is that the rule has blocked access to plugins whose slugs begin with search (i.e. /plugins/search-*/).

By adding a trailing slash here, it correctly applies the Disallow rule to /plugins/search/* instead.

Props digamberpradhan for noticing/reporting the issue.
See #5323.
Fixes #6413.
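
To illustrate why the missing trailing slash mattered, here is a minimal sketch of robots.txt prefix matching. It assumes a Disallow rule matches any URL path that starts with the rule text, and it ignores wildcards and Allow lines. The rule strings and the `is_disallowed` helper are hypothetical; the ticket does not quote the actual robots.txt lines.

```python
def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow rule is a prefix of the URL path."""
    return any(path.startswith(rule) for rule in disallow_rules)


# Assumed rule text (not quoted in the ticket), before and after the fix.
before_fix = ["/plugins/search"]   # no trailing slash
after_fix = ["/plugins/search/"]   # trailing slash added

paths = [
    "/plugins/search/search/",          # a search results page
    "/plugins/search-with-typesense/",  # a plugin whose slug begins with "search"
    "/plugins/search-exclude/",         # another affected plugin
]

# Print, for each path, whether it is blocked before and after the fix.
for path in paths:
    print(path, is_disallowed(path, before_fix), is_disallowed(path, after_fix))
```

Under these assumptions, the rule without the trailing slash is a prefix of every path listed, while the rule with the trailing slash is a prefix only of the search results path.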

#4 @dd32
17 months ago

Noting that I've refreshed Google's cache of robots.txt and requested re-indexing for any matching plugins updated in the last 3 months. It's a manual process, so the other ~150 older plugins will just have to wait for the regularly scheduled Google crawl (edit: although the older plugins have noindex in their robots tag anyway).
