Making WordPress.org

Opened 15 months ago

Last modified 15 months ago

#4685 new defect

Disallow /changeset in https://core.trac.wordpress.org/robots.txt

Reported by: jonoaldersonwp
Owned by:
Milestone:
Priority: lowest
Component: Trac
Keywords: seo


Add a disallow rule for /changeset.

Change History (6)

#1 @nacin
15 months ago

Why would we block changesets from search? They're very valuable.

#2 @jonoaldersonwp
15 months ago

Because there are ~13,000 low-quality, undifferentiated, unoptimised pages indexed, sucking value out of the rest of the site.

If somebody's willing to invest in radical improvements to the Trac site to bring it up to standard, we could get away with having these indexed. In their current state, they're a drain.

Are you suggesting that people search/Google for specific queries, and expect/find these pages? Can you give me some example queries and scenarios?

#3 @dd32
15 months ago

I agree, /changeset URLs are useful and valuable. Let's not block them.

There's potentially a better request here, which is to noindex the file-specific URLs under it, such as /changeset/\d+/trunk/....., which are less useful in search results.

#4 @jonoaldersonwp
15 months ago

Ok, let's alter the approach to add rules for requests containing:

  • *old_path=
  • /trunk
  • /branches

#5 @dd32
15 months ago

So that'd be this then?

  • /changeset/*/old_path=
  • /changeset/*/trunk/
  • /changeset/*/branches/
  • /changeset/*/tag/
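
As a rough sketch, the four patterns above could be rendered into a robots.txt group like this. The `User-agent: *` line and the helper name `render_robots_group` are assumptions for illustration; the thread does not say which crawlers the rules would target or how the shared file is generated.

```python
# Patterns copied verbatim from the list above.
CHANGESET_PATTERNS = [
    "/changeset/*/old_path=",
    "/changeset/*/trunk/",
    "/changeset/*/branches/",
    "/changeset/*/tag/",
]


def render_robots_group(patterns, user_agent="*"):
    """Return a robots.txt group with one Disallow line per pattern.

    Hypothetical helper: the default user agent is an assumption,
    not something stated in the ticket.
    """
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {pattern}" for pattern in patterns]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_robots_group(CHANGESET_PATTERNS), end="")
```

Because the file is shared across instances, any such group would apply to every Trac and SVN site that serves it.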

Also just noting that currently the robots.txt here is shared with every Trac and SVN instance, and requires a systems request to alter.

#6 @jonoaldersonwp
15 months ago

I'm wondering if it'd be easier just to roll these rules out to all Trac/SVN sites.

The 'old path' rule should be /changeset/*old_path=*.
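
To illustrate why the wildcard placement matters, here is a minimal sketch of robots.txt wildcard matching as commonly documented for major crawlers: `*` matches any run of characters, a trailing `$` anchors the end, and anything else is a prefix match. The sample URLs are hypothetical; they assume `old_path` arrives as a query parameter, so it follows `?` or `&` rather than `/`.

```python
import re


def robots_pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regular expression.

    '*' matches any sequence of characters; a trailing '$' anchors the end.
    Without '$', the pattern behaves as a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, pattern):
    """True if the pattern matches from the start of the path (plus query)."""
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    diff_url = "/changeset/12345?old_path=%2Ftrunk&old=12344"
    plain_url = "/changeset/12345"

    for pattern in ("/changeset/*/old_path=", "/changeset/*old_path=*"):
        print(pattern, is_blocked(diff_url, pattern), is_blocked(plain_url, pattern))
```

Under these assumptions, the first pattern never matches the diff URL because it requires a literal `/old_path=`, while the corrected pattern matches it and still leaves the plain changeset page crawlable.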
