Making WordPress.org

Opened 8 months ago

Last modified 7 months ago

#7773 new enhancement

Lowercase all tags

Reported by: dd32
Owned by:
Milestone:
Priority: normal
Component: Plugin Directory
Keywords: 2nd-opinion
Cc:

Description

It's been pointed out that some plugins have uppercase (ABC) or proper-cased (Name) tags.

However, the majority (82%) of tags are lowercase: abc, name.

Given these are tags, they should be consistently lower-case IMHO.
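As a rough illustration of what lowercasing would mean in practice, here is a minimal Python sketch. The function name `lowercase_tags` is hypothetical and this is not the Plugin Directory's actual code; it only shows that tags differing by case alone (ABC vs abc) would collapse into one.

```python
def lowercase_tags(tags):
    """Lowercase each tag and drop duplicates that differ only by case.

    Hypothetical helper for illustration; order of first appearance is kept.
    """
    seen = set()
    result = []
    for tag in tags:
        normalized = tag.strip().lower()
        # Skip empty tags and tags already seen in lowercase form.
        if normalized and normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


print(lowercase_tags(["ABC", "Name", "abc", "name"]))  # ['abc', 'name']
```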

Change History (6)

#1 follow-up: @dd32
8 months ago

Additionally, we should consider whether a tag like "register widgets for use on your site" is a valid tag. I don't think it is; I personally think we should limit tags to a maximum of 2 words.

Looking at existing tags, for a data-driven decision..

Word count   # of tags   Percent of tags (running count)   Largest install count plugin
1 word       265,398     69.34% (69.34%)                   5m+
2 words      87,393      22.83% (92.18%)                   5m+
3 words      23,460      6.13% (98.31%)                    5m+
4 words      4,685       1.22% (99.53%)                    800k+
5 words      1,180       0.31% (99.84%)                    100k+
6 words      381         0.1% (99.94%)                     40k+
7 words      115         0.03% (99.97%)                    30k+
8 words      43          0.01% (99.98%)                    40k+
9 words      22          0.01% (99.99%)                    300+
10+ words    49          0.01% (100%)                      70+

Based on the data, tags of 4+ words are the edge case in length: together they make up under 2% of all tags.

But the install count suggests that 5 words in a tag is reasonable, which seems weird.
Here's one of the affected tags: https://wordpress.org/plugins/tags/contact-form-7-invisible-recaptcha/ - IMHO that's not a "tag" that's two tags: contact-form-7 and invisible-recaptcha.

Examples of 4-word tags: https://wordpress.org/plugins/tags/contact-form-7-db/ and https://wordpress.org/plugins/tags/increase-file-size-limit/

I suspect the reason we're seeing lots of plugins with multiple-word tags is SEO of the index, as we include tags as a search keyword, and we limit plugins to 5 tags.
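For anyone wanting to reproduce a breakdown like the table above, a sketch in Python might look like the following. The actual query used is not shown in this ticket, so everything here is an assumption: the function names are hypothetical, words are assumed to be separated by whitespace or hyphens (which matches contact-form-7-invisible-recaptcha counting as 5 words), and the install-count column is left out.

```python
import re
from collections import Counter


def word_count(tag):
    # Assumption: words are separated by whitespace or hyphens.
    return len([w for w in re.split(r"[\s-]+", tag.strip()) if w])


def distribution(tags):
    """Return rows of (words, number of tags, percent, running percent)."""
    counts = Counter(word_count(t) for t in tags)
    total = sum(counts.values())
    rows = []
    running = 0
    for n in sorted(counts):
        running += counts[n]
        rows.append((
            n,
            counts[n],
            round(100 * counts[n] / total, 2),
            round(100 * running / total, 2),
        ))
    return rows


sample = ["seo", "widget", "contact form", "contact-form-7-db"]
for row in distribution(sample):
    print(row)
# (1, 2, 50.0, 50.0)
# (2, 1, 25.0, 75.0)
# (4, 1, 25.0, 100.0)
```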

Last edited 8 months ago by dd32

#2 @dd32
8 months ago

  • Keywords 2nd-opinion added

#3 @anonymized_14808221
8 months ago

I’ve never even considered a tag could be more than one word, but that probably stems from:

  • an HTML (or other markup) tag is a one-word thing
  • (hash) tags on social media won’t work with spaces
  • “a” tag for me is a single thing - even if I have to tag a post as “computer programming” I’ll add a dash just because it feels more “tag-like”
  • for some reason (probably again due to the first two points above) they are all lower case in my mind

So I can’t help but agree that tags should be single-word, lowercase search tags, not long-tail SEO honeypots.

#4 @knutsp
8 months ago

Tags are words of natural language, and should be allowed to be capitalized as such, as entered in the input.

I can see the need for a fairly strict word limit, even a length limit, and at most one or two uppercase letters per word.
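The kind of rule described here could be sketched in Python roughly as follows. The function name and all three thresholds are hypothetical: the 2-word limit is borrowed from the earlier suggestion in this ticket, while the 30-character length limit and the cap of 2 uppercase letters per word are placeholders only.

```python
def is_acceptable_tag(tag, max_words=2, max_length=30, max_upper_per_word=2):
    """Check a tag against a word limit, a length limit and an uppercase cap.

    All limits are illustrative assumptions, not agreed values.
    """
    words = tag.split()
    if not words or len(words) > max_words:
        return False
    if len(tag) > max_length:
        return False
    # Allow natural capitalization, but reject words that are mostly uppercase.
    return all(
        sum(ch.isupper() for ch in word) <= max_upper_per_word
        for word in words
    )


print(is_acceptable_tag("Site Health"))  # True
print(is_acceptable_tag("ABC"))          # False
```

Under these placeholder limits, "Name" and "Site Health" pass, while "ABC" and "register widgets for use on your site" are rejected.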

As a programmer I can sometimes view tags as similar to array keys, but only as long as I am in that mode.

#5 in reply to: ↑ 1 @dd32
7 months ago

Replying to dd32:

Looking at existing tags, for a data-driven decision..

I was asked for the data-source, and after talking through it, realised I queried data that might not be representative of the plugin directory today.

The data I queried covered all tags, including those from closed plugins, many of which would've had spammier tags.

Here's that data again, but this time only for currently published plugins, and only where the tag is used by more than 1 plugin (because we hide tags used by only a single plugin).

Word count   # of tags   Percent of tags (running count)   Largest install count plugin
1            157,142     74.5% (74.5%)                     10m+
2            44,511      21.1% (95.6%)                     10m+
3            8,236       3.9% (99.5%)                      6m+
4            875         0.41% (99.92%)                    600k+
5            138         0.07% (99.98%)                    100k+
6            34          0.02% (100%)                      30k+
7            4           0% (100%)                         700+
8            2           0% (100%)                         60+

It's not a huge difference, but it does confirm that once you pass 2 words you're into the long tail, and that tags of up to 3 words now cover 99.5% of tags (whereas before it took up to 4 words to reach 99.5%).

Version 0, edited 7 months ago by dd32

#6 @JavierCasares
7 months ago

+1 to lowercase everything (tags are used for search / filter, so there is no need to have them any other way).

And, about the use of more than 1 word: I use, for example, “site health”. I understand that maybe we can replace “ ” with “-” (so it would be “site-health”) to be consistent.
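A minimal Python sketch of that replacement, combined with lowercasing, could look like this. The function name `tag_to_slug` is hypothetical and this is not how WordPress itself generates slugs; it only illustrates turning “site health” into “site-health”.

```python
import re


def tag_to_slug(tag):
    """Lowercase a tag and replace each run of whitespace with one hyphen."""
    return re.sub(r"\s+", "-", tag.strip().lower())


print(tag_to_slug("Site Health"))  # site-health
```

Collapsing runs of whitespace means that "site  health" and "site health" end up as the same tag.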
