Making WordPress.org

Opened 4 months ago

Last modified 4 months ago

#7773 new enhancement

Lowercase all tags

Reported by: dd32 Owned by:
Milestone: Priority: normal
Component: Plugin Directory Keywords: 2nd-opinion
Cc:

Description

It's been pointed out that some plugins have uppercase tags (ABC) or proper-cased tags (Name).

However, the majority (82%) of tags are lowercase: abc, name.

Given these are tags, they should be consistently lowercase, IMHO.
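As a rough illustration of the proposal, here is a minimal Python sketch that lowercases a list of tags, merges case variants, and reports what share was already lowercase. The function names are hypothetical and this is not the Plugin Directory's actual code; it only assumes tags arrive as plain strings.

```python
def lowercase_tags(tags):
    """Lowercase every tag and merge case variants, keeping first-seen order."""
    seen = set()
    result = []
    for tag in tags:
        lowered = tag.strip().lower()
        if lowered and lowered not in seen:
            seen.add(lowered)
            result.append(lowered)
    return result


def lowercase_share(tags):
    """Return the fraction of tags that are already entirely lowercase."""
    if not tags:
        return 0.0
    return sum(1 for tag in tags if tag == tag.lower()) / len(tags)


if __name__ == "__main__":
    sample = ["ABC", "Name", "abc", "name", "widgets"]
    print(lowercase_tags(sample))   # ['abc', 'name', 'widgets']
    print(lowercase_share(sample))  # 0.6
```

Note that merging variants means "ABC" and "abc" collapse into one tag, which is the intended effect of the change but would also alter per-tag plugin counts.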

Change History (6)

#1 follow-up: ↓ 5 @dd32
4 months ago

Additionally, we should consider whether a tag such as "register widgets for use on your site" is a valid tag. I don't think it is; I personally think we should limit tags to a maximum of 2 words.

Looking at existing tags, for a data-driven decision..

Word count  # of tags  Percent of tags (running count)  Largest install count plugin
1 word      265,398    69.34% (69.34%)                  5m+
2 words     87,393     22.83% (92.18%)                  5m+
3 words     23,460     6.13% (98.31%)                   5m+
4 words     4,685      1.22% (99.53%)                   800k+
5 words     1,180      0.31% (99.84%)                   100k+
6 words     381        0.1% (99.94%)                    40k+
7 words     115        0.03% (99.97%)                   30k+
8 words     43         0.01% (99.98%)                   40k+
9 words     22         0.01% (99.99%)                   300+
10+ words   49         0.01% (100%)                     70+

Based on the data, any tag of 4+ words is an edge case in length.

But the install count suggests that 5 words in a tag is reasonable, which seems weird.
Here's one of the affected tags: https://wordpress.org/plugins/tags/contact-form-7-invisible-recaptcha/ - IMHO that's not a "tag"; that's two tags: contact-form-7 and invisible-recaptcha.

Examples of 4-word tags: https://wordpress.org/plugins/tags/contact-form-7-db/ https://wordpress.org/plugins/tags/increase-file-size-limit/
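To make the word counting concrete, here is a small Python sketch that counts the words in a tag and checks it against the proposed 2-word maximum. It assumes words are separated by hyphens or whitespace, which matches how the example slugs above are counted, but the real query may have counted differently. `word_count`, `within_limit`, and `MAX_WORDS` are hypothetical names.

```python
import re

MAX_WORDS = 2  # the limit proposed in this comment; not a decided value


def word_count(tag):
    """Count words in a tag, treating hyphens and whitespace as separators (assumption)."""
    return len([word for word in re.split(r"[\s-]+", tag.strip()) if word])


def within_limit(tag, max_words=MAX_WORDS):
    """Return True if the tag has at least one word and no more than max_words."""
    return 1 <= word_count(tag) <= max_words


if __name__ == "__main__":
    print(word_count("contact-form-7-invisible-recaptcha"))  # 5
    print(word_count("contact-form-7-db"))                   # 4
    print(within_limit("increase-file-size-limit"))          # False
    print(within_limit("site health"))                       # True
```

One consequence worth noting: under this counting, "contact-form-7" alone is already 3 words, so a 2-word limit would also reject the shorter tag suggested above.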

I suspect the reason we're seeing lots of plugins with multiple-word tags is SEO of the index, as we include tags as a search keyword, and we limit plugins to 5 tags.

Last edited 4 months ago by dd32

#2 @dd32
4 months ago

  • Keywords 2nd-opinion added

#3 @anonymized_14808221
4 months ago

I’ve never even considered a tag could be more than one word, but that probably stems from:

  • an HTML (or other markup) tag is a one-word thing
  • (hash) tags on social media won’t work with spaces
  • “a” tag for me is a single thing - even if I have to tag a post as “computer programming” I’ll add a dash just because it feels more “tag-like”
  • for some reason (probably again due to the first two points above) they are all lowercase in my mind

So I can’t help but agree that tags should be single-word, lowercase search tags, not long-tail SEO honeypots.

#4 @knutsp
4 months ago

Tags are words of natural language, and should be allowed to be capitalized as such, as in the input.

I can see the need for a quite strict word limit, even a length limit, and only one or two uppercase letters per word.
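The alternative rule described here could be sketched in Python as follows. Only the "at most two uppercase letters per word" figure comes from the comment; the word and length limits (`MAX_WORDS`, `MAX_LENGTH`) are made-up placeholder values, and `is_acceptable` is a hypothetical name.

```python
import re

MAX_WORDS = 3           # hypothetical value; the comment gives no number
MAX_LENGTH = 30         # hypothetical value; the comment gives no number
MAX_UPPER_PER_WORD = 2  # "only one or two uppercase letter(s) per word"


def is_acceptable(tag):
    """Check a tag against a word limit, a length limit, and a per-word uppercase cap."""
    tag = tag.strip()
    words = [word for word in re.split(r"[\s-]+", tag) if word]
    if not words or len(words) > MAX_WORDS:
        return False
    if len(tag) > MAX_LENGTH:
        return False
    return all(
        sum(1 for char in word if char.isupper()) <= MAX_UPPER_PER_WORD
        for word in words
    )


if __name__ == "__main__":
    print(is_acceptable("Name"))         # True
    print(is_acceptable("WooCommerce"))  # True
    print(is_acceptable("ABC"))          # False
```

As the last line shows, a strict two-uppercase cap would reject three-letter acronyms such as the "ABC" example from the description, so acronyms would need separate handling under this approach.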

As a programmer I can sometimes view tags as similar to array keys, but only as long as I am in that mode.

#5 in reply to: ↑ 1 @dd32
4 months ago

Replying to dd32:

Looking at existing tags, for a data-driven decision..

I was asked for the data source, and after talking through it, realised I had queried data that might not be representative of the plugin directory today.

The data I queried covered all tags, including those of closed plugins, many of which would've had spammier tags.
(edit: and tags that might not be in use today)

Here's that data again, but this time only for currently published plugins, where the tag is used by more than 1 plugin (because we hide tags used by only a single plugin).

Word count  # of tags  Percent of tags (running count)  Largest install count plugin
1           157,142    74.5% (74.5%)                    10m+
2           44,511     21.1% (95.6%)                    10m+
3           8,236      3.9% (99.5%)                     6m+
4           875        0.41% (99.92%)                   600k+
5           138        0.07% (99.98%)                   100k+
6           34         0.02% (100%)                     30k+
7           4          0% (100%)                        700+
8           2          0% (100%)                        60+

It's not a huge difference, but it does confirm that once you pass 2 words you're into the long tail, and that tags of up to 3 words now account for 99.5% of tags (whereas before it took up to 4 words to reach 99.5%).
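For anyone wanting to check the percentages, the per-row and running figures can be recomputed from the tag counts in the second table. This is a minimal Python sketch, not the query actually used; `running_percentages` is a hypothetical helper, and only the counts are taken from the table.

```python
def running_percentages(counts):
    """Return (word_count, tags, percent, running_percent) rows, sorted by word count."""
    total = sum(counts.values())
    rows = []
    running = 0
    for words in sorted(counts):
        running += counts[words]
        rows.append(
            (words, counts[words], 100 * counts[words] / total, 100 * running / total)
        )
    return rows


# Tag counts per word count, copied from the table for published plugins.
PUBLISHED_COUNTS = {1: 157142, 2: 44511, 3: 8236, 4: 875, 5: 138, 6: 34, 7: 4, 8: 2}

if __name__ == "__main__":
    for words, tags, percent, running in running_percentages(PUBLISHED_COUNTS):
        print(f"{words}  {tags:>8,}  {percent:6.2f}% ({running:6.2f}%)")
```

The running column is what supports the long-tail reading: it shows how quickly coverage saturates as the word limit increases.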

Last edited 4 months ago by dd32

#6 @JavierCasares
4 months ago

+1 to lowercase everything (tags are used for search / filter, so there is no need to have them in any other form).

And, about the use of more than 1 word, I use for example “site health”. I understand that maybe we can replace “ ” with “-” (so it would be “site-health”) to be consistent.
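The space-to-hyphen idea could look like the following in Python. This is a sketch only: `normalize_tag` is a hypothetical name, and it assumes that runs of whitespace should collapse into a single hyphen, which the comment does not specify.

```python
import re


def normalize_tag(tag):
    """Lowercase a tag and replace each run of whitespace with a single hyphen."""
    return re.sub(r"\s+", "-", tag.strip().lower())


if __name__ == "__main__":
    print(normalize_tag("site health"))    # site-health
    print(normalize_tag("Site  Health "))  # site-health
```

With this normalization, “site health” and “site-health” become the same tag, so plugins using either spelling would be grouped together.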
