Making WordPress.org

Opened 6 weeks ago

Closed 6 weeks ago

#5228 closed enhancement (maybelater)

Consistency tool: create wildcard search option

Reported by: felipeloureirosantos
Owned by:
Milestone:
Priority: low
Component: Translate Site & Plugins
Keywords: dev-feedback has-patch
Cc:

Description

It would be nice if we could search for just part of a string (such as a single word) in the Consistency tool.

Having a "Wildcard" checkbox would be ideal.

I see that @casiepa reported the same thing as a bug (it would actually be a new feature): https://meta.trac.wordpress.org/ticket/5218

Initial mention: https://meta.trac.wordpress.org/ticket/1682#comment:9

Change History (9)

#1 @felipeloureirosantos
6 weeks ago

  • Component changed from General to Translate Site & Plugins

This ticket was mentioned in PR #12 on WordPress/wordpress.org by dd32.


6 weeks ago

  • Keywords has-patch added; needs-patch removed

Add a Wildcard search to the Translate Consistency tool.

TODO:

  • Decide if it's worth it.
  • Limit it to PTE/GTE only, to protect against DoS attacks.

https://meta.trac.wordpress.org/ticket/5228

#3 follow-up: ↓ 5 @dd32
6 weeks ago

  • Keywords needs-patch added; has-patch removed

Adding a wildcard here isn't out of the question, but it's not super performant.

Due to the number of strings in the DB, adding a suffix wildcard (e.g. Drop-in%) is about 10x slower, and while it works, it's not going to scale.
We'd have to limit it to logged in users only, and probably best even only to those who have some form of higher permissions than a translation contributor (which shouldn't be a problem).

(partial PR incoming)
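
A minimal sketch of the gating described above, not the code from the PR: wildcard searches are allowed only for logged-in users with an elevated role, and everyone else falls back to the exact-match search. The role names, the `User` type, and the `singular` column name are assumptions made for illustration. The helper also escapes `%` and `_` in the search term so that only the trailing `%` acts as a wildcard.

```python
from dataclasses import dataclass

# Hypothetical role names; the real WordPress.org permission model is not shown in this ticket.
ELEVATED_ROLES = {"pte", "gte"}


@dataclass
class User:
    logged_in: bool
    roles: frozenset = frozenset()


def can_use_wildcard(user: User) -> bool:
    """Allow wildcard searches only for logged-in users holding an elevated role."""
    return user.logged_in and bool(ELEVATED_ROLES & user.roles)


def prefix_like_pattern(term: str, escape: str = "\\") -> str:
    """Escape LIKE metacharacters in the term, then append a single trailing %."""
    escaped = (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )
    return escaped + "%"


def build_search(user: User, term: str, wildcard: bool):
    """Return (sql_fragment, parameter) for the original-string filter.

    The column name `singular` is an assumption for this sketch.
    """
    if wildcard and can_use_wildcard(user):
        return "singular LIKE %s", prefix_like_pattern(term)
    # Anyone without permission silently gets the exact-match search.
    return "singular = %s", term
```

With this sketch, a GTE searching for `Drop-in` with the wildcard option gets the pattern `Drop-in%`, while a regular contributor making the same request gets the plain equality filter.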

#4 @dd32
6 weeks ago

  • Keywords has-patch added; needs-patch removed

#5 in reply to: ↑ 3 @felipeloureirosantos
6 weeks ago

Replying to dd32:

Adding a wildcard here isn't out of the question, but it's not super performant.

Due to the number of strings in the DB, adding a suffix wildcard (e.g. Drop-in%) is about 10x slower, and while it works, it's not going to scale.

That makes total sense. I expected something like that, as it would match many more strings.

I see that it would make the tool much more useful, as it would give us context on the translation and show other possible translations for the same word, but I personally believe the results could be cached for a long period.

Do you believe that caching the results for, say, 30 days would make sense? Would that help in this situation?

In that case, it might be a good idea to add a message there like this: "Results from [DATE]"
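
To make the idea concrete, here is a rough sketch of a time-limited result cache that also records when each result set was stored, so the page could show a "Results from [DATE]" label. The class and function names are hypothetical and nothing here reflects how the Consistency tool is implemented. A cache miss still pays the full query cost, so this only helps with repeated searches.

```python
import time
from datetime import datetime, timezone

CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # the 30 days suggested above


class ResultCache:
    """In-memory cache that remembers when each result set was stored."""

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # search term -> (stored_at, results)

    def get_or_compute(self, term, compute):
        """Return (results, stored_at), running compute(term) only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(term)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1], entry[0]
        results = compute(term)
        self._entries[term] = (now, results)
        return results, now


def results_label(stored_at):
    """Format the stored timestamp as the message shown next to cached results."""
    day = datetime.fromtimestamp(stored_at, tz=timezone.utc).strftime("%Y-%m-%d")
    return "Results from " + day
```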

We'd have to limit it to logged in users only, and probably best even only to those who have some form of higher permissions than a translation contributor (which shouldn't be a problem).

I totally agree regarding logged in users only, but could we limit the number of wildcard searches that the users could have instead of limiting to PTE/GTE only?

It would be great for providing Consistency tool links when reviewing new contributors' strings, and also for letting them do that themselves.

#6 @dd32
6 weeks ago

Do you believe that caching the results for, say, 30 days would make sense? Would that help in this situation?

I don't believe adding caching here would be beneficial, nor would it resolve my performance concerns.

I totally agree regarding logged in users only, but could we limit the number of wildcard searches that the users could have

I don't really like the idea of only accepting a certain number of requests per user..

For providing links to new contributors, you'd still be able to offer them the non-wildcard search which seems like a better option to me.. Would Wildcards really be useful for that? Other than as a shortcut for not having to type the full string?

#7 @felipeloureirosantos
6 weeks ago

I don't really like the idea of only accepting a certain number of requests per user..

I see that. Would limiting the search results make sense? Microsoft seems to show only the first 1000.

Also, maybe you can get something useful from the way they handle it: https://www.microsoft.com/en-us/language/Search?&searchTerm=Test&langID=594&Source=true&productid=0

For providing links to new contributors, you'd still be able to offer them the non-wildcard search which seems like a better option to me.. Would Wildcards really be useful for that? Other than as a shortcut for not having to type the full string?

Yes, it would. Let me give you an example.

If you search for "Props" on the tool (pt_BR), you won't see any results: https://translate.wordpress.org/consistency/?search=Props&set=pt-br%2Fdefault&project=

You won't find anything because "Props" only appears as part of longer strings.

You can see a string example here: https://translate.wordpress.org/projects/wp-plugins/wordpress-seo/dev-readme/pt-br/default/?filters%5Bstatus%5D=either&filters%5Boriginal_id%5D=9836026&filters%5Btranslation_id%5D=72632568

The above example is not an exception. It would be useful in several similar situations.

#8 @dd32
6 weeks ago

If you search for "Props" on the tool (pt_BR),

In that case, we definitely can't support that kind of searching at all right now.

Wildcard searching based on a prefix (for example, Drop-in% matching Drop-ins <span>..) is possible, but wildcard searching for text anywhere within an original (for example, Props as in your example) is not going to be possible on WordPress.org currently.

A wildcard search for %Props% currently times out after 60s; there are simply too many strings in the GlotPress database for that. It's really not designed for being searched.

No amount of limiting, caching, etc will work around that.

It'd probably require leveraging something like Elasticsearch.
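
The difference between the two kinds of wildcard can be illustrated with a small sketch. It assumes the originals are kept in sorted order, which is roughly what a B-tree index provides; the ticket does not say how the column is indexed, and the sample strings are made up. A prefix search can jump to the first candidate and stop as soon as the prefix no longer matches, whereas a search for text anywhere in the string has to inspect every row.

```python
from bisect import bisect_left


def prefix_search(sorted_strings, prefix):
    """Binary-search to the first candidate, then read only the contiguous matching run."""
    i = bisect_left(sorted_strings, prefix)
    matches = []
    while i < len(sorted_strings) and sorted_strings[i].startswith(prefix):
        matches.append(sorted_strings[i])
        i += 1
    return matches


def infix_search(strings, needle):
    """Sorted order does not help here: every string has to be inspected."""
    return [s for s in strings if needle in s]


# Made-up originals for illustration only.
originals = sorted([
    "Drop-in",
    "Drop-ins",
    "Dropdown",
    "Plugins",
    "Props to the contributors",
    "Thanks and Props",
])

print(prefix_search(originals, "Drop-in"))  # like Drop-in%
print(infix_search(originals, "Props"))     # like %Props%
print("Props" in originals)                 # exact match, as the tool works today
```

In this toy data the exact search for "Props" finds nothing, the prefix search finds only strings that start with the term, and only the full scan finds "Thanks and Props".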

#9 @felipeloureirosantos
6 weeks ago

  • Resolution set to maybelater
  • Status changed from new to closed

Sure, I see your point.

Thank you for checking the possibility of that.

I hope that we can bring something like that in the future. :)
