Making WordPress.org

Opened 4 months ago

Closed 4 months ago

Last modified 4 months ago

#5770 closed enhancement (wontfix)

Add the name "fuzzybot" to a translation when it is set to "fuzzy" by the bot

Reported by: psmits1567
Owned by:
Milestone:
Priority: normal
Component: Translate Site & Plugins
Keywords:
Cc:

Description

Currently a translation is sometimes set to "fuzzy" without any obvious reason. There is also the impression that something else sets a record to "fuzzy".
To narrow down who set the translation to "fuzzy", it would be helpful if a record that is set to "fuzzy" by the bot got the name "fuzzybot" (or some other obvious name). Currently nothing records who or what set the record to "fuzzy".
By contrast, when a PTE, a GTE, or an import sets it to "fuzzy", the name of whoever modified the record is added to it.

Change History (8)

#1 @ocean90
4 months ago

  • Resolution set to wontfix
  • Status changed from new to closed

Thanks for the ticket but I think there's a misunderstanding because there is no bot.
During the import, the original strings are checked for similar existing originals. If there's one which also has translations, those translations are automatically marked as fuzzy. The translations may have been suggested by a user in the past, in which case they have a user ID assigned. If there's no user, it usually means that the translation was imported from other sources via CLI.

#2 follow-up: @psmits1567
4 months ago

I think this is not 100% correct, because if an import is performed the user that did that is added to the strings. So then I would know who imported the strings. Now you do not see what set the state to "fuzzy"
We will investigate this behaviour further, as it is annoying for us PTEs/GTEs.
Today I got one fuzzy record where no previous translation is present.
Why or how was this record set to "fuzzy"?
There is nothing wrong with the translation.

#3 @ocean90
4 months ago

Unfortunately without any examples I can't answer your questions. But an original with a fuzzy translation doesn't need to have any other previous translations if the fuzzy translation was the only one.
You can find the relevant code here.

#4 in reply to: ↑ 2 @dd32
4 months ago

Replying to psmits1567:

because if an import is performed the user that did that is added to the strings. So then I would know who imported the strings. Now you do not see what set the state to "fuzzy"

An example of this would help. I believe the only way that would happen is if one of the following is true:

  • A user uploaded a .po/.mo file which contained strings that were marked as fuzzy. The translator should be set to the uploader.
  • The string wasn't imported, but was a translation of a now-removed original string, and a newly added original is similar enough that the existing translation for the removed original was made a fuzzy translation of the new original. It should have a translator attached.

If there's no translator attached, the only case where that should be happening is if the translation in question was imported from being bundled with a plugin, back in ~2016 when all plugins were shifted over to translate.wordpress.org. Due to some cross-project translation efforts at one point, it might be that the originals were an exact match between PluginA and PluginB, and so the translation was copied to both, without a translator attached since it was unknown, as it was imported from SVN.

...I hope some of that makes sense, but tl;dr, an exact example of a string is needed to be able to provide any solid answer, but it's almost certain that if you're not seeing a translator, it's from that SVN import event when plugins were imported to translate.w.org.

#5 @psmits1567
4 months ago

@dd32 @ocean90 thank you both for researching the reason for fuzzies popping up that do not seem to be logical.
@erica has set up a spreadsheet where we will put the examples we find: https://docs.google.com/spreadsheets/d/1jiWMZSI-4ARrgL4gRjMxMSVCRahpBPErXu1jT1FIbSw/edit#gid=0

I have added a comment to that sheet with my conclusions. It seems to me that it is now the same conclusion as stated by dd32: the fuzzy might be caused by an import.

But if the import was done in the past, why are they popping up now, after such a long time? I think it is caused by recent imports. There are currently two examples. The first one has been researched by vladt, and was caused by a new release changing one letter to a capital (without adding the author). The second one might be a good example to research: there is no obvious reason for it to be "fuzzy", and it also has no author.

#6 @dd32
4 months ago

@psmits1567 I think there's definitely a misunderstanding of what a "fuzzy" string is.

Let's say that I have a heading in my plugin v1
<h1><?php _e( 'My super awesome plugin' ); ?></h1>
In the next release of the plugin v2, I change it to:
<h1><?php _e( 'Super Awesome Plugin Settings' ); ?></h1>

If you've translated the first string in v1, it no longer exists in v2, so your translation will be marked as an old translation: it no longer matches any string in the current version of the plugin.

Now let's say that in v3 of the same plugin, I change that heading to:
<h1><?php _e( 'Super Awesome plugin settings' ); ?></h1>

That is a string change, the string from v2 no longer exists, and a new v3 string exists.

As part of the "import strings from plugin into GlotPress" process, GlotPress determines fuzzies of strings; this is probably what you're thinking of as "fuzzybot". GlotPress says "Oh, these two strings look Verrryyy similar, but they're not the same", so it copies the translation from the v2 string to the v3 string and marks it as a fuzzy translation. The translation might be correct, but GlotPress doesn't know; it needs human review to ensure that it's still accurate.

If we take the example from the spreadsheet:
In 3.5.0 of the plugin, the string was _e( 'All Categories' ) and it was translated by Chantal.
In the next release, 3.5.1, the capitalisation was changed to _e( 'All categories' ). The translated string no longer exists and a new untranslated string exists. GlotPress has realised it's just a minor change to what's probably the same string, has copied the translation over, and has retained the original translator's attribution.

In this case, yes, the fuzzy translation can now be approved, the capitalisation in the fuzzy translation is correct.

In another made-up example, if a string changed from "Please click here to deactivate your account" to "Please click here to delete your account", GlotPress would probably detect that as a fuzzy (I haven't checked), since the two original strings are very similar. But GlotPress doesn't understand language; it needs a human reviewer to look at it. The reviewer should NOT approve that fuzzy.
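To illustrate why a similarity score alone cannot tell a reviewer whether a fuzzy is safe to approve, here is a minimal sketch in Python. It uses difflib's SequenceMatcher as a stand-in similarity measure; that choice is an assumption made for illustration, since GlotPress's actual measure and threshold are not stated in this thread. The strings are the ones quoted in the comments above.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score (stand-in measure, not GlotPress's own)."""
    return SequenceMatcher(None, a, b).ratio()


# Pairs of (old original, new original) quoted in this ticket.
pairs = [
    ("My super awesome plugin", "Super Awesome Plugin Settings"),
    ("Super Awesome Plugin Settings", "Super Awesome plugin settings"),
    ("All Categories", "All categories"),
    (
        "Please click here to deactivate your account",
        "Please click here to delete your account",
    ),
]

for old, new in pairs:
    print(f"{similarity(old, new):.2f}  {old!r} -> {new!r}")
```

With this stand-in measure, the harmless capitalisation change and the meaning-changing "deactivate" to "delete" edit both score high, so the score only says "these look alike"; deciding whether the old translation still fits remains the reviewer's job.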

There's an argument to be made that a simple capitalisation change should not cause a string to go fuzzy, but for example, %F to %f can be a major change and require a human review.
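As a rough sketch of that distinction (in PHP's sprintf(), %f is the locale-aware float specifier and %F the non-locale-aware one), the following Python snippet separates case-only changes from changes that touch a placeholder. The function name classify_change and the simplified placeholder pattern are hypothetical, introduced only for this illustration; nothing here is taken from GlotPress.

```python
import re

# Simplified printf-style placeholder pattern (an assumption for illustration;
# it does not cover every format that PHP's sprintf() accepts).
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[+-]?\d*(?:\.\d+)?[a-zA-Z]")


def classify_change(old: str, new: str) -> str:
    """Classify how `new` differs from `old` (hypothetical helper)."""
    if old == new:
        return "identical"
    # Compare placeholders case-sensitively: %F and %f are different specifiers.
    if PLACEHOLDER.findall(old) != PLACEHOLDER.findall(new):
        return "placeholder change"
    if old.lower() == new.lower():
        return "case-only change"
    return "text change"


print(classify_change("All Categories", "All categories"))
print(classify_change("Total: %F", "Total: %f"))
```

Checking placeholders before the case-insensitive comparison matters here: otherwise "Total: %F" and "Total: %f" would be reported as a case-only change.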

#7 @dd32
4 months ago

In other words, a fuzzy translation is a suggested translation based on a similar string being removed in the same GlotPress project update operation.

Fuzzy translations only kick in when a string is removed AND a string is added that looks similar.

A fuzzy translation will not be triggered by strings only being added. Even if a new string looks similar to an existing string, if that existing string still exists, it's obviously not the same string.
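The rule in the three statements above can be sketched as follows. This is a simplified model in Python, not GlotPress's code: the function import_originals, the 0.8 threshold, the difflib-based similarity, and the sample Dutch translations are all assumptions made for illustration, and every previous original is assumed to have a translation.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

THRESHOLD = 0.8  # hypothetical cut-off, not GlotPress's actual value


def similarity(a: str, b: str) -> float:
    """Stand-in similarity measure (an assumption, not GlotPress's own)."""
    return SequenceMatcher(None, a, b).ratio()


def import_originals(
    translations: Dict[str, str], new_originals: List[str]
) -> Dict[str, Tuple[str, str]]:
    """Return {original: (translation, status)} after one project update.

    `translations` maps the previous release's originals to their translations.
    """
    removed = set(translations) - set(new_originals)
    result = {}
    for original in new_originals:
        if original in translations:
            # Unchanged string: the translation stays current.
            result[original] = (translations[original], "current")
            continue
        # Added string: candidates come only from originals removed in this update.
        best = max(removed, key=lambda r: similarity(r, original), default=None)
        if best is not None and similarity(best, original) >= THRESHOLD:
            result[original] = (translations[best], "fuzzy")
    return result


previous = {"All Categories": "Alle categorieën", "Save": "Opslaan"}

# "All Categories" is removed and "All categories" is added: fuzzy.
print(import_originals(previous, ["All categories", "Save"]))

# "All Categories" still exists, so the added "All categories" gets no fuzzy.
print(import_originals(previous, ["All Categories", "All categories", "Save"]))
```

In the second call the added string stays untranslated, because no similar original was removed in the same update.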

#8 @psmits1567
4 months ago

Thanks for the explanation
I have the feeling I am firefighting with the fuzzy process!
We all thought it was a bot checking translations for changes.
That does not seem to be the case.
So we have narrowed it down: a record is set to fuzzy under different circumstances. If the change is obvious, then it is clear what needs to be done. But we see a lot of fuzzied records where we do not see or understand where the change is coming from. Your examples are clear, so they won't be a problem. But there are too many that are not so obvious, and we as GTEs or PTEs do not understand what caused them.
So we are looking for ways to improve it.
One of them is keeping the previous record as "old", so we can see what has changed.
Checking those fuzzied strings is a weekly, or almost daily, job that takes a lot of time, which we could better spend checking waiting strings and contacting translators to help them improve their suggestions.
