Making WordPress.org

Opened 3 years ago

Last modified 3 years ago

#5671 new enhancement

If an original line does not contain a dot at the end of the line but the translation does, the record will be fuzzed

Reported by: psmits1567 Owned by:
Milestone: Priority: normal
Component: Translate Site & Plugins Keywords:
Cc:

Description

I have noticed a couple of times that a translation gets marked as fuzzy if the original line does not end with a dot but the translation does.
In the EU it is standard writing rules that you end a line with a dot.
So if someone adds a dot at the end of the translation, that translation should not be marked as fuzzy.
If you do not have Glotdict installed, there is no warning that you add or leave a dot out of the translation.
Solutions could be:

  1. When a translator adds a dot while the original does not have one, remove the dot automatically. The same applies the other way around: if the original has a dot but the translator did not add one, add the dot automatically when saving the translation.
  2. The fuzzy process should not mark this line as "fuzzy"; leaving the dot in does no harm, as this is normal writing.

The line could also have been imported; in that case these corrections should be made as well.
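
As an illustration of solution 1 only, here is a minimal Python sketch of matching a single trailing dot between original and translation. The function name `sync_trailing_dot` is hypothetical and is not part of GlotPress; the sketch also assumes that only the ASCII dot matters and that strings ending in several dots should be left untouched.

```python
def sync_trailing_dot(original: str, translation: str) -> str:
    """Return the translation with its single trailing dot matched to the original.

    Hypothetical helper for illustration; not an existing GlotPress function.
    """
    orig = original.rstrip()
    trans = translation.rstrip()
    # Nothing to adjust for an empty translation.
    if not trans:
        return translation
    # Leave runs of dots ("...", "..") alone; they are not a single ending dot.
    if orig.endswith("..") or trans.endswith(".."):
        return translation
    orig_has_dot = orig.endswith(".")
    trans_has_dot = trans.endswith(".")
    if orig_has_dot and not trans_has_dot:
        return trans + "."      # original ends with a dot, translation does not
    if trans_has_dot and not orig_has_dot:
        return trans[:-1]       # translation added a dot the original lacks
    return translation          # already consistent


print(sync_trailing_dot("This is a string", "Ceci est une chaîne."))
print(sync_trailing_dot("This is a string.", "Ceci est une chaîne"))
```

Whether such a change should be applied silently or only offered to the translator is a separate question; the sketch shows only the string comparison.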

Change History (5)

This ticket was mentioned in Slack in #polyglots by psmits1567. View the logs.


3 years ago

#2 @tobifjellner
3 years ago

Any kind of automatic changes applied to fuzzy strings MUST be configurable separately for each locale.
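
A rough sketch of what "configurable separately for each locale" could look like, written in Python for illustration. The names `FuzzyRules`, `LOCALE_RULES` and `rules_for`, and the example locale settings, are all assumptions made for this sketch; they do not describe an existing GlotPress setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyRules:
    # Hypothetical per-locale switch; off by default so that nothing
    # changes for a locale that has not opted in.
    match_trailing_dot: bool = False


# Example values only; each locale team would decide its own settings.
LOCALE_RULES = {
    "nl": FuzzyRules(match_trailing_dot=True),
    "ja": FuzzyRules(match_trailing_dot=False),
}


def rules_for(locale: str) -> FuzzyRules:
    """Return the rules for a locale, falling back to the all-off default."""
    return LOCALE_RULES.get(locale, FuzzyRules())


print(rules_for("nl").match_trailing_dot)
print(rules_for("sv").match_trailing_dot)
```

Defaulting to "off" means a locale without an entry is never changed automatically.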

#3 @psmits1567
3 years ago

It would be a great enhancement if we could have a separate list for auto-replacement. Not the glossary, because that contains extra information.

#4 @dd32
3 years ago

> If you do not have Glotdict installed, there is no warning that you add or leave a dot out of the translation.

I had thought there was a warning about changing trailing punctuation, but I can't find one in GlotPress or on WordPress.org, so I must have been thinking of the GlotDict warnings myself.

> In the EU it is standard writing rules that you end a line with a dot.

Yes, that's a standard writing rule in most "western" languages that I'm aware of, but I'm not sure we can automatically match trailing punctuation between original and translation for submitted translations. Some locales do strip punctuation (eastern languages, IIRC), so it's not a blanket rule.
I would probably consider it a translation error if a submitted translation adds punctuation unexpectedly.

For updated originals that result in a translation going fuzzy, where the only difference between the originals is a trailing punctuation change, maybe we can automatically suggest a new pending string with the matching punctuation.

For example, to make it clear to others (via Google Translate; I apologise to French speakers):

Original: This is a string
Translation: Ceci est une chaîne

New Original: This is a string.
Now Fuzzy Translation: Ceci est une chaîne
Automatically added Pending Translation: Ceci est une chaîne.

I'm not sure I'd want it to say "It's only varying by trailing punctuation, don't bother fuzzying it" or "It's only a difference of trailing punctuation, adjust and approve", but the above seems like something more acceptable.
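
The example above could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not GlotPress code: `suggest_pending` and `split_trailing` are hypothetical names, the `TRAILING` set of ASCII punctuation is an assumption, and the sketch deliberately returns no suggestion whenever the translation does not end in a plain letter or digit, so that locale-specific punctuation is left to a human.

```python
from typing import Optional, Tuple

# Assumption: only these ASCII marks count as trailing punctuation here.
TRAILING = ".!?:;"


def split_trailing(text: str) -> Tuple[str, str]:
    """Split text into (core, trailing punctuation), ignoring trailing whitespace."""
    stripped = text.rstrip()
    core = stripped.rstrip(TRAILING)
    return core, stripped[len(core):]


def suggest_pending(old_original: str, new_original: str,
                    translation: str) -> Optional[str]:
    """Return a suggested pending translation, or None if no safe suggestion exists."""
    old_core, old_tail = split_trailing(old_original)
    new_core, new_tail = split_trailing(new_original)
    # Only act when the originals differ in trailing punctuation and nothing else.
    if old_core != new_core or old_tail == new_tail:
        return None
    trans_core, trans_tail = split_trailing(translation)
    # Only act when the translation mirrored the old original's punctuation.
    if trans_tail != old_tail:
        return None
    # Be conservative: skip translations that end in anything other than a
    # letter or digit (for example a locale-specific full stop or a space).
    if not trans_core or not trans_core[-1].isalnum():
        return None
    return trans_core + new_tail


print(suggest_pending("This is a string", "This is a string.", "Ceci est une chaîne"))
```

The result is meant to be stored as a new pending suggestion next to the fuzzy translation, not saved over it.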

Reading between the lines (so to speak), I think what is being said here is that there's a tooling issue around fuzzies: it's hard to quickly go through and approve fuzzies as "near enough", or to easily approve pending replacements for them (be they human-submitted or automatically generated).

#5 @Nao
3 years ago

For now, I prefer the auto-suggesting (not auto-saving) replacement route – as long as they are suitable for all locales.

These are different variations of the period symbol that I know of.

But in Japanese, we use the English period when the string is not translated (e.g. "All Rights Reserved."). I can't confirm but I think this is the same for others. Happy to do more research once we decide on the direction!

Also, 3-dots (...) at the end of a sentence tend to stay as-is for many locales (Japanese uses a native character for it, which is a single letter: ). There may be other exceptions like this.
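
To illustrate why three dots need to be told apart from a single period before any suggestion is made, here is a small Python sketch. The function name `trailing_kind` is hypothetical, and the specific code points it checks (U+2026 horizontal ellipsis, U+3002 ideographic full stop, U+FF0E fullwidth full stop) are assumptions of this sketch and are not taken from the comment above.

```python
# Assumed sets of marks, for illustration only; a real list would come
# from each locale team.
ELLIPSES = ("...", "\u2026")             # three ASCII dots, or the single character U+2026
FULL_STOPS = (".", "\u3002", "\uff0e")   # ASCII, ideographic and fullwidth full stops


def trailing_kind(text: str) -> str:
    """Classify how a string ends: 'ellipsis', 'full_stop' or 'none'."""
    stripped = text.rstrip()
    # Check the ellipsis first, because "..." also ends with ".".
    if stripped.endswith(ELLIPSES):
        return "ellipsis"
    if stripped.endswith(FULL_STOPS):
        return "full_stop"
    return "none"


print(trailing_kind("Loading..."))
print(trailing_kind("All Rights Reserved."))
```

A suggestion step could then skip any string whose original or translation is classified as `ellipsis`.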

Version 0, edited 3 years ago by Nao