Opened 7 weeks ago

Closed 3 weeks ago

Last modified 2 weeks ago

#5154 closed defect (fixed)

Automatically fix some translation errors

Reported by: dd32
Owned by:
Milestone:
Priority: normal
Component: Translate Site & Plugins
Keywords:
Cc:

Description

Split off from #5152

Some common translation warnings can be corrected automatically, which simplifies the translation process by reducing the time translators and editors have to spend checking translations.

The two main ones I see right now that can be automatically fixed are:

  • Newlines, either prefixed or suffixed to originals/translations
  • Unicode percent signs being used rather than ASCII percent signs in placeholders

Are there any others? I'm hesitant to fix mangled HTML tags: although most are just an extra space around a < or </, they're a good sign of a machine translation, which is usually not perfect.
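To make the two fixes listed above concrete, here is a minimal sketch in Python. It is not the plugin's code: the function names, the set of look-alike percent signs, and the placeholder pattern (only `%s`, `%d`, `%f`, optionally numbered) are all assumptions chosen for illustration.

```python
import re

# Assumption: these two look-alikes (fullwidth and small percent sign) are the
# ones worth normalizing; the real plugin may handle a different set.
UNICODE_PERCENTS = "\uFF05\uFE6A"

# A look-alike percent sign directly followed by something shaped like a
# printf conversion, e.g. "％s" or "％1$d". A bare "100％" is left alone.
LOOKALIKE_PLACEHOLDER = re.compile(r"[" + UNICODE_PERCENTS + r"]((?:\d+\$)?[sdf])")


def fix_percent_signs(translation: str) -> str:
    """Replace a Unicode percent sign with ASCII '%' inside placeholders only."""
    return LOOKALIKE_PLACEHOLDER.sub(r"%\1", translation)


def fix_newlines(original: str, translation: str) -> str:
    """Make the translation's leading/trailing newlines match the original's."""
    without_lead = original.lstrip("\n")
    lead = len(original) - len(without_lead)
    trail = len(without_lead) - len(without_lead.rstrip("\n"))
    return "\n" * lead + translation.strip("\n") + "\n" * trail
```

For example, `fix_newlines("Hi\n", "Hallo")` yields `"Hallo\n"`, and `fix_percent_signs` rewrites `％1$d` to `%1$d` while leaving a literal `100％` untouched.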

Change History (11)

#1 @dd32
7 weeks ago

In 9741:

Translate: Add a plugin to automatically fix some common translation errors.

If it can't fix all of a translation's errors, it doesn't alter the submitted translation.

See #5154.
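The all-or-nothing behaviour described in this commit message could look roughly like the sketch below. It is an illustration under assumptions, not the committed code: `autofix_or_keep`, `toy_check`, and `fix_trailing_newline` are hypothetical names, and the toy checker stands in for the real warning checks.

```python
from typing import Callable, Dict, List

Fixer = Callable[[str, str], str]               # (original, translation) -> translation
Checker = Callable[[str, str], Dict[str, str]]  # (original, translation) -> warnings


def autofix_or_keep(original: str, translation: str,
                    fixers: List[Fixer], check: Checker) -> str:
    """Apply every fixer; keep the result only if no warnings remain."""
    if not check(original, translation):
        return translation  # nothing to fix
    candidate = translation
    for fix in fixers:
        candidate = fix(original, candidate)
    # If any warning survives, fall back to the untouched submission.
    return translation if check(original, candidate) else candidate


# Hypothetical stand-ins so the sketch runs on its own.
def toy_check(original: str, translation: str) -> Dict[str, str]:
    warnings = {}
    if original.endswith("\n") != translation.endswith("\n"):
        warnings["newline"] = "Trailing newline mismatch"
    if original.count("%s") != translation.count("%s"):
        warnings["placeholders"] = "Placeholder count mismatch"
    return warnings


def fix_trailing_newline(original: str, translation: str) -> str:
    core = translation.rstrip("\n")
    return core + "\n" if original.endswith("\n") else core
```

With these stand-ins, a translation that only lacks the trailing newline is corrected, while one that also lacks a placeholder is returned exactly as submitted, so the translator still sees every warning against their own text.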

#2 @tobifjellner
7 weeks ago

I agree that automatically fixing signs of machine translation should be avoided.
As a vector for the future, perhaps some automatic fixes could be offered for specific locales, but that would probably become a project on its own.

This ticket was mentioned in Slack in #polyglots by casiepa.


7 weeks ago

#4 @Mte90
6 weeks ago

Based on experience with other tools like Transifex, PoEdit, Lokalize and, I think, Crowdin, I think it is better to offer suggestions than to apply fixes automatically.
Something like an opt-in for each sentence.
Example: the sentence is missing a final period; would you like to add it? Yes/No, and the fix is then applied for you.
This can be helpful for sanitization, and it would allow creating custom warnings for specific locales, such as warnings about using different Unicode symbols and so on.

#5 @dd32
6 weeks ago

I agree that not everything should be fixed automatically. This was a case where the fixes were obvious and "always" right, and it has significantly reduced the number of warnings (IMHO).

Some warnings and fixes could be applied before the string is submitted, for example by highlighting missing placeholders or tags. Offering automatic fixes at that point would make sense for things like HTML tags, as would highlighting additions and deletions between the original and the translation.
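As a rough illustration of the pre-submission idea, the check for missing placeholders could be sketched as follows. The function name and the placeholder pattern are assumptions; the pattern is simplified (only `%s`, `%d`, `%f`, optionally numbered) and does not treat an escaped `%%` specially.

```python
import re
from collections import Counter
from typing import List

# Simplified printf placeholder pattern (an assumption for this sketch).
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[sdf]")


def missing_placeholders(original: str, translation: str) -> List[str]:
    """Return placeholders that occur in the original more often than in the translation."""
    wanted = Counter(PLACEHOLDER.findall(original))
    present = Counter(PLACEHOLDER.findall(translation))
    return sorted((wanted - present).elements())
```

A translation editor could call this as the translator types and highlight each returned placeholder before the string is submitted.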

This ticket was mentioned in Slack in #polyglots by nao.


6 weeks ago

#7 @dd32
6 weeks ago

In 9766:

Translate: The 'warnings' property isn't always set, since a translation might be being updated by the importer which does it in stages.

Amends r9741.
See #5154.

#8 @dd32
5 weeks ago

In 9801:

Translate: Sometimes the warnings key is set to a non-empty value that isn't an array.

See #5154.
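The two follow-up commits above (r9766 and this one) both guard against unexpected shapes of the warnings data. A defensive accessor in that spirit might look like the sketch below; the `warnings` attribute comes from the commit messages, but the function name and the use of a Python dict in place of a PHP array are assumptions.

```python
from types import SimpleNamespace
from typing import Any, Dict


def get_warnings(translation: Any) -> Dict[Any, Any]:
    """Return the warnings as a dict, or {} when unset or not dict-shaped."""
    warnings = getattr(translation, "warnings", None)  # may be missing mid-import
    if not isinstance(warnings, dict):                 # may be a non-empty non-array
        return {}
    return warnings


# Hypothetical entries covering the three situations described.
staged = SimpleNamespace()                       # property not set yet
odd = SimpleNamespace(warnings="1")              # set, but not array-like
normal = SimpleNamespace(warnings={0: {"placeholders": "Missing %s"}})
```

Here `get_warnings(staged)` and `get_warnings(odd)` both return an empty dict, so code that iterates over the warnings does not need to special-case either situation.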

#9 @dd32
3 weeks ago

In monitoring the #polyglots-warnings channel, the only other things I've seen that would be reasonable to autocorrect are:

  • non-ASCII $ in printf placeholders; there are a few other Unicode variants of the dollar sign
  • other non-ASCII characters used in printf placeholders, such as a Unicode variant of S

Those seem to happen very rarely, so I'm going to skip adding anything for those and close this ticket as fixed for now.
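For reference only, since nothing was added for these cases: one way such a fix could be approached is Unicode NFKC normalization restricted to placeholder-shaped spans, sketched below in Python. The character sets, regular expressions, and function name are assumptions, and a result is accepted only if it turns into a valid ASCII placeholder.

```python
import re
import unicodedata

# Assumed look-alike sets: ASCII plus fullwidth and small variants.
PERCENTS = "%\uFF05\uFE6A"
DOLLARS = "$\uFF04\uFE69"

# Placeholder-shaped span: percent-like, optional "<digits><dollar-like>", one word character.
CANDIDATE = re.compile(
    r"[" + re.escape(PERCENTS) + r"](?:\d+[" + re.escape(DOLLARS) + r"])?\w"
)
VALID = re.compile(r"%(?:\d+\$)?[sdf]")  # simplified set of conversions


def normalize_placeholders(translation: str) -> str:
    """NFKC-fold look-alike characters, but only inside placeholder-shaped spans."""
    def fold(match: "re.Match[str]") -> str:
        folded = unicodedata.normalize("NFKC", match.group(0))
        # Keep the original text unless folding yields a real placeholder.
        return folded if VALID.fullmatch(folded) else match.group(0)
    return CANDIDATE.sub(fold, translation)
```

Restricting the normalization to candidate spans keeps legitimate fullwidth text elsewhere in the translation intact, which matters for locales that use those characters in ordinary prose.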

There's another ticket to add some JS-based warnings pre-submit as well, which will hopefully remove the need for this in the first place and/or support auto-fixing some warnings.

If the warning logging that will hopefully be added as part of #5152 reveals anything major, we can re-open or create a new ticket.

#10 @dd32
3 weeks ago

  • Resolution set to fixed
  • Status changed from new to closed

#11 @ocean90
2 weeks ago

In 9889:

Translate: Add missing static keyword to avoid a deprecation notice.

See #5154.
